Monday, March 12, 2012

Mizozo's Updated Algorithm Code

With development slowly winding down, we spent much of last week working out a new way of ordering articles on Mizozo. This may seem like a trivial matter, but we consider it one of Mizozo's greatest features. The purpose of the algorithm is to identify what is popular, so that users see what is likely the hottest topic of the moment.

We started by identifying the items that make for a good story. These include:

  • Page Views
  • Number of Comments
  • Number of Responses
  • Article's Age
  • Publisher's Status

We then identified several minor things to take into account, such as whether a story is trending and whether it has made it to the featured list. As stories age, they must settle into some logical progression, most likely an ordering by timestamp. Since a timestamp is a very large number, we decided to go with the article's ID instead, a simple auto_increment field which also gives us a good sequence.
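To picture how that settling might work, here is a minimal sketch. It assumes the score is averaged with the article's ID on every run, that the article gets no new views or comments, and that every multiplier is 1. The `settle` function and the sample numbers are made up for illustration.

```php
<?php
// Hypothetical illustration: repeatedly average a score with the article ID.
// Assumes no new activity and every coefficient equal to 1.
function settle(float $score, int $id, int $runs): float
{
    for ($i = 0; $i < $runs; $i++) {
        // Each run halves the distance between the score and the ID.
        $score = ($score + $id) / 2;
    }
    return $score;
}

// A once-popular article with ID 1000, and an older one with ID 900
// that was even more popular in its day.
$newer = settle(5000.0, 1000, 20);
$older = settle(9000.0, 900, 20);

// After enough quiet runs both scores sit just above their IDs,
// so the newer article ends up ranked ahead of the older one.
var_dump($newer > $older); // bool(true)
```

Under these assumptions, past popularity fades and quiet articles fall back into ID order, which is also the order in which they were posted.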

After much deliberation, we decided to leave the Publisher's Status out of the current version, allowing for a future enhancement. While we haven't settled on the final sorting algorithm yet, the current version looks like this (please ignore the PHP code):


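    // Date Coefficient: boosts articles that are less than five days old.
    // $days and $hours hold the article's age (assumed to be set earlier).
    $dCof = 1 + ((5 - $days) * .1);
    if ($days < 1) {
      // Extra boost during the first day, shrinking as the hours pass.
      $dCof = $dCof + ((12 - ($hours / 2)) * .05);
    }
    if ($dCof < 1) {
      // Old articles are never penalized below a neutral multiplier.
      $dCof = 1;
    }

    // Views Coefficient: each new view is worth 2 points.
    $vCof = $article['views_new'] * 2;

    // Comments Coefficient: each comment since the last run is worth 50 points.
    $cCof = ($article['comments_count'] - $article['prev_comments_count']) * 50;

    // Response Coefficient: each response adds 10% to the score.
    $rCof = 1 + ($article['responses'] * .1);

    // Increasing Trend Coefficient: 30% boost when views are accelerating.
    $tCof = 1;
    if (($article['views_count'] - $article['prev_views_count']) < $article['views_new']) {
      $tCof = 1.3;
    }

    // Featured Coefficient: 20% boost for featured articles.
    $fCof = 1 + ($article['is_featured'] * .2);

    // Average the fresh score with the previous one to smooth out swings.
    $article['score'] = ($article['score'] + ($article['id'] + $vCof + $cCof) * $dCof * $rCof * $tCof * $fCof) / 2;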
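
For anyone who wants to experiment with the numbers, here is a minimal sketch that wraps the snippet above in a function. The function name `scoreArticle` and the `$days`/`$hours` parameters are assumptions for illustration, not Mizozo's actual code, and the two sample articles are made up.

```php
<?php
// Hypothetical wrapper around the scoring snippet; not Mizozo's actual code.
// $days and $hours are assumed to be the article's age.
function scoreArticle(array $article, float $days, float $hours): float
{
    // Date coefficient, with the first-day boost and a floor of 1.
    $dCof = 1 + ((5 - $days) * 0.1);
    if ($days < 1) {
        $dCof += (12 - ($hours / 2)) * 0.05;
    }
    $dCof = max(1, $dCof);

    $vCof = $article['views_new'] * 2;
    $cCof = ($article['comments_count'] - $article['prev_comments_count']) * 50;
    $rCof = 1 + ($article['responses'] * 0.1);
    $viewsLastRun = $article['views_count'] - $article['prev_views_count'];
    $tCof = ($viewsLastRun < $article['views_new']) ? 1.3 : 1;
    $fCof = 1 + ($article['is_featured'] * 0.2);

    // Fresh score for this run, then averaged with the stored score.
    $fresh = ($article['id'] + $vCof + $cCof) * $dCof * $rCof * $tCof * $fCof;
    return ($article['score'] + $fresh) / 2;
}

// A ten-day-old article with no new activity (made-up sample data).
$stale = [
    'id' => 1000, 'score' => 0, 'views_new' => 0,
    'views_count' => 100, 'prev_views_count' => 50,
    'comments_count' => 3, 'prev_comments_count' => 3,
    'responses' => 0, 'is_featured' => 0,
];

// A two-hour-old featured article that is picking up views and comments.
$hot = [
    'id' => 1001, 'score' => 0, 'views_new' => 200,
    'views_count' => 300, 'prev_views_count' => 250,
    'comments_count' => 5, 'prev_comments_count' => 2,
    'responses' => 1, 'is_featured' => 1,
];

echo scoreArticle($stale, 10, 0), "\n"; // 500
var_dump(scoreArticle($hot, 0, 2) > scoreArticle($stale, 10, 0)); // bool(true)
```

The stale article scores 500 because every multiplier is 1 and only its ID contributes, averaged with a previous score of 0. The fresh article scores well above that, since all of its coefficients stack on top of each other.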


What do you guys think? Does this algorithm do a decent job of ordering articles as they appear on Mizozo? This algorithm will keep changing over time, as it has in the past. We also plan on implementing a scoring system for our publishers, which will influence the overall score of each posted article.
