
The SpamAssassin Challenge

(THIS IS A DRAFT; see bug 5376 for discussion)

The Netflix Prize is a machine-learning challenge from Netflix which 'seeks to substantially improve the accuracy of predictions about how much someone is going to love a movie based on their movie preferences.'

We in SpamAssassin have similar problems; maybe we can solve them in a similar way. We have:


Input: the test data: mass-check logs


We will take the SpamAssassin 3.2.0 mass-check logs and split them into training and test sets; a split of 90% for training and 10% for testing is traditional. Any cleanups that we had to do during bug 5270 are re-applied.

The test set is saved, and not published.
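
The 90/10 split described above could be sketched roughly as follows. This is only an illustration, not the project's actual tooling: the function name `split_mass_check_lines`, the fixed seed, and the assumption that lines starting with `#` are comments rather than per-message results are all hypothetical choices made for the example.

```python
import random


def split_mass_check_lines(lines, test_fraction=0.10, seed=5376):
    """Split mass-check log lines into (train, test) lists.

    Hypothetical helper for illustration only.
    """
    # Assumption: blank lines and lines starting with '#' are
    # comments/metadata, not per-message results, so they are skipped.
    results = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    # Shuffle with a fixed seed so the same split can be reproduced later.
    rng = random.Random(seed)
    rng.shuffle(results)

    # The first n_test shuffled lines form the held-back test set;
    # everything else is the training set.
    n_test = round(len(results) * test_fraction)
    return results[n_test:], results[:n_test]
```

Shuffling before cutting avoids a test set made up only of whatever messages happen to sit at the end of the log, and the fixed seed means the unpublished test set can be regenerated from the same input if needed.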