ClassifyingYourData
Added by Isabel Drost, last edited by Isabel Drost on Oct 02, 2009

Mahout_0.2

After you've done the QuickStart and are familiar with the basics of Mahout, it is time to build a classifier from your own data.

The following pieces may be useful in getting started:

Input

To start, you will need your data in an appropriate Vector format (the format has changed since Mahout 0.1).
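As a rough illustration of what "vector format" means for text, the sketch below turns a document into a sparse term-frequency vector: each distinct token gets an integer index from a shared dictionary, and the vector stores a count per index. This is only a sketch under stated assumptions. It uses plain Java collections rather than Mahout's own Vector classes, and the class name `TermFrequencySketch`, the method `toSparseVector`, and the tokenization rule are hypothetical choices made for illustration, not part of Mahout.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain Java collections, NOT Mahout's Vector classes.
final class TermFrequencySketch {

    // Assigns each distinct token an integer index (growing the shared
    // dictionary as new tokens appear) and counts occurrences per index.
    static Map<Integer, Double> toSparseVector(String text, Map<String, Integer> dictionary) {
        Map<Integer, Double> vector = new TreeMap<>();
        // Assumed tokenization: lowercase, split on anything that is not a-z or 0-9.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            Integer index = dictionary.get(token);
            if (index == null) {
                index = dictionary.size();
                dictionary.put(token, index);
            }
            vector.merge(index, 1.0, Double::sum);
        }
        return vector;
    }

    public static void main(String[] args) {
        // The same dictionary must be reused for every document so that
        // indices mean the same thing across all vectors.
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        System.out.println(toSparseVector("The cat chased the other cat", dictionary));
        System.out.println(dictionary);
    }
}
```

Sharing one dictionary across all documents is the important design point: training and classification only make sense if index 0 refers to the same term in every vector.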

Text Preparation

Running the Process

Naive Bayes

Background: Naive Bayes Classification

Documentation on running Naive Bayes from the command line: bayesian-commandline

C-Bayes

Background: C-Bayes Classification

Documentation on running C-Bayes from the command line: c-bayes-commandline

Random Forests

Background: Random Forests Classification

Documentation on running Random Forests from the command line: random-forests-commandline