ONIP-4: Distribute langdetect model as Maven dependency

Introduction and Goal

Now that OpenNLP has a langdetect model available for download, it would be useful to distribute this model as a Maven dependency. Doing so would make the model easier to acquire and use, and would help promote OpenNLP. Work on this proposal is tracked in OPENNLP-1164.

The langdetect model is built from the OpenNLP data repository in SVN at https://svn.apache.org/repos/bigdata/opennlp/trunk. Ideally, whatever process is chosen should automate, as far as possible, taking the models built from that corpus and releasing them as Maven artifacts. At the time of writing, the langdetect model is the only model available for download, but the process should also support other model types (sentence, token, namefinder, etc.) and multiple languages for those models.
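To make the requirement about multiple model types and languages concrete, here is a minimal sketch of how Maven coordinates could be derived from a model's type and language. Everything in it is an assumption for illustration: the class name ModelCoordinates, the groupId org.example.opennlp.models, and the opennlp-model-<type>[-<language>] artifactId pattern are hypothetical and are not defined by OpenNLP or by this proposal.

```java
import java.util.Locale;
import java.util.Objects;

/**
 * Hypothetical naming scheme for model artifacts.
 * None of these coordinates are defined by OpenNLP; they only illustrate
 * one way a single scheme could cover several model types and languages.
 */
public final class ModelCoordinates {

    // Assumed groupId, for illustration only.
    private static final String GROUP_ID = "org.example.opennlp.models";

    private final String type;      // e.g. "langdetect", "sentence", "token"
    private final String language;  // language code, or null if the model is not language-specific
    private final String version;

    public ModelCoordinates(String type, String language, String version) {
        this.type = Objects.requireNonNull(type, "type").toLowerCase(Locale.ROOT);
        this.language = language == null ? null : language.toLowerCase(Locale.ROOT);
        this.version = Objects.requireNonNull(version, "version");
    }

    /** Builds the artifactId; the language suffix is omitted when no language is given. */
    public String artifactId() {
        String base = "opennlp-model-" + type;
        return language == null ? base : base + "-" + language;
    }

    /** Returns the coordinates in groupId:artifactId:version form. */
    public String gav() {
        return GROUP_ID + ":" + artifactId() + ":" + version;
    }

    public static void main(String[] args) {
        // A langdetect model covers many languages, so no language suffix is used here.
        System.out.println(new ModelCoordinates("langdetect", null, "1.0").gav());
        // A sentence model is language-specific.
        System.out.println(new ModelCoordinates("sentence", "en", "1.0").gav());
    }
}
```

Treating the language as optional lets the same scheme cover the langdetect model and language-specific models such as sentence or token models. Whether one artifact per model or a bundled artifact is preferable is left open by this sketch.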

Proposed Process

(TODO: Look at how models are built from the corpus repo and see how the built models can be included in artifacts to release.)
