Test Plan for Apache OpenNLP 1.6.0
This page contains the test plan for the 1.6.0 release.
The 1.6.0 release introduces API changes.
Apache OpenNLP 1.6.0 requires Java 1.7.
Compatibility Test with OpenNLP 1.5.0 SourceForge Models
The 1.5.0 SourceForge models must be fully compatible with the 1.6.0
release. In this test, all the English models are tested for compatibility
on the English 300k-sentence Leipzig Corpus. The test passes when
the output produced with the same model by both versions has the same MD5 hash.
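The hash comparison can be sketched as follows. This is a minimal sketch, not the script used by the testers: the helper names `md5_of` and `same_output` are hypothetical, and the 1.6.0 output file name in the usage comment is an assumption.

```shell
# Print the MD5 hash of a file.
# md5sum is the GNU coreutils tool; OS X ships `md5` instead.
md5_of() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum < "$1" | cut -d' ' -f1
  else
    md5 -q "$1"
  fi
}

# Succeed only when both files have the same MD5 hash.
same_output() {
  [ "$(md5_of "$1")" = "$(md5_of "$2")" ]
}

# Usage (file names are placeholders):
#   same_output ../out-sentences_1.5.2.test ../out-sentences_1.6.0.test \
#     && echo "match" || echo "Output does not match."
```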
Load: the time in seconds that Apache OpenNLP took to load the model.
Avg: the average throughput of the module, in sentences per second.
Component | Model | Perf 1.5.2 | Perf 1.5.3 | Perf 1.6.0 | Tester | Passed | Comment |
---|---|---|---|---|---|---|---|
Sentence Detector | en-sent.bin | load: 0,093s | load: 0,099s | load: 0,106s | William | False | Output does not match. |
Tokenizer | en-token.bin | load: 0,234s | load: 0,201s | load: 0,198s | William | Yes | RC2 |
Name Finder | en-ner-person.bin | load: 1,133s | load: 1,152s | load: 1,129s | William | Yes | RC2 |
POS Tagger | en-pos-maxent.bin | load: 1,764s | load: 1,703s | load: 1,435s | William | Yes | RC2 |
POS Tagger | en-pos-perceptron.bin | load: 1,309s | load: 1,124s | load: 1,170s | William | Yes | RC2 |
Chunker | en-chunker.bin | load: 0,667s | load: 0,510s | load: 0,562s | William | False | Output does not match. |
Parser | en-parser-chunking.bin | load: 8,353s | load: 7,335s | load: 7,200s | William | False | Output does not match. |
Note: The test was done on a MacBook Pro 15", 2 GHz Core i7, 16 GB RAM, 500 GB HD running OS X 10.10.1
and Java 1.7.0_71-b14. The performance varies because lightweight tasks were running in the background while testing.
Note: "Concurrent" in the comment means that both tests were started at the same time.
Regression Test Training with (private) English data
The training of both versions with the same data must produce
a model with identical output. The model output is tested with
the procedure from the previous test.
To pass the test the event hash and the model output must be identical.
Component | Model | Tester | Passed | Comment |
---|---|---|---|---|
Sentence Detector | en-sent.bin | Jörn | Yes | RC2 |
Tokenizer | en-token.bin | Jörn | Yes | RC2 |
POS Tagger | en-pos-maxent.bin | Jörn | Yes | RC2 |
POS Tagger | en-pos-perceptron.bin | Jörn | Yes | RC2 |
Parser | en-parser-chunking.bin | Jörn | No | RC2 |
Note: Time was measured with the time command; the value reported is the "real" time.
Performance test with public data
Test the tagging performance with all the publicly available training
and test data for various languages.
It is assumed that the training is done with a cutoff of 5 and 100 iterations.
If different values are used, please note them in the comment.
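One way to pin these values down is a training parameters file. The sketch below assumes the `Algorithm`, `Iterations` and `Cutoff` keys of OpenNLP's training parameters format; the helper name `write_params`, the file name and the `MAXENT` value are illustrative assumptions, and the way the file is passed to a trainer (for example a `-params` option) should be checked against the CLI help of the version under test.

```shell
# Write a training parameters file with the values assumed by this test plan.
# The MAXENT algorithm value is an assumption; the plan only fixes
# the cutoff (5) and the number of iterations (100).
write_params() {
  cat > "$1" <<'EOF'
Algorithm=MAXENT
Iterations=100
Cutoff=5
EOF
}

# Usage (file name is a placeholder):
#   write_params params.txt
```

Keeping the values in a file makes it easy to record deviations, such as the 1000 iterations or the PERCEPTRON cutoff of 0 noted in some comments.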
Component | Data | Tester | Tagging Perf 1.5.2 | Tagging Perf 1.5.3 | Tagging Perf 1.6.0 | Comment |
---|---|---|---|---|---|---|
Name Finder | CONLL 2002 Dutch Person ned.testa | joern | Precision: 0.7552941176470588 | Precision: 0.7552941176470588 | Precision: 0.7552941176470588 Recall: 0.4566145092460882 F-Measure: 0.5691489361702128 | |
Name Finder | CONLL 2002 Dutch Person ned.testb | joern | Precision: 0.8505025125628141 | Precision: 0.8505025125628141 | Precision: 0.8505025125628141 Recall: 0.6165755919854281 F-Measure: 0.7148891235480465 | |
Name Finder | CONLL 2002 Dutch Organization ned.testa | joern | Precision: 0.8561872909698997 | Precision: 0.8561872909698997 | Precision: 0.8561872909698997 Recall: 0.37317784256559766 F-Measure: 0.5197969543147207 | |
Name Finder | CONLL 2002 Dutch Organization ned.testb | joern | Precision: 0.7830374753451677 | Precision: 0.7830374753451677 | Precision: 0.7830374753451677 Recall: 0.4501133786848073 F-Measure: 0.5716342692584593 | |
Name Finder | CONLL 2002 Dutch Location ned.testa | joern | Precision: 0.8458333333333333 | Precision: 0.8458333333333333 | Precision: 0.8458333333333333 Recall: 0.42379958246346555 F-Measure: 0.564673157162726 | |
Name Finder | CONLL 2002 Dutch Location ned.testb | joern | Precision: 0.8816326530612245 | Precision: 0.8816326530612245 | Precision: 0.8816326530612245 Recall: 0.5581395348837209 F-Measure: 0.6835443037974683 | |
Name Finder | CONLL 2002 Dutch Misc ned.testa | joern | Precision: 0.8354114713216958 | Precision: 0.8354114713216958 | Precision: 0.8354114713216958 Recall: 0.44786096256684493 F-Measure: 0.5831157528285466 | |
Name Finder | CONLL 2002 Dutch Misc ned.testb | joern | Precision: 0.8264984227129337 | Precision: 0.8264984227129337 | Precision: 0.8264984227129337 Recall: 0.44144903117101936 F-Measure: 0.5755079626578803 | |
Name Finder | CONLL 2002 Dutch Combined ned.testa | jkosin | | Precision: 0.7570754716981132 Recall: 0.4566145092460882 F-Measure: 0.5696539485359361 | Precision: 0.7570754716981132 Recall: 0.4566145092460882 F-Measure: 0.5696539485359361 | Old results couldn't be reproduced. Training 1.5.3 and 1.6.0-rc2 with the same data file. |
Name Finder | CONLL 2002 Dutch Combined ned.testb | jkosin | | Precision: 0.8479899497487438 Recall: 0.6147540983606558 F-Measure: 0.7127771911298839 | Precision: 0.8479899497487438 Recall: 0.6147540983606558 F-Measure: 0.7127771911298839 | Old results couldn't be reproduced. Training 1.5.3 and 1.6.0-rc2 with the same data file. |
Name Finder | CONLL 2002 Spanish Person esp.testa | jkosin | Precision: 0.9010695187165776 | Precision: 0.9010695187165776 | | |
Name Finder | CONLL 2002 Spanish Person esp.testb | jkosin | Precision: 0.9195205479452054 | Precision: 0.9195205479452054 | | |
Name Finder | CONLL 2002 Spanish Organization esp.testa | jkosin | Precision: 0.8288942695722357 | Precision: 0.8288942695722357 | | |
Name Finder | CONLL 2002 Spanish Organization esp.testb | jkosin | Precision: 0.8036277602523659 | Precision: 0.8036277602523659 | | |
Name Finder | CONLL 2002 Spanish Location esp.testa | jkosin | Precision: 0.7743016759776536 | Precision: 0.7743016759776536 | | |
Name Finder | CONLL 2002 Spanish Location esp.testb | jkosin | Precision: 0.8301886792452831 | Precision: 0.8301886792452831 | | |
Name Finder | CONLL 2002 Spanish Misc esp.testa | jkosin | Precision: 0.6492890995260664 | Precision: 0.6492890995260664 | | |
Name Finder | CONLL 2002 Spanish Misc esp.testb | jkosin | Precision: 0.686046511627907 | Precision: 0.686046511627907 | | |
Name Finder | CONLL 2002 Spanish Combined esp.testa | jkosin | Precision: 0.7005423249233671 | Precision: 0.7047866069323273 | 1000 iterations | |
Name Finder | CONLL 2002 Spanish Combined esp.testb | jkosin | Precision: 0.756635931824532 | Precision: 0.7588711930706902 | 1000 iterations | |
Name Finder | CONLL 2003 English Person eng.testa | jkosin | Precision: 0.9523195876288659 | Precision: 0.9523195876288659 | | |
Name Finder | CONLL 2003 English Person eng.testb | jkosin | Precision: 0.9391727493917275 | Precision: 0.9391727493917275 | | |
Name Finder | CONLL 2003 English Organization eng.testa | jkosin | Precision: 0.8768046198267565 | Precision: 0.8768046198267565 | | |
Name Finder | CONLL 2003 English Organization eng.testb | jkosin | Precision: 0.8435980551053485 | Precision: 0.8435980551053485 | | |
Name Finder | CONLL 2003 English Location eng.testa | jkosin | Precision: 0.9361421988150099 | Precision: 0.9361421988150099 | | |
Name Finder | CONLL 2003 English Location eng.testb | jkosin | Precision: 0.9206349206349206 | Precision: 0.9206349206349206 | | |
Name Finder | CONLL 2003 English Misc eng.testa | jkosin | Precision: 0.9027982326951399 | Precision: 0.9027982326951399 | | |
Name Finder | CONLL 2003 English Misc eng.testb | jkosin | Precision: 0.8592436974789915 | Precision: 0.8592436974789915 | | |
Name Finder | CONLL 2003 English Combined eng.testa | jkosin | Precision: 0.861812521618817 | Precision: 0.8640608785887236 | 1000 iterations | |
Name Finder | CONLL 2003 English Combined eng.testb | jkosin | Precision: 0.8041311831853597 | Precision: 0.8064866823699945 | 1000 iterations | |
Name Finder | CONLL 2003 German Person deu.testa | jkosin | Precision: 0.9132653061224489 | Precision: 0.9132653061224489 | | |
Name Finder | CONLL 2003 German Person deu.testb | jkosin | Precision: 0.8732106339468303 | Precision: 0.8732106339468303 | | |
Name Finder | CONLL 2003 German Organization deu.testa | jkosin | Precision: 0.8407224958949097 | Precision: 0.8407224958949097 | | |
Name Finder | CONLL 2003 German Organization deu.testb | jkosin | Precision: 0.8014705882352942 | Precision: 0.8014705882352942 | | |
Name Finder | CONLL 2003 German Location deu.testa | jkosin | Precision: 0.7816326530612245 | Precision: 0.7816326530612245 | | |
Name Finder | CONLL 2003 German Location deu.testb | jkosin | Precision: 0.8033826638477801 | Precision: 0.8033826638477801 | | |
Name Finder | CONLL 2003 German Misc deu.testa | jkosin | Precision: 0.7055555555555556 | Precision: 0.7055555555555556 | | |
Name Finder | CONLL 2003 German Misc deu.testb | jkosin | Precision: 0.6601307189542484 | Precision: 0.6601307189542484 | | |
Name Finder | CONLL 2003 German Combined deu.testa | jkosin | Precision: 0.7718859429714857 | Precision: 0.7783891945972986 | OPENNLP-417 | |
Name Finder | CONLL 2003 German Combined deu.testb | jkosin | Precision: 0.7467566165023353 | Precision: 0.749351323300467 | OPENNLP-417 | |
POS Tagger | CONLL 2006 Danish | Jörn / ? | Accuracy: 0.9511278195488722 | Accuracy: 0.9512987012987013 | Jörn: Same result as other tester | |
POS Tagger | CONLL 2006 Dutch | Jörn | Accuracy: 0.9324977618621307 | Accuracy: 0.9324977618621307 | | |
POS Tagger | CONLL 2006 Portuguese | Jörn / ? | Accuracy: 0.9659110277825124 | Accuracy: 0.9659110277825124 | Jörn: Same result as other tester | |
POS Tagger | CONLL 2006 Swedish | Jörn | Accuracy: 0.9275106082036775 | Accuracy: 0.9275106082036775 | | |
Chunker | CONLL 2000 | William | Precision: 0.9257575757575758 | Precision: 0.9257575757575758 | | |
Sentence Detector | Arvores Deitadas | William | | Precision: 0.9891491491491492 | PERCEPTRON Cutoff 0 | |
Tokenizer | Arvores Deitadas | William | | Precision: 0.9995231988260895 | PERCEPTRON Cutoff 0 | |
Chunker | Arvores Deitadas | William | Precision: 0.9404684925220583 | Precision: 0.9562405864042575 | OPENNLP-541, OPENNLP-423 |
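The F-Measure values reported in the table above are consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). The sketch below, with a hypothetical helper name `f_measure`, can be used to sanity-check a row; it assumes that standard definition and is not taken from the OpenNLP evaluator itself.

```shell
# Compute the F-Measure as the harmonic mean of precision and recall,
# rounded to four decimals. LC_ALL=C forces a dot as decimal separator.
# Precision and recall must not both be zero.
f_measure() {
  LC_ALL=C awk -v p="$1" -v r="$2" 'BEGIN { printf "%.4f\n", 2 * p * r / (p + r) }'
}

# Precision and recall of the "CONLL 2002 Dutch Person ned.testa" row.
f_measure 0.7552941176470588 0.4566145092460882
```

For that row the result agrees, to four decimals, with the reported F-Measure of 0.5691489361702128.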
Test UIMA Integration
The test ensures that the Analysis Engine can run and does not
crash through simple runtime code errors. We need to add
more sophisticated testing in the next releases.
Analysis Engine | Tester | Passed | Comment |
---|---|---|---|
Sentence Detector | | | |
Sentence Detector Trainer | | | |
Tokenizer ME | | | |
Tokenizer Trainer | | | |
Name Finder | | | |
Name Finder Trainer | | | |
Chunker | | | |
Chunker Trainer | | | |
POS Tagger | | | |
POS Tagger Trainer | | | |
Parser | | | |
createPear.sh | Jörn | | |
Sample PEAR | Jörn | | |
Distribution Review
Please ensure that the files listed below are included in the distributions
and are in a good state.
Package | File or Test | Tester | Passed | Comment |
---|---|---|---|---|
Binary | LICENSE | William | No | Needs review. We don't distribute JWNL. |
Binary | NOTICE | William | No | Copyright 2010, 2013. We don't distribute JWNL. |
Binary | README | William | No | Needs review |
Binary | RELEASE_NOTES.html | William | Yes | |
Binary | Test signatures: .md5, .sha1, .asc | Jörn | | |
Binary | JIRA issue list created | William | Yes | |
Binary | Contains maxent, tools, uima and jwnl jars | William | No | It contains only tools and uima, is it correct? |
Source | LICENSE | Jörn | | |
Source | NOTICE | Jörn | | |
Source | Test signatures: .md5, .sha1, .asc | Jörn | | |
Source | Can build from source? | Jörn | | |
Notes about testing
Compatibility tests
The following commands can be used to reproduce the compatibility tests with the Leipzig corpus.
    # Corpus preparation: the following command creates documents from the corpus. sed is used to remove the language prefix.
    sh bin/opennlp DoccatConverter leipzig -data ../eng_news_2010_300K-text/eng_news_2010_300K-sentences.txt -encoding UTF-8 -lang en | sed -E 's/^en[[:space:]]//g' > ../out-tokenized-documents.test

    # Corpus preparation: this forces the detokenization of the documents.
    sh bin/opennlp SentenceDetectorConverter namefinder -data ../out-tokenized-documents.test -encoding UTF-8 -detokenizer trunk/opennlp-tools/lang/en/tokenizer/en-detokenizer.xml > ../out-documents.test

    # Now the actual tests. Execute them for the previous release and for the current RC, then compare the output using diff:
    time sh bin/opennlp SentenceDetector ../models/en-sent.bin < ../out-documents.test > ../out-sentences_1.5.2.test
    time sh bin/opennlp TokenizerME ../models/en-token.bin < ../out-sentences_1.5.2.test > ../out-toks_1.5.2.test
    time sh bin/opennlp TokenNameFinder ../models/en-ner-person.bin < ../out-toks_1.5.2.test > ../out-ner_1.5.2.test
    time sh bin/opennlp POSTagger ../models/en-pos-maxent.bin < ../out-toks_1.5.2.test > ../out-pos_maxent_1.5.2.test
    time sh bin/opennlp POSTagger ../models/en-pos-perceptron.bin < ../out-toks_1.5.2.test > ../out-pos_pers_1.5.2.test
    time sh bin/opennlp ChunkerME ../models/en-chunker.bin < ../out-pos_pers_1.5.2.test > ../out-chk_1.5.2.test
    time sh bin/opennlp Parser ../models/en-parser-chunking.bin < ../out-toks_1.5.2.test > ../out-parse_1.5.2.test
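Once the commands have been run for both versions, the comparison step can be looped over all components. The following is a minimal sketch: the function name `compare_outputs` is hypothetical, and it assumes the outputs of the newer version were written with the same file naming scheme, with only the version suffix changed (for example `out-toks_1.6.0.test`).

```shell
# Compare the per-component output files of two versions.
# $1: directory with the output files, $2: old version, $3: new version.
# A missing file is reported as a mismatch. Returns non-zero if any
# component differs.
compare_outputs() {
  dir=$1; old=$2; new=$3
  status=0
  for stem in sentences toks ner pos_maxent pos_pers chk parse; do
    if cmp -s "$dir/out-${stem}_${old}.test" "$dir/out-${stem}_${new}.test"; then
      echo "$stem: identical"
    else
      echo "$stem: Output does not match"
      status=1
    fi
  done
  return $status
}

# Usage (directory and version suffixes are placeholders):
#   compare_outputs .. 1.5.2 1.6.0
```

`cmp -s` is used here instead of `diff` because only a yes/no answer is needed; `diff` on the two files shows where a mismatching pair actually differs.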