July 2013, Apache Lucene™ 4.4 available
The Lucene PMC is pleased to announce the release of Apache Lucene 4.4.
Apache Lucene is a high-performance, full-featured text search engine
library written entirely in Java. It is a technology suitable for nearly
any application that requires full-text search, especially cross-platform.
This release contains numerous bug fixes, optimizations, and
improvements, some of which are highlighted below. The release
is available for immediate download at:
See the CHANGES.txt file included with the release for a full list of
changes.
Lucene 4.4 Release Highlights:
* New Replicator module: replicate index revisions between server and
client. See http://shaierera.blogspot.com/2013/05/the-replicator.html
* New AnalyzingInfixSuggester: finds suggestions based on matches to any
tokens in the suggestion, not just based on pure prefix matching. See
* New PatternCaptureGroupTokenFilter: emit multiple tokens, one for each
capture group in one or more Java regexes.
* New Lucene Facet module features:
  * Added dynamic (no taxonomy index used) numeric range faceting (see
  * Arbitrary Query instances are now allowed for per-dimension drill-down
    on DrillDownQuery and DrillSideways, to support future dynamic faceting.
  * New FacetResult.mergeHierarchies: merge multiple FacetResult instances
    of the same dimension into a single one with the reconstructed hierarchy.
* FST's Builder can now handle more than 2.1 billion "tail nodes" while
building a minimal FST.
* FieldCache Ints and Longs now use bit-packing to save memory. String fields
have more efficient compression if there are many unique terms.
* Improved compression for NumericDocValues for dates and fields with very
small numbers of unique values.
* New IndexWriter.hasUncommittedChanges(): returns true if there are changes
that have not been committed.
* multiValuedSeparator in PostingsHighlighter is now configurable, for cases
where you want a different logical separator between field values.
* NorwegianLightStemFilter and NorwegianMinimalStemFilter have been extended
to handle "nynorsk".
* New ScandinavianFoldingFilter and ScandinavianNormalizationFilter.
* Easier compressed norms: Lucene42NormsFormat now takes an overhead
parameter, allowing for values other than PackedInts.FASTEST.
* Analyzer now has an additional tokenStream(String fieldName, String text)
method, so wrapping by StringReader for common use is no longer needed.
* New SimpleMergedSegmentWarmer: just ensures that data structures
(terms, norms, docvalues, etc.) are initialized.
* IndexWriter flushes segments to the compound file format by default.
* Various bugfixes and optimizations since the 4.3.1 release.
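To make the PatternCaptureGroupTokenFilter highlight above more concrete, here
is a minimal sketch of the underlying idea using only java.util.regex. It is
not Lucene's implementation and does not use the Lucene API: the class name
CaptureGroupTokens and the method captureGroupTokens are hypothetical, and the
sketch assumes that every non-empty capture group of every match becomes one
output token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a hypothetical helper, not part of Lucene.
public class CaptureGroupTokens {

    // Returns one token per non-empty capture group, for every match
    // of every supplied pattern against the input token.
    public static List<String> captureGroupTokens(String token, Pattern... patterns) {
        List<String> out = new ArrayList<>();
        for (Pattern p : patterns) {
            Matcher m = p.matcher(token);
            while (m.find()) {
                // Group 0 is the whole match; capture groups start at 1.
                for (int g = 1; g <= m.groupCount(); g++) {
                    String s = m.group(g);
                    if (s != null && !s.isEmpty()) {
                        out.add(s);
                    }
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Pattern capitalized = Pattern.compile("([A-Z][a-z]+)");
        // prints [Case, Filter]
        System.out.println(captureGroupTokens("camelCaseFilter", capitalized));
    }
}
```

The real filter operates on a token stream inside an analysis chain; this
sketch only shows how one input token can fan out into several tokens.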
Please read CHANGES.txt for a full list of new features.
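As background for the FieldCache bit-packing highlight, the following sketch
shows the general technique of storing small non-negative integers with a
fixed number of bits each inside a long[] array. It is an assumption-laden
illustration, not Lucene's PackedInts code: the class PackedIntsSketch and all
of its methods are hypothetical, and it supports 1 to 63 bits per value only.

```java
// Illustrative only: a hypothetical class, not Lucene's PackedInts.
public class PackedIntsSketch {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    public PackedIntsSketch(int valueCount, int bitsPerValue) {
        if (valueCount < 0 || bitsPerValue < 1 || bitsPerValue > 63) {
            throw new IllegalArgumentException("unsupported size or bit width");
        }
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) valueCount * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) / 64)];
    }

    // Smallest bit width able to hold every value in [0, maxValue].
    public static int bitsRequired(long maxValue) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    public int blockCount() {
        return blocks.length;
    }

    public void set(int index, long value) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);   // which 64-bit block
        int shift = (int) (bitPos & 63);    // bit offset inside that block
        long v = value & mask;
        blocks[block] = (blocks[block] & ~(mask << shift)) | (v << shift);
        int spill = shift + bitsPerValue - 64; // bits that overflow into the next block
        if (spill > 0) {
            int lowBits = bitsPerValue - spill;
            long highMask = (1L << spill) - 1;
            blocks[block + 1] = (blocks[block + 1] & ~highMask) | (v >>> lowBits);
        }
    }

    public long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long v = blocks[block] >>> shift;
        int spill = shift + bitsPerValue - 64;
        if (spill > 0) {
            v |= blocks[block + 1] << (bitsPerValue - spill);
        }
        return v & mask;
    }

    public static void main(String[] args) {
        int bits = bitsRequired(1023); // 10 bits cover values 0..1023
        PackedIntsSketch packed = new PackedIntsSketch(1000, bits);
        for (int i = 0; i < 1000; i++) {
            packed.set(i, (i * 7) % 1024);
        }
        System.out.println(packed.get(3) + " stored in " + packed.blockCount() + " longs");
    }
}
```

With 10 bits per value, 1000 values fit in 157 longs (1256 bytes) rather than
the 4000 bytes an int[1000] would need, which is the kind of saving the
highlight refers to.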
Please report any feedback to the mailing lists
Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using
may not have replicated the release yet. If that is the case, please
try another mirror. This also goes for Maven access.