...
IndexWriter allows you to delete by Term or by Query. The deletes are buffered and then periodically flushed to the index, and made visible once commit() or close() is called.
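The buffering behaviour described above can be pictured with a small toy model. This is a sketch in plain Java, not Lucene code: ToyBufferingWriter and all of its methods are hypothetical names invented for illustration, and the model ignores flushing, segments, and concurrency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: hypothetical class, NOT Lucene's IndexWriter. */
class ToyBufferingWriter {
    // Documents visible to searchers (the "committed" view).
    private final Set<String> visible = new TreeSet<>();
    // Deletes requested since the last commit, not yet visible.
    private final List<String> bufferedDeletes = new ArrayList<>();

    /** Adds a document straight to the visible view, to keep the sketch short. */
    void addDocument(String id) {
        visible.add(id);
    }

    /** Buffers a delete; searchers do not see it yet. */
    void deleteDocument(String id) {
        bufferedDeletes.add(id);
    }

    /** Applies every buffered delete, making them visible to searchers. */
    void commit() {
        visible.removeAll(bufferedDeletes);
        bufferedDeletes.clear();
    }

    /** What a searcher would currently see. */
    boolean isVisible(String id) {
        return visible.contains(id);
    }

    public static void main(String[] args) {
        ToyBufferingWriter writer = new ToyBufferingWriter();
        writer.addDocument("doc-1");
        writer.deleteDocument("doc-1");
        System.out.println(writer.isVisible("doc-1")); // prints true: delete is still buffered
        writer.commit();
        System.out.println(writer.isVisible("doc-1")); // prints false: delete has been applied
    }
}
```

The point of the sketch is only the ordering: a delete request changes nothing a searcher can observe until the commit step runs.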
IndexReader can also delete documents, by Term or document number, but you must close any open IndexWriter before using IndexReader to make changes (and vice versa). IndexReader also buffers the deletions and does not write changes to the index until close() is called, but if you use that same IndexReader for searching, the buffered deletions take effect immediately. Unlike IndexWriter's delete methods, IndexReader's methods return the number of documents that were deleted.
Generally it's best to use IndexWriter for deletions, unless 1) you must delete by document number, 2) you need your searches to reflect the deletions immediately, or 3) you must know how many documents were deleted for a given deleteDocuments invocation.
If you must delete by document number but would otherwise like to use IndexWriter, IndexWriter provides tryDeleteDocument. Note, however, that this method only succeeds if the segment that the doc ID belongs to has not been merged away. It is generally preferred to make a primary key field that holds a unique ID string for each document. Then you can delete a single document by creating the Term containing the ID and passing it to IndexWriter's deleteDocuments(Term) method.
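To see why a primary key is more robust than a document number, consider the following toy model. It is a plain-Java sketch, not Lucene code: ToyKeyedIndex and its methods are hypothetical, keys are assumed to be unique, and "merge" here simply drops deleted entries and renumbers the rest.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy model only: hypothetical class, not part of Lucene. */
class ToyKeyedIndex {
    // Position in this list plays the role of the document number.
    private List<String> keys = new ArrayList<>();
    // Marks document numbers that are deleted but not yet merged away.
    private BitSet deleted = new BitSet();

    void add(String primaryKey) {
        keys.add(primaryKey);
    }

    /** Current document number for a live key, or -1 if absent or deleted. */
    int docNumberOf(String primaryKey) {
        int n = keys.indexOf(primaryKey);
        return (n >= 0 && !deleted.get(n)) ? n : -1;
    }

    /** Delete by primary key: independent of how documents are numbered. */
    void deleteByKey(String primaryKey) {
        int n = keys.indexOf(primaryKey);
        if (n >= 0) {
            deleted.set(n);
        }
    }

    /** Simulated merge: drops deleted documents and renumbers the rest. */
    void merge() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < keys.size(); i++) {
            if (!deleted.get(i)) {
                live.add(keys.get(i));
            }
        }
        keys = live;
        deleted = new BitSet();
    }

    public static void main(String[] args) {
        ToyKeyedIndex index = new ToyKeyedIndex();
        index.add("id-a");
        index.add("id-b");
        index.add("id-c");
        int remembered = index.docNumberOf("id-c"); // 2 before the merge
        index.deleteByKey("id-a");
        index.merge();
        // The remembered number is stale; the key still resolves correctly.
        System.out.println(remembered + " -> " + index.docNumberOf("id-c")); // prints "2 -> 1"
    }
}
```

In this model a document number remembered before the merge no longer points at the same document afterwards, while the key continues to identify it.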
Once a document is deleted, it will not appear in TermDocs or TermPositions enumerations, nor in any search results. Attempts to load the document will result in an exception. The presence of the document may still be reflected in the docFreq statistics, and thus alter search scores, though this will be corrected eventually as segments containing deletions are merged.
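The gap between enumerations and docFreq can be illustrated with a toy segment. This is a plain-Java sketch under simplifying assumptions, not Lucene's implementation: ToySegment is a hypothetical class, each document holds exactly one term, and the statistic is computed by counting every posting regardless of deletion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

/** Toy model only: hypothetical class, not part of Lucene. */
class ToySegment {
    private final List<String> docs; // one term per document, index = document number
    private final BitSet deleted = new BitSet();

    ToySegment(List<String> docs) {
        this.docs = new ArrayList<>(docs);
    }

    /** Marks a document as deleted without rewriting the segment. */
    void delete(int docNumber) {
        deleted.set(docNumber);
    }

    /** Statistic over all postings: still counts deleted documents. */
    int docFreq(String term) {
        int count = 0;
        for (String doc : docs) {
            if (doc.equals(term)) {
                count++;
            }
        }
        return count;
    }

    /** Enumeration as a search would see it: deleted documents are skipped. */
    List<Integer> liveMatches(String term) {
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && docs.get(i).equals(term)) {
                matches.add(i);
            }
        }
        return matches;
    }

    /** Simulated merge: writes a new segment containing only live documents. */
    ToySegment merge() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        return new ToySegment(live);
    }

    public static void main(String[] args) {
        ToySegment segment = new ToySegment(Arrays.asList("apple", "apple", "pear"));
        segment.delete(0);
        System.out.println(segment.liveMatches("apple"));     // prints [1]
        System.out.println(segment.docFreq("apple"));         // prints 2 (stale statistic)
        System.out.println(segment.merge().docFreq("apple")); // prints 1 (corrected by merge)
    }
}
```

Until the simulated merge runs, the statistic overcounts by the number of deleted matches, which is the effect on scoring that the paragraph above describes.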
...