On 10/9/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
don't forget to optimize your index every now and then as well... deleting
a document just marks it as "deleted"; it still gets inspected by every
query during scoring at least once to see that it can skip it. optimizing
is the only thing that truly removes the "deleted" documents.

I'd refine that statement to "optimizing is the easiest way to remove
any deleted documents that still exist in the index".

Deleted documents are removed from segments that are merged, so it
depends on things like the mergeFactor, maxBufferedDocs, and where the
deleted docs are in the index (in the smallest or largest segments).
Some deleted docs will be removed quickly, but some won't.
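
To make that concrete, here is a toy model in Python. It is not Lucene code:
the list-of-dicts segment layout and the names merge_segments and
count_deleted are made up purely for illustration. It only shows that a
deleted doc disappears when the segment holding it takes part in a merge,
and lingers otherwise.

```python
def merge_segments(segments):
    """Merge toy segments into one, dropping docs marked as deleted."""
    return [doc for seg in segments for doc in seg if not doc["deleted"]]


def count_deleted(segments):
    """Count docs that are still physically present but marked deleted."""
    return sum(1 for seg in segments for doc in seg if doc["deleted"])


# Hypothetical index: three segments, three docs marked deleted.
segments = [
    [{"id": 1, "deleted": False}, {"id": 2, "deleted": True}],
    [{"id": 3, "deleted": True}],
    [{"id": 4, "deleted": False}, {"id": 5, "deleted": True}],
]

# A partial merge touches only the last two segments, so the deleted
# doc in the first segment survives.
partially_merged = [segments[0], merge_segments(segments[1:])]

# Merging everything into one segment (what an optimize amounts to in
# this toy model) removes the rest.
optimized = [merge_segments(partially_merged)]
```

Here count_deleted(partially_merged) still reports one leftover deleted
doc, while count_deleted(optimized) reports none.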

Optimizing an index also has a beneficial effect on search speed even
beyond removing all of the deleted docs.  Each index segment is
actually a complete index on its own... so if search is generally
O(log(N)), searching across M segments of size N will take
O(M * log(N)).  If those segments are "optimized" into a single segment,
the search will be O(log(M*N)).
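
A quick back-of-the-envelope version of that comparison, as a Python
sketch. The function names are mine, and the numbers are unitless growth
terms from the big-O estimate above, not measured search times.

```python
import math


def multi_segment_cost(m, n):
    """Growth term for searching m separate segments of n docs each."""
    return m * math.log2(n)


def single_segment_cost(m, n):
    """Growth term for searching one merged segment of m * n docs."""
    return math.log2(m * n)


# Example: 8 segments of 1024 docs each.
before = multi_segment_cost(8, 1024)   # 8 * 10 = 80.0
after = single_segment_cost(8, 1024)   # log2(8192) = 13.0
```

The gap widens as the number of segments grows, since the first term is
linear in M and the second only logarithmic.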

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
