I am looking at the IndexWriter source code - and I could not find
a method (private) to delete by doc id.
Where is it hiding?
Best,
- C.B.
On Jan 19, 2008 1:07 PM, Michael McCandless <
[EMAIL PROTECTED]> wrote:
Good question....
So far, this method has not been carried over to IndexWriter because
in general it's not really safe, since there's no way to get an
accurate docID from IndexWriter itself.
You can't really "know" when IndexWriter does merges that compacts
deletes and thus changes docIDs. So, if you open a reader on the
side, get a docID you want to delete, and then go and ask IndexWriter
to delete that docID, you may in fact delete the wrong document. In
2.3 , where segment merges are now done with a background thread,
it's
even worse, because a merge could complete and be committed, thus
changing docIDs, at any time...
See complex discussion here:
http://markmail.org/message/wxqel3gd6cmavk5a
As of 2.3, the low level infrastructure was added to IndexWriter for
deleting by document ID, but this is not exposed publicly (this was a
side effect of LUCENE-1112). It's only used, internally, to delete a
document if an exception is hit while indexing it. In theory, you
could then subclass IndexWriter and tap into this infrastructure to
delete by docID, but, you're entering dangerous territory!
Do you have a specific use case in mind here? I think we'd like to
make this option available someday in IndexWriter, but doing so now
(when there is no way to get a "reliable" docID) seems too
dangerous...
Mike
Cam Bazz wrote:
> Hello,
>
> How do I delete a specific document from an indexwriter? I
> understand there
> is deleteDocuments(term) which deletes all the documents matching
> the term.
> But what if I want to delete a document that has more then one
term in
> specific. I can search the document with a boolean query, and then
> get the
> doc id.
> I know that doc ids are temporary, but can I not use it for delete?
>
> IndexReader has a delete by doc id method, but I am not sure how to
> use this
> when using an indexwriter.
>
> Best,
> C.B.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]