For this case, too, you will need to use an IndexReader, or use IndexSearcher to run that particular search and then delete the docIDs returned using the IndexReader.

Though, be sure to first iterate through all hits, gathering all docIDs. And then in 2nd pass, do the deletions. Otherwise you'll hit this issue:

    https://issues.apache.org/jira/browse/LUCENE-1096

(Unless you're using 2.3).

You can also use Solr, which provides "delete by query".

Mike

Cam Bazz wrote:

Hello Mike;

How about deleting by a compount term?

for example if I have a document with two fields srcId and dstId
and I want to delete the document where srcId=1 and dstId=2

right now there exists a IndexWriter.deleteDocuments(Term t) but with that I
can only delete lets say where srcId=something.

I am sure there is a workaround but I could not find it.

Best,

On Jan 19, 2008 1:07 PM, Michael McCandless <[EMAIL PROTECTED]>
wrote:


Good question....

So far, this method has not been carried over to IndexWriter because
in general it's not really safe, since there's no way to get an
accurate docID from IndexWriter itself.

You can't really "know" when IndexWriter does merges that compacts
deletes and thus changes docIDs.  So, if you open a reader on the
side, get a docID you want to delete, and then go and ask IndexWriter
to delete that docID, you may in fact delete the wrong document.  In
2.3, where segment merges are now done with a background thread, it's
even worse, because a merge could complete and be committed, thus
changing docIDs, at any time...

See complex discussion here:

    http://markmail.org/message/wxqel3gd6cmavk5a

As of 2.3, the low level infrastructure was added to IndexWriter for
deleting by document ID, but this is not exposed publicly (this was a
side effect of LUCENE-1112).  It's only used, internally, to delete a
document if an exception is hit while indexing it.  In theory, you
could then subclass IndexWriter and tap into this infrastructure to
delete by docID, but, you're entering dangerous territory!

Do you have a specific use case in mind here?  I think we'd like to
make this option available someday in IndexWriter, but doing so now
(when there is no way to get a "reliable" docID) seems too dangerous...

Mike

Cam Bazz wrote:

Hello,

How do I delete a specific document from an indexwriter? I
understand there
is deleteDocuments(term) which deletes all the documents matching
the term.
But what if I want to delete a document that has more then one term in
specific. I can search the document with a boolean query, and then
get the
doc id.
I know that doc ids are temporary, but can I not use it for delete?

IndexReader has a delete by doc id method, but I am not sure how to
use this
when using an indexwriter.

Best,
C.B.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to