Thanks for the advice, everyone. I'll try updateDocument() for now.

Sean

On Thu, Jul 12, 2012 at 3:25 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
> On Thu, Jul 12, 2012 at 6:17 PM, Simon Willnauer
> <simon.willna...@gmail.com> wrote:
>> Sean seriously a couple of hundred docs a second, don't bother just
>> use updateDocument. My benchmarks show that there is only a smallish
>> impact during indexing especially with concurrent flushing in lucene
>> 4. I don't know how resource intensive your analysis chain is but on a
>> decent machine you can easily go > 20k docs a second with
>> updateDocument.
>>
>> If you want to give deleteByDocid a try for kicks I'd be curious how
>> you solve some of the really tricky issues! :)
>
> This (add delete-by-docID to IndexWriter) has been fairly frequently
> requested...
>
> But the problem is docIDs can suddenly change up whenever a merge
> commits, so I don't see how we can add it in general.
>
> That said, there is an initial patch here:
>
> https://issues.apache.org/jira/browse/LUCENE-4203
>
> It adds IW.tryDeleteDocument(AtomicReader reader, int docID), with the
> requirement that the reader is a near-real-time reader obtained from
> the writer. The delete will succeed (return true) if that reader has
> not yet been merged away, else it fails (returns false) and you have
> to do the delete the "normal" way (by Term).
>
> I won't have much time to get back to that issue in the near future so
> feel free to take it!
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
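
For anyone reading the archive later, here is a toy sketch of the renumbering problem Mike describes in the quoted message. It is not Lucene code: ToyIndex, Segment, flush, tryDelete, deleteByKey and mergeAll are all made-up names for illustration, and real segments, flushes and merges are far more involved. It only tries to show why a docID captured before a merge cannot be trusted afterwards, while a delete keyed on a term still works.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only, NOT Lucene: every name here is hypothetical. */
class ToyIndex {
    /** One document: a unique key (standing in for a Term) plus a deleted flag. */
    static final class Doc {
        final String key;
        boolean deleted;
        Doc(String key) { this.key = key; }
    }

    /** A segment is a list of docs; the local docID is the list position. */
    static final class Segment {
        final List<Doc> docs = new ArrayList<>();
    }

    private final List<Segment> segments = new ArrayList<>();

    /** Writes the given keys as a new segment, loosely like a flush. */
    Segment flush(String... keys) {
        Segment s = new Segment();
        for (String k : keys) s.docs.add(new Doc(k));
        segments.add(s);
        return s;
    }

    /** Global docID = size of all earlier segments + local position; -1 if absent. */
    int docId(String key) {
        int base = 0;
        for (Segment s : segments) {
            for (int i = 0; i < s.docs.size(); i++) {
                Doc d = s.docs.get(i);
                if (!d.deleted && d.key.equals(key)) return base + i;
            }
            base += s.docs.size();
        }
        return -1;
    }

    /** Delete by key: works no matter how the segments have been merged. */
    boolean deleteByKey(String key) {
        for (Segment s : segments)
            for (Doc d : s.docs)
                if (!d.deleted && d.key.equals(key)) { d.deleted = true; return true; }
        return false;
    }

    /** Delete by (segment, local docID): fails once that segment has been merged away. */
    boolean tryDelete(Segment seg, int localDocId) {
        if (!segments.contains(seg)) return false;
        seg.docs.get(localDocId).deleted = true;
        return true;
    }

    /** Merges all segments into one, dropping deleted docs, which renumbers the survivors. */
    void mergeAll() {
        Segment merged = new Segment();
        for (Segment s : segments)
            for (Doc d : s.docs)
                if (!d.deleted) merged.docs.add(d);
        segments.clear();
        segments.add(merged);
    }
}

class DocIdShiftDemo {
    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ToyIndex.Segment first = index.flush("a", "b", "c");
        index.flush("d", "e");

        int before = index.docId("d");
        index.deleteByKey("b");
        index.mergeAll();                        // "b" is dropped, later docs shift down
        int after = index.docId("d");
        System.out.println("docID of d: " + before + " -> " + after);

        // Try the cheap delete first, then fall back to the key-based delete.
        boolean ok = index.tryDelete(first, 0);  // fails: 'first' was merged away
        if (!ok) ok = index.deleteByKey("a");
        System.out.println("deleted a: " + ok);
    }
}
```

In real code the same try-then-fall-back shape would presumably use IW.tryDeleteDocument() from the LUCENE-4203 patch, with a delete by Term (or simply updateDocument()) as the fallback, but I have not checked that against the patch itself.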