Hi Hilton,

Hilton Campbell wrote:
> Yes, that's actually come up. The document ids are indeed changing which is
> causing problems. I'm still trying to work it out myself, but any help
> would most definitely be appreciated.
>
> Thanks,
> Hilton Campbell
>
> -----Original Message-----
> From: Antony Bowesman [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, June 06, 2007 11:36 PM
> To: java-user@lucene.apache.org
> Subject: Re: How can I search over all documents NOT in a certain subset?
>
> Steven Rowe wrote:
>> Conceptually (caveat: untested), you could:
>>
>> 1. Extend Filter[1] (call it DejaVuFilter) to hold a BitSet per
>> IndexReader. The BitSet would hold one bit per doc[2], each initialized
>> to true.
>>
>> 2. Unset a DejaVuFilter instance's bit for each of your top N docs by
>> walking the TopDocs returned by Searcher.search(Query,Filter,int)[3].
>> Initially, you could pass in null for the Filter, and then for all
>> following calls, an instance of DejaVuFilter.
>
> Just a thought...
>
> If Hilton wants to be aware of new Documents in the index since the previous
> search, this requires opening a new IndexReader.
>
> If only Documents have been added to the index I expect, but am not
> sure, that the bits from the old IndexReader are still valid for the
> document numbers in the new Reader. However, if there have been
> deletions or optimisation has occurred between reader instances, then
> the document ids from the old reader may not represent the same
> documents in the new reader, so the Filter for the old reader will
> not be valid for the new search against the new reader and you may
> get false matches.
>
> I don't think there will be a problem if there are no deletions.

My bad for not pointing out this shortcoming.

Karl Wettin's patch may be useful to you:

<https://issues.apache.org/jira/browse/LUCENE-879>

Steve

-- 
Steve Rowe
Center for Natural Language Processing
http://www.cnlp.org/tech/lucene.asp
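
P.S. In case it helps to see the shortcoming concretely, here is a rough plain-Java sketch of the document-number shift Antony describes. It deliberately does not use Lucene: each "reader" is modelled as a list whose index plays the role of the document number, and the class name DejaVuSketch, the helper names, and the idea of a stable per-document key (for example a stored unique id field) are all assumptions made for illustration. It is not what the LUCENE-879 patch does; it only shows why bits recorded against an old reader can give false matches after a deletion plus optimize, and one possible workaround of rebuilding the bits per reader from stable keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, Lucene-free model: list index stands in for the document number.
class DejaVuSketch {

    /** Simulates delete + optimize: the remaining docs are renumbered without gaps. */
    public static List<String> deleteAndCompact(List<String> keysByDocId, String deletedKey) {
        List<String> compacted = new ArrayList<>(keysByDocId);
        compacted.remove(deletedKey);
        return compacted;
    }

    /**
     * Builds the "not yet seen" bits for one reader view: every bit starts set,
     * and the bit is cleared for each doc whose stable key was already returned.
     */
    public static BitSet rebuild(List<String> keysByDocId, Set<String> seenKeys) {
        BitSet bits = new BitSet(keysByDocId.size());
        bits.set(0, keysByDocId.size());
        for (int docId = 0; docId < keysByDocId.size(); docId++) {
            if (seenKeys.contains(keysByDocId.get(docId))) {
                bits.clear(docId);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // Old reader: a=0, b=1, c=2, d=3. Document "c" was already shown to the user.
        List<String> oldView = Arrays.asList("a", "b", "c", "d");
        Set<String> seenKeys = new HashSet<>(Arrays.asList("c"));
        BitSet staleBits = rebuild(oldView, seenKeys);

        // "a" is deleted and the index optimized, so the new reader sees b=0, c=1, d=2.
        List<String> newView = deleteAndCompact(oldView, "a");

        // Reusing the old bits against the new reader: "c" now sits at doc 1,
        // whose bit is still set, so it slips through; "d" at doc 2 is hidden instead.
        for (int docId = 0; docId < newView.size(); docId++) {
            System.out.println(newView.get(docId) + " allowed by stale bits: " + staleBits.get(docId));
        }

        // Rebuilding from stable keys against the new reader gives the intended result.
        BitSet freshBits = rebuild(newView, seenKeys);
        for (int docId = 0; docId < newView.size(); docId++) {
            System.out.println(newView.get(docId) + " allowed by rebuilt bits: " + freshBits.get(docId));
        }
    }
}
```

The trade-off in this sketch is that the bits have to be recomputed every time a new reader is opened, which costs a pass over the seen keys; whether that is acceptable depends on how many documents have been seen and how often the reader is reopened.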