I thought along the line of dummy:all as well, but for some reason, I chose bitset. I am not sure if it matters which route to go.
My situation is that for now, I have say 10000 documents, but, maybe I incease that by 100 time or more. I would guess that half of the documents qualifies, and half don't. It appears that everyone suggests that I take MatchAllDocsQuery from the trunk. Is this the choice regardless of the number of documents I have ? Thanks, On 1/6/06, Beady Geraghty <[EMAIL PROTECTED]> wrote: > > Thank you all for your answer. > > On 1/6/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: > > > > Should we should detect the case of all negative clauses and throw in > > a MatchAllDocsQuery? > > > > I guess this would be done in the QueryParser, but one could also make > > a case for doing it in the BooleanQuery. > > > > -Yonik > > > > On 1/6/06, Erik Hatcher <[EMAIL PROTECTED]> wrote: > > > With Lucene's trunk, there is a MatchAllDocsQuery. You could use > > > this in a BooleanQuery with your negative-only query. > > > > > > Another option, if you're at Lucene 1.4.3 is to index the same value > > > for a dummy field for every document (say like "dummy:all") and use a > > > TermQuery in a BooleanQuery with the negative-only query. > > > > > > As for BitSet's, if you need to go that route a QueryFilter would > > > give you the BitSet back that you could easily complement, but that > > > might be a bit overkill for what you need given the option above. > > > > > > Erik > > > > > > > > > On Jan 6, 2006, at 12:04 PM, Beady Geraghty wrote: > > > > > > > I would like to do queries that are negative. I mean a query with > > > > only negative terms and phrases. For example, retrieve all > > > > documents that do not contain the term "apple". > > > > > > > > For now, I have a limited set of documents (say, 10000) to index. > > > > I can create a bitset that represents the search result of hits on > > > > "apple". > > > > Then I complement (XOR) the result. > > > > Each bit corresponds to a document ID. > > > > My question is : > > > > Inside Lucene, are the hits represented in some form of a bitset. > > > > Can I get at it directly. I saw the BitSet class. (I now use > > > > Java's Bitset class). > > > > Assuming that hits are internally represented as bitset, for a > > > > small number of documets, the bitset won't be very big, > > > > and if there are plenty of hits and many many more documents, > > > > is the bitset still kept entirely > > > > in memory as well ? > > > > > > > > Thank you > > > > > > > > > --------------------------------------------------------------------- > > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > >