I would like to run purely negative queries, that is, queries made up only of negated terms and phrases. For example: retrieve all documents that do not contain the term "apple".
For now, I have a limited set of documents (say, 10,000) to index. I can build a bitset, in which each bit corresponds to a document ID, that represents the hits for "apple", and then complement it (flip every bit, which is the same as XOR with an all-ones mask). My questions: inside Lucene, are hits represented as some form of bitset, and can I get at it directly? I saw the BitSet class (I currently use Java's BitSet class). Assuming hits are represented internally as a bitset: for a small number of documents the bitset won't be very big, but if there are plenty of hits and many more documents, is the bitset still kept entirely in memory? Thank you.
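
For reference, the complement step described above can be sketched roughly as follows. This is only a minimal illustration using plain `java.util.BitSet`; the class name `NegativeQuerySketch`, the helper `complement`, and the sample document IDs are made up for the example and say nothing about how Lucene stores hits internally.

```java
import java.util.BitSet;

public class NegativeQuerySketch {

    // Hypothetical helper: given the hits for a term, return the documents
    // that did NOT match. numDocs is the total number of indexed documents.
    static BitSet complement(BitSet hits, int numDocs) {
        // Work on a copy so the original hit set is left untouched.
        BitSet result = (BitSet) hits.clone();
        // Flip only bits 0..numDocs-1, so no bits are set past the last doc ID.
        result.flip(0, numDocs);
        return result;
    }

    public static void main(String[] args) {
        int numDocs = 10;                 // toy index size (assumption)
        BitSet hits = new BitSet(numDocs);
        hits.set(2);                      // pretend docs 2 and 5 contain "apple"
        hits.set(5);

        BitSet misses = complement(hits, numDocs);
        System.out.println(misses);       // prints {0, 1, 3, 4, 6, 7, 8, 9}
    }
}
```

With one bit per document, 10,000 documents need about 1.25 KB, so the size of the bitset depends on the number of documents rather than on the number of hits.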