> Plan A sounds better because I don't want to consider the entire collection > and then remove results from it.
Fine, your choice. > However, the same code has to work with 2 different collections. The first > one has 30.000 docs the other one 90.000. No problem. The number of docs is irrelevant. > How can I get the total amount of docs from a collection ? IndexReader.numDocs(). See also maxDoc() and numDeletedDocs(). -- Ian. > On 29 March 2011 21:48, Ian Lea <ian....@gmail.com> wrote: > >> Here are a couple of ideas. >> >> Plan A. >> >> Think of a number, say 10, retrieve n * 10 docids in your search and >> then loop round java.util.Random.nextInt(n * 10) until you've got >> enough. >> >> Plan B. >> >> Reverse your MUST NOT search to get a list of docids that you don't >> want, then loop round Random.nextInt(indexreader.numDocs()), selecting >> those that are not deleted (!indexreader.isDeleted(docid)) and are not >> in your exclusion list. >> >> >> I'm sure there are other ways, probably better. >> >> >> -- >> Ian. >> >> >> On Tue, Mar 29, 2011 at 8:00 PM, Patrick Diviacco >> <patrick.divia...@gmail.com> wrote: >> > Ok I've solved the first part of the problem. I'm now selecting all >> > documents that do not contain a given term with a BooleanFilter >> > and FilterClause, MUST NOT. >> > >> > I still have to understand how to retrieve random documents and limit the >> > number of retrieved docs to N. >> > >> > thanks >> > >> > On 29 March 2011 20:40, Patrick Diviacco <patrick.divia...@gmail.com> >> wrote: >> > >> >> Is there a Filter to get a limited number of random collection docs from >> >> the index which DO NOT contain a specific term ? >> >> >> >> i.e. term="pizza" >> >> >> >> I want to run the query against 10 random documents of the collection >> that >> >> do not contain the term "pizza". >> >> >> >> thanks >> >> >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org