Plan A sounds better because I don't want to consider the entire collection and then remove results from it.
However, the same code has to work with 2 different collections. The first one has 30,000 docs, the other one 90,000.

How can I get the total number of docs in a collection?

thanks

On 29 March 2011 21:48, Ian Lea <ian....@gmail.com> wrote:

> Here are a couple of ideas.
>
> Plan A.
>
> Think of a number, say 10, retrieve n * 10 docids in your search and
> then loop round java.util.Random.nextInt(n * 10) until you've got
> enough.
>
> Plan B.
>
> Reverse your MUST NOT search to get a list of docids that you don't
> want, then loop round Random.nextInt(indexreader.numDocs()), selecting
> those that are not deleted (!indexreader.isDeleted(docid)) and are not
> in your exclusion list.
>
>
> I'm sure there are other ways, probably better.
>
>
> --
> Ian.
>
>
> On Tue, Mar 29, 2011 at 8:00 PM, Patrick Diviacco
> <patrick.divia...@gmail.com> wrote:
> > Ok I've solved the first part of the problem. I'm now selecting all
> > documents that do not contain a given term with a BooleanFilter
> > and FilterClause, MUST NOT.
> >
> > I still have to understand how to retrieve random documents and limit the
> > number of retrieved docs to N.
> >
> > thanks
> >
> > On 29 March 2011 20:40, Patrick Diviacco <patrick.divia...@gmail.com> wrote:
> >
> >> Is there a Filter to get a limited number of random collection docs from
> >> the index which DO NOT contain a specific term ?
> >>
> >> i.e. term="pizza"
> >>
> >> I want to run the query against 10 random documents of the collection that
> >> do not contain the term "pizza".
> >>
> >> thanks
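For what it's worth, here is a rough sketch of the random-pick step from Plan A as I understand it. It uses plain Java only and no Lucene calls: the candidates list stands in for the n * 10 docids returned by the MUST NOT search, and the class and method names (RandomDocSampler, pickRandom) are made up for illustration. On the collection size, the indexreader.numDocs() call already mentioned in Plan B should give the number of non-deleted docs, while maxDoc() gives the upper bound on doc ids.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
class RandomDocSampler {

    // Picks up to n entries at random from candidates, without picking
    // the same position twice. The candidates are assumed to be the
    // docids that the filtered search returned.
    static List<Integer> pickRandom(List<Integer> candidates, int n, Random rng) {
        int wanted = Math.min(n, candidates.size());

        // Keep drawing positions until we have enough distinct ones.
        Set<Integer> positions = new LinkedHashSet<>();
        while (positions.size() < wanted) {
            positions.add(rng.nextInt(candidates.size()));
        }

        List<Integer> picked = new ArrayList<>();
        for (int pos : positions) {
            picked.add(candidates.get(pos));
        }
        return picked;
    }

    public static void main(String[] args) {
        // Made-up docids standing in for a real search result.
        List<Integer> candidates = Arrays.asList(4, 8, 15, 16, 23, 42, 57, 61, 70, 99);
        System.out.println(pickRandom(candidates, 3, new Random()));
    }
}
```

Since the method only looks at the list it is given, the same code should work for both the 30,000 and the 90,000 doc collection without knowing their sizes up front.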