Plan A sounds better because I don't want to consider the entire collection
and then remove results from it.

However, the same code has to work with 2 different collections. The first
one has 30.000 docs the other one 90.000.

How can I get the total amount of docs from a collection ?
thanks




On 29 March 2011 21:48, Ian Lea <ian....@gmail.com> wrote:

> Here are a couple of ideas.
>
> Plan A.
>
> Think of a number, say 10, retrieve n * 10 docids in your search and
> then loop round java.util.Random.nextInt(n * 10) until you've got
> enough.
>
> Plan B.
>
> Reverse your MUST NOT search to get a list of docids that you don't
> want, then loop round Random.nextInt(indexreader.numDocs()), selecting
> those that are not deleted (!indexreader.isDeleted(docid)) and are not
> in your exclusion list.
>
>
> I'm sure there are other ways, probably better.
>
>
> --
> Ian.
>
>
> On Tue, Mar 29, 2011 at 8:00 PM, Patrick Diviacco
> <patrick.divia...@gmail.com> wrote:
> > Ok I've solved the first part of the problem. I'm now selecting all
> > documents that do not contain a given term with a BooleanFilter
> > and FilterClause, MUST NOT.
> >
> > I still have to understand how to retrieve random documents and limit the
> > number of retrieved docs to N.
> >
> > thanks
> >
> > On 29 March 2011 20:40, Patrick Diviacco <patrick.divia...@gmail.com>
> wrote:
> >
> >> Is there a Filter to get a limited number of random collection docs from
> >> the index which DO NOT contain a specific term ?
> >>
> >> i.e. term="pizza"
> >>
> >> I want to run the query against 10 random documents of the collection
> that
> >> do not contain the term "pizza".
> >>
> >> thanks
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>

Reply via email to