Excellent, caching filters seem to fit the bill best, so I will use
those, with the flags stored in the underlying index in the format you
suggested.  Thank you for the assistance.

Larry

-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 10, 2006 12:27 PM
To: java-user@lucene.apache.org
Subject: Re: Searching by bit masks

Erick Erickson wrote:
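> Something like
> Document doc = new Document();
> // Lucene 2.0-style Field constructor; flags are indexed as-is, not stored
> doc.add(new Field("flag1", "Y", Field.Store.NO, Field.Index.UN_TOKENIZED));
> doc.add(new Field("flag2", "Y", Field.Store.NO, Field.Index.UN_TOKENIZED));
> writer.addDocument(doc);  // writer: an already-open IndexWriter (assumed)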

Fields have overheads.  It would be more efficient to implement this as 
a single field with a different value for each boolean flag (as others 
have suggested).

> Another approach: create a set of Lucene Filters (really, these are just
> Java bitsets), one for each flag. All this is a bitset with one bit for
> each document, or about 1M of memory per flag with 8M docs. So you'd
> populate flag1Filter, flag2Filter... and have these ready whenever you
> needed them.
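
The "about 1M per flag" figure and the shape of the per-flag bitsets can be sketched in plain Java. This is only an illustration built on java.util.BitSet, not Lucene's Filter API; the names FlagBitsets, bytesPerFlag, markFlag and docsWithAll are made up for the example.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one bitset per flag, one bit per document.
class FlagBitsets {
    private final int maxDoc;
    private final Map<String, BitSet> bitsByFlag = new HashMap<>();

    FlagBitsets(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    // Bytes needed for one flag: one bit per document, rounded up to whole bytes.
    static long bytesPerFlag(long numDocs) {
        return (numDocs + 7) / 8;
    }

    // Record that document docId has the given flag set.
    void markFlag(String flag, int docId) {
        bitsByFlag.computeIfAbsent(flag, f -> new BitSet(maxDoc)).set(docId);
    }

    // Documents that have every one of the given flags (bitwise AND of the bitsets).
    BitSet docsWithAll(String... flags) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc);               // start from "all documents"
        for (String flag : flags) {
            BitSet bits = bitsByFlag.get(flag);
            if (bits == null) {              // flag never set on any document
                result.clear();
                return result;
            }
            result.and(bits);
        }
        return result;
    }
}
```

With 8M documents, bytesPerFlag(8_000_000) comes to 1,000,000 bytes, which matches the estimate quoted above.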

Cached filters will be faster, especially when a large portion of the 
documents have the flag set.  If, for example, you have a flag that is 
set in half the documents and is specified in half the queries, then a 
cached filter will have a large impact not only on the performance of 
those queries but on the performance of your service as a whole.
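
The gain comes from paying for a flag's bitset once and reusing it across queries. A rough sketch of that idea in plain Java follows; it does not use Lucene's actual filter classes. CachedFlagFilter and its loader function are hypothetical names, and the loader stands in for whatever scans the index to find the documents with a given flag.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch of a cached filter: build each flag's bitset once, reuse it per query.
class CachedFlagFilter {
    private final Function<String, BitSet> loader;          // expensive: scans the index for one flag
    private final Map<String, BitSet> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loads = new AtomicInteger(); // how many times the loader actually ran

    CachedFlagFilter(Function<String, BitSet> loader) {
        this.loader = loader;
    }

    // Return the cached bitset for a flag, computing it on first use only.
    BitSet bitsFor(String flag) {
        return cache.computeIfAbsent(flag, f -> {
            loads.incrementAndGet();
            return loader.apply(f);
        });
    }

    // Restrict a query's matching documents to those with the flag set.
    BitSet filter(BitSet queryHits, String flag) {
        BitSet result = (BitSet) queryHits.clone();          // leave the caller's bitset untouched
        result.and(bitsFor(flag));
        return result;
    }

    int loadCount() {
        return loads.get();
    }
}
```

Calling filter repeatedly with the same flag runs the loader only on the first call; every later query pays just for the bitwise AND.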

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

