Excellent, caching filters seem to fit the bill best so will use those with the flags stored in the underlying index in the format you suggested. Thank you for the assistance.
Larry -----Original Message----- From: Doug Cutting [mailto:[EMAIL PROTECTED] Sent: Friday, November 10, 2006 12:27 PM To: java-user@lucene.apache.org Subject: Re: Searching by bit masks Erick Erickson wrote: > Something like > Document doc = new Document(); > doc.add("flag1", "Y"); > doc.add("flag2", "Y"); > IndexWriter.add(doc); Fields have overheads. It would be more efficient to implement this as a single field with a different value for each boolean flag (as others have suggested). > Another approach: create a set of Lucene Filters (really, these are just > Java bitsets), one for each flag. All this is a bitset with one bit for > each > document, or about 1M of memory per flag with 8M docs. So you'd populate > flag1Filter, flag2Filter... and have these ready whenever you needed them. Cached filters will be faster especially when a large portion of the documents have the flag set. If, for example, you have a flag that is set in half the documents that is specified in half the queries, then a cached filter will have a large impact on not only the performance of those queries but on the performance of your service as a whole. Doug --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]