Mike, Mike, I have an implementation of FieldCacheTermsFilter (which uses field cache to filter for a predefined set of terms) around if either of you are interested. It is faster than materializing the filter roughly when the filter matches more than 1% of the documents.
So it's not better for a large set of small filters (which you can materialize on the spot) but it is better for a small set (but more than 32) large filters. Let me know if you're interested and I'll send it in. Tim On 12/10/08 3:34 AM, "Michael McCandless" <[EMAIL PROTECTED]> wrote: > > In your approach, roughly how many filters do you have cached? It > seems like it could be quite a few (one for each color, one for each > type, etc)? > > You might be able to modify the new (on Lucene trunk) > FieldCacheRangeFilter to achieve this same filtering without actually > having to materialize the full bitset for each. > > Mike > > Michael Stoppelman wrote: > >> Yeah looks similar to what we've implemented for ourselves (although I >> haven't looked at the implementation). We've got quite a custom >> version of >> lucene at this point. Using Solr at this point really isn't a viable >> option, >> but thanks for pointing this out. >> >> M >> >> On Tue, Dec 9, 2008 at 1:47 AM, Michael McCandless < >> [EMAIL PROTECTED]> wrote: >> >>> >>> This use case sounds alot like faceted navigation, which Solr >>> provides. >>> >>> Mike >>> >>> >>> Michael Stoppelman wrote: >>> >>> Hi all, >>>> >>>> I'm working on upgrading to Lucene 2.4.0 from 2.3.2 and was trying >>>> to >>>> integrate the new DodIdSet changes since >>>> o.a.l.search.Filter#bits() method >>>> is now depreciated. For our app we actually heavily rely on bits >>>> from the >>>> Filter to do post-query filtering (I explain why below). >>>> >>>> For example, if someone searches for product: "ipod" and then >>>> filters a >>>> type: "nano" (e.g. mini/nano/regular) AND color: "red" (e.g. >>>> red/yellow/blue). In our current model the results are gathered in >>>> the >>>> following way: >>>> >>>> 1) "ipod" w/o attributes is run and the results are stored in a >>>> hitcollector >>>> 2) "ipod" results are now filtered for color="red" AND type="mini" >>>> using >>>> the >>>> lucene Filters >>>> 3) The filtered results are returned to the user. >>>> >>>> The reason that the attributes are filtered post-query is so that >>>> we can >>>> return the other types and colors the user can filter by in the >>>> future. >>>> Meaning the UI would be able to show "blue", "green", "pink", >>>> etc... if we >>>> pre-filtered results by color and type before hand we wouldn't >>>> know what >>>> the >>>> other filter options would be there for a broader result set. >>>> >>>> Does anyone else have this use case? I'd imagine other folks are >>>> probably >>>> doing similar things to accomplish this. >>>> >>>> M >>>> >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [EMAIL PROTECTED] >>> For additional commands, e-mail: [EMAIL PROTECTED] >>> >>> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]