We have SortedSetDocValuesField.newSlowRangeQuery() which does something close 
to what you want here, I think.

> On 26 Oct 2021, at 15:23, Michael McCandless <luc...@mikemccandless.com 
> <mailto:luc...@mikemccandless.com>> wrote:
> 
> Hi Team,
> 
> I was discussing this problem with Greg Miller (also at Amazon Product 
> Search):
> 
> If I want to make a query that filters out a few primary keys (ASIN in our 
> Amazon Product Search world), I can make a TermInSetQuery and add it as a 
> MUST_NOT onto a BooleanQuery that has all the other interesting clauses for 
> my query.
> 
> But if I have many, many ASINs to filter out, at some point it may become 
> more efficient to just use doc values and filter them out like Solr's 
> "post-filter" / during collection, e.g. by loading the BINARY value or SORTED 
> (globalized) ordinal, and checking e.g. a HashSet to see if it should be 
> skipped.  Not using the inverted index at all...
> 
> Do we already have such a "slow DV TermInSet" query?
> 
> It seems like it could belong in SortedDocValuesField, where we already have 
> newSlowRangeQuery and newSlowExactQuery; we could add a newSlowInSetQuery?
> 
> And then we could make an IndexOrDocValuesQuery with both the TermInSetQuery 
> and this SDV.newSlowInSetQuery?
> 
> Or maybe there is already a good way to do this in Lucene?
> 
> Thanks!,
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com <http://blog.mikemccandless.com/>
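
For illustration only, the "post-filter" idea in the quoted message (check each candidate document's doc value against a HashSet during collection, without touching the inverted index) might be sketched as below. This is plain Java and does not use Lucene APIs: the class PostFilterSketch, the method collect, and the String[] array standing in for a per-document BINARY/SORTED doc value are all hypothetical names introduced for this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a doc-values-style post-filter; not Lucene code.
class PostFilterSketch {

    // candidateDocs: doc IDs matched by the other query clauses.
    // docValues[docId]: stand-in for the per-document primary key (e.g. an ASIN).
    // excluded: the primary keys that should be filtered out.
    static List<Integer> collect(int[] candidateDocs, String[] docValues, Set<String> excluded) {
        List<Integer> hits = new ArrayList<>();
        for (int docId : candidateDocs) {
            // One hash lookup per candidate document; cost scales with the
            // number of candidates, not with the size of the excluded set.
            if (!excluded.contains(docValues[docId])) {
                hits.add(docId);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] asinByDoc = {"A1", "B2", "C3", "D4"};
        int[] candidates = {0, 1, 3};
        Set<String> excluded = new HashSet<>(List.of("B2", "D4"));
        System.out.println(collect(candidates, asinByDoc, excluded)); // prints [0]
    }
}
```

The trade-off this is meant to show: a MUST_NOT TermInSetQuery pays per excluded term up front, while a filter of this shape pays per candidate document, which is why pairing the two under an IndexOrDocValuesQuery (as suggested in the quoted message) could let the cheaper side be chosen per query.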
