On 5/16/2022 2:32 PM, WU, Zhiqing wrote:
(Note: prodAllGeneric_txt_sort is a field, X is the details of a query)

If X is "(prodAllGeneric_txt_sort:\"Phi; \")", Solr finds 3 documents
If X is "(-prodAllGeneric_txt_sort:*)", Solr finds 8 documents
If X is "(prodAllGeneric_txt_sort:\"Phi; \") OR
(-prodAllGeneric_txt_sort:*)", Solr only finds 3 documents.
What is wrong? I think Solr should find 11 documents. It seems OR does not
work this time.

You're running into a quirk of Lucene query syntax: a purely negative query can't actually work on its own, because it has nothing to subtract from.

The only reason your second query works at all is that Solr is able to detect a purely negative query at the top level and fix it for you behind the scenes.  If you provided that query to Lucene directly, it probably wouldn't work.  In your third query the negative clause is nested inside a larger query, which is too complex for Solr to detect the problem.

When you use that kind of syntax, you are telling Lucene that you want to subtract documents from the result set.  So your third query ends up being parsed as (in plain language):

"Start with all documents that match a specific string in this field, and then subtract all documents where this field exists." So you get the three documents you started with, because the second clause in the query is a subtraction.
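To make the arithmetic concrete, here is a rough sketch in Python that models the logic with plain sets.  It is only a toy model, not how Lucene is implemented: the document IDs, the total of 15 documents, and the helper negative_only are all made up for illustration, and it assumes a clause that only subtracts starts from an empty set unless you give it something to start from.

```python
# Toy model of the boolean logic using Python sets.
# All document IDs and counts here are hypothetical.
all_docs = set(range(1, 16))        # 15 made-up documents
phi_docs = {1, 2, 3}                # field matches "Phi; "
has_field = {1, 2, 3, 4, 5, 6, 7}   # field has any value at all
# Documents 8..15 (8 of them) have no value in the field.


def negative_only(excluded, base=frozenset()):
    """Model a clause that only subtracts: start from `base`, remove `excluded`.

    With the default empty base there is nothing to subtract from,
    so the result is always empty.
    """
    return set(base) - excluded


# (field:"Phi; ") OR (-field:*)  -> the negative clause contributes nothing
broken = phi_docs | negative_only(has_field)

# field:"Phi; " OR (*:* -field:[* TO *])  -> start from everything, then subtract
fixed = phi_docs | negative_only(has_field, base=all_docs)

print(len(broken))  # 3
print(len(fixed))   # 11
```

Giving the negative clause a match-all starting point is what turns it from "nothing" into "every document without the field".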

You've also got an extreme inefficiency because you're using a wildcard query.  A range query is FAR more efficient.

To get the results you want as efficiently as possible, send this query (I have left out the quote escaping that was in what you pasted):

prodAllGeneric_txt_sort:"Phi; " OR (*:* -prodAllGeneric_txt_sort:[* TO *])
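If you are building the request yourself, the quotes, spaces, and brackets in that query need to be URL-encoded rather than backslash-escaped.  A minimal sketch using only the Python standard library follows; the host, port, and collection name (mycollection) are placeholders I made up, so substitute your own.

```python
from urllib.parse import urlencode, parse_qs

# The query exactly as it should reach Solr (no backslash escaping needed here).
query = 'prodAllGeneric_txt_sort:"Phi; " OR (*:* -prodAllGeneric_txt_sort:[* TO *])'

# Hypothetical Solr location; replace host, port, and collection with yours.
base_url = "http://localhost:8983/solr/mycollection/select"

# urlencode percent-encodes the quotes, brackets, colons, and spaces.
# rows=0 asks only for the match count, not the documents themselves.
params = urlencode({"q": query, "rows": 0})
url = f"{base_url}?{params}"

print(url)
```

Decoding the q parameter on the other end gives back the original query string unchanged, including the trailing space inside the quoted phrase.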

I did not include the parentheses around the first clause because they are unnecessary for that syntax.  If your actual query is more complicated, then you might want to re-add them.

Thanks,
Shawn
