Hi Patryk,

Thanks for experimenting. 

> Escaping parentheses worked for me though:

Yes, selectively escaping only the parentheses works, but that is not 
equivalent to quoting, and it seems to work only when nothing other than the 
2 parentheses is escaped. 
When the query/field value is sent through a proper escape function (it is 
often user input), the spaces are escaped as well and the query fails:
curl 'http://localhost:8983/solr/demo/select?rows=5&q=inStock:true+AND+%7B%21boost+b%3Dmanufacturedate_dt%7Dfeatures%3A%5C%28hello%5C%20with%5C%20an%5C%20accent%5C%20over%5C%20the%5C%20e%5C%29'

Result:
{
  "responseHeader":{
    "status":400,
    "QTime":26,
    "params":{
      "q":"inStock:true AND {!boost b=manufacturedate_dt}features:\\(hello\\ with\\ an\\ accent\\ over\\ the\\ e\\)",
      "rows":"5"
    }
  },
  "error":{
    "metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.parser.TokenMgrError"],
    "msg":"org.apache.solr.search.SyntaxError: Cannot parse 'features:\\(hello\\': Lexical error at line 1, column 18.  Encountered: <EOF> (in lexical state 3)",
    "code":400
  }
}
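
For illustration, here is a minimal Python sketch of the kind of escape function meant above. The function name and the exact character set are assumptions (the set is modeled on what SolrJ's ClientUtils.escapeQueryChars escapes, including whitespace); it is not the code used in our application. It only shows why the spaces end up escaped in the failing request.

```python
from urllib.parse import quote

# Assumed set of query syntax characters, modeled on SolrJ's escapeQueryChars.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')


def escape_query_chars(value: str) -> str:
    """Backslash-escape query syntax characters and any whitespace."""
    out = []
    for ch in value:
        if ch in SPECIAL_CHARS or ch.isspace():
            out.append("\\")
        out.append(ch)
    return "".join(out)


# Field value as it might arrive from user input.
term = "(hello with an accent over the e)"
escaped = escape_query_chars(term)
print(escaped)  # \(hello\ with\ an\ accent\ over\ the\ e\)

# Same query shape as the failing request: the {!boost ...} part comes after AND.
q = "inStock:true AND {!boost b=manufacturedate_dt}features:" + escaped
encoded = quote(q, safe=":")
print("http://localhost:8983/solr/demo/select?rows=5&q=" + encoded)
```

The escaped value matches the "q" parameter echoed in the error response above (shown there with JSON-doubled backslashes).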


The same escaped query works when inStock:true is moved to the end:
curl 'http://localhost:8983/solr/demo/select?rows=5&q=%7B%21boost+b%3Dmanufacturedate_dt%7Dfeatures%3A%5C%28hello%5C%20with%5C%20an%5C%20accent%5C%20over%5C%20the%5C%20e%5C%29+AND+inStock:true'

{ 
  "responseHeader":{
    "status":0,
    "QTime":40,
    "params":{
      "q":"{!boost b=manufacturedate_dt}features:\\(hello\\ with\\ an\\ accent\\ over\\ the\\ e\\) AND inStock:true",
      "rows":"5"
    }
  },
  "response":{
    "numFound":3,
    "start":0,
    "numFoundExact":true,
    "docs":[{


I think this is a bug that should be tracked in Jira, but I understand that 
there will be little interest in investing time given that there is a workaround 
(ensure the {!something ...} part is first in the query).
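
A rough Python sketch of that workaround, for anyone who builds the query from separate clauses. The helper name local_params_first is hypothetical, and the sketch assumes the clauses are combined with AND and that at most one of them starts with {!...}. Whether the reordered query scores identically to the original is not verified here.

```python
from urllib.parse import urlencode


def local_params_first(clauses):
    """Join clauses with AND, placing the clause that starts with '{!' first."""
    first = [c for c in clauses if c.lstrip().startswith("{!")]
    rest = [c for c in clauses if not c.lstrip().startswith("{!")]
    if len(first) > 1:
        # Only one clause can occupy the first position.
        raise ValueError("at most one {!...} clause can be placed first")
    return " AND ".join(first + rest)


clauses = [
    "inStock:true",
    r"{!boost b=manufacturedate_dt}features:\(hello\ with\ an\ accent\ over\ the\ e\)",
]
q = local_params_first(clauses)
print(q)

# urlencode percent-encodes the query and writes spaces as '+'.
url = "http://localhost:8983/solr/demo/select?" + urlencode({"rows": 5, "q": q})
print(url)
```

The resulting request has the same shape as the working curl command above, with the {!boost ...} part at the start of q.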

Hope this thread is helpful for anyone else encountering the issue.
Thomas Å.


> On 18 Sep 2024, at 17:51, Patryk Mazurkiewicz <pmaz...@gmail.com> wrote:
> 
> Hi Thomas,
> 
> Apparently, the parser becomes more sensitive to special characters when the 
> local parameter is in the middle of a query, possibly because of an increased 
> need for context switching.
> Escaping parentheses worked for me though:
> 
> curl 'http://localhost:8983/solr/demo/select?rows=5&q=inStock:true+AND+%7B%21boost+b%3Dmanufacturedate_dt%7Dfeatures%3A%5C%28hello%20with%20an%20accent%20over%20the%20e%5C%29'
> 
> Previous example:
> curl 'http://localhost:8983/solr/demo/select?q=inStock:true+AND+%7B%21boost+b%3Dmanufacturedate_dt%7Dname%3Athe%20%5C%28concept%5C%29.xml'
> 
> Kind regards
> 
> 
> On 2024/09/16 21:23:39 Thomas Åkesson wrote:
>> Hi,
>> Thanks for suggesting local parameters. Unfortunately I can't refactor the 
>> codebase to make use of them because these queries are placed in fq 
>> statements that are treated as individual units by the UI. It would require 
>> massive refactoring and higher complexity as a consequence. The graph 
>> traversal is one of the possible fq statements but I am reproducing with 
>> {!boost ...} because it is much simpler and possible to demonstrate with the 
>> Solr demo dataset.
>> 
>> I forgot to mention that I am reproducing with the Solr demo dataset, e.g.:
>> docker run --name solr_demo -d -p 8983:8983 solr solr-demo
>> 
>> I have improved the sample queries to more clearly indicate that this is 
>> likely a bug, see below. The debugQuery=true does not really help much since 
>> the failure is earlier in the process.
>> 
>> Fails:
>> curl 'http://localhost:8983/solr/demo/select?q=inStock:true+AND+%7B%21boost+b%3Dmanufacturedate_dt%7Dfeatures%3A%22%28hello%20with%20an%20accent%20over%20the%20e%29%22&rows=5&debugQuery=true'
>> 
>> 
>> Works:
>> curl 'http://localhost:8983/solr/demo/select?q=%7B%21boost+b%3Dmanufacturedate_dt%7Dfeatures%3A%22%28hello%20with%20an%20accent%20over%20the%20e%29%22+AND+inStock:true&rows=5&debugQuery=false'
>> 
>> NOTE: The only difference is moving the inStock:true part of the query to 
>> the end. Should be equivalent.
>> 
>> Cannot enable debugQuery because it fails with:
>> "msg":"java.lang.IllegalAccessException: access violation: class org.apache.solr.schema.DatePointField$DatePointFieldSource, from public Lookup",
>> 
>> Thanks,
>> Thomas Å.
>> 
>> 
>>> On 14 Sep 2024, at 22:07, Mikhail Khludnev <mk...@apache.org> wrote:
>>> 
>>> Hello Thomas.
>>> I think extensive use of local parameter references is easier than fighting
>>> with escaping:
>>> q=inStock:true AND {!boost b=manufacturedate_dt v=$nameq}&nameq={!field
>>> f=name}the (concept).xml
>>> or even ...{!field f=name v=$nameval}&nameval=the (concept).xml
>>> Also, depending on the field type, you'd better use {!term}.
>>> To continue, please post the debugQuery=true output if your query is parsed
>>> but misbehaves.
>>> 
>>> On Sat, Sep 14, 2024 at 11:57 AM Thomas Åkesson <th...@fastmail.se> wrote:
>>> 
>>>> Hi,
>>>> We have come across a potentially incorrect lexical error with the
>>>> following combination in the query:
>>>> - Parentheses in the search term (tried both escaping and quoting)
>>>> - Query parser within {}. Actual use-case is using !graph but clearer
>>>> reproduction with something simple like !boost.
>>>> - Using AND before the {!something ...} query.
>>>> 
>>>> This query fails:
>>>> curl 'http://localhost:8983/solr/demo/select?q=inStock:true+AND+%7B%21boost+b%3Dmanufacturedate_dt%7Dname%3Athe%2520%5C%28concept%5C%29.xml'
>>>> {
>>>> "responseHeader":{
>>>>   "status":400,
>>>>   "QTime":1,
>>>>   "params":{
>>>>     "q":"inStock:true AND {!boost b=manufacturedate_dt}name:the%20\\(concept\\).xml"
>>>>   }
>>>> },
>>>> "error":{
>>>>   "metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.parser.TokenMgrError"],
>>>>   "msg":"org.apache.solr.search.SyntaxError: Cannot parse 'name:the%20\\(concept\\': Lexical error at line 1, column 22.  Encountered: <EOF> (in lexical state 3)",
>>>>   "code":400
>>>> }
>>>> 
>>>> 
>>>> Fails when quoted (no escaping needed?):
>>>> curl 'http://localhost:8983/solr/demo/select?q=inStock:true+AND+%7B%21boost+b%3Dmanufacturedate_dt%7Dname%3A%22the%2520%28concept%29.xml%22'
>>>> {
>>>> "responseHeader":{
>>>>   "status":400,
>>>>   "QTime":1,
>>>>   "params":{
>>>>     "q":"inStock:true AND {!boost b=manufacturedate_dt}name:\"the%20(concept).xml\""
>>>>   }
>>>> },
>>>> "error":{
>>>>   "metadata":["error-class","org.apache.solr.common.SolrException","root-error-class","org.apache.solr.parser.TokenMgrError"],
>>>>   "msg":"org.apache.solr.search.SyntaxError: Cannot parse 'name:\"the%20(concept': Lexical error at line 1, column 21.  Encountered: <EOF> after prefix \"\\\"the%20(concept\" (in lexical state 3)",
>>>>   "code":400
>>>> }
>>>> 
>>>> Removing the closing parenthesis works:
>>>> curl 'http://localhost:8983/solr/demo/select?q=inStock:true+AND+%7B%21boost+b%3Dmanufacturedate_dt%7Dname%3Athe%2520%5C%28concept.xml'
>>>> 
>>>> Removing the statement before works:
>>>> curl 'http://localhost:8983/solr/demo/select?q=%7B%21boost+b%3Dmanufacturedate_dt%7Dname%3Athe%2520%5C%28concept%5C%29.xml'
>>>> 
>>>> 
>>>> Has anyone seen this issue before? Tested with 8.3 and 9.6.1.
>>>> 
>>>> Thanks in advance,
>>>> Thomas Å.
>>>> 
>>> 
>>> -- 
>>> Sincerely yours
>>> Mikhail Khludnev
