[
https://issues.apache.org/jira/browse/SOLR-6009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Evan Sayer updated SOLR-6009:
-----------------------------
Description:
edismax appears to be leaking its IMPOSSIBLE_FIELD_NAME into queries involving
a RegexpQuery. Steps to reproduce on 4.7.2:
1) remove the explicit <field /> definition for 'text'
2) add a catch-all '*' dynamic field of type text_general
<dynamicField name="*" type="text_general" multiValued="true" indexed="true"
stored="true" />
3) index the exampledocs/ data
4) run a query like the following:
{code}
http://localhost:8983/solr/collection1/select?q={!edismax%20qf=%27text%27}%20/.*elec.*/&debugQuery=true
{code}
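For scripting the reproduction, the same request can be built programmatically. This is only a sketch, assuming the host, port and collection name from the URL above. `build_debug_url` is a hypothetical helper name. `urlencode` percent-encodes more characters than the hand-written URL does, which should decode to the same `q` on the server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Base URL taken from the report; adjust for your own Solr instance.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def build_debug_url(regex, qf="text", base=SOLR_SELECT):
    """Return a select URL that runs an edismax regex query with debugQuery on."""
    q = "{!edismax qf='%s'} /%s/" % (qf, regex)
    return base + "?" + urlencode({"q": q, "debugQuery": "true"})


url = build_debug_url(".*elec.*")
# Decode the query string again to confirm it round-trips to the intended q.
params = parse_qs(urlsplit(url).query)
print(url)
print(params["q"][0])
```

Fetching `url` with any HTTP client and reading the `debug` section of the response should then give output comparable to what is pasted in this report.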
The debugQuery output will look like this:
{code}
<lst name="debug">
<str name="rawquerystring">{!edismax qf='text'} /.*elec.*/</str>
<str name="querystring">{!edismax qf='text'} /.*elec.*/</str>
<str name="parsedquery">(+RegexpQuery(:/.*elec.*/))/no_coord</str>
<str name="parsedquery_toString">+:/.*elec.*/</str>
{code}
If you copy and paste the parsed query into a text editor, you can see that
the field name isn't actually blank: the IMPOSSIBLE_FIELD_NAME ends up in
there.
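As an alternative to pasting into an editor, the hidden characters can be made visible by printing the code points of the field-name part of the parsed query. This is a minimal sketch. `field_code_points` is a hypothetical helper, and the U+FFFC characters in the sample string are an assumption for illustration, not values taken from this report.

```python
import re


def field_code_points(parsed_query_to_string):
    """Return the code points of the text between the leading '+' and the first ':'."""
    match = re.match(r"\+?([^:]*):", parsed_query_to_string)
    field = match.group(1) if match else ""
    return ["U+%04X" % ord(ch) for ch in field]


# Stand-in for a copied parsedquery_toString value. The U+FFFC characters here
# are an assumption used for illustration, not taken from the report.
sample = "+\ufffc\ufffc\ufffc:/.*elec.*/"
print(field_code_points(sample))          # field name is not empty
print(field_code_points("+:/.*elec.*/"))  # a truly blank field name gives []
```

Running it on the actual `parsedquery_toString` value copied from the debug output would show whichever characters Solr really put in the field name.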
I haven't been able to reproduce this behavior on 4.7.2 without removing the
explicit field definition for 'text' and using a dynamicField, which is how
things are set up on the machine where this issue was discovered. The query
isn't quite right with the explicit field definition in place either, though;
the regex appears to be parsed as a plain term query on 'elec':
{code}
<lst name="debug">
<str name="rawquerystring">{!edismax qf='text'} /.*elec.*/</str>
<str name="querystring">{!edismax qf='text'} /.*elec.*/</str>
<str name="parsedquery">(+DisjunctionMaxQuery((text:elec)))/no_coord</str>
<str name="parsedquery_toString">+(text:elec)</str>
{code}
numFound=0 for both of these. This site is useful for looking at the
characters in the first variant:
http://rishida.net/tools/conversion/
> edismax mis-parsing RegexpQuery
> -------------------------------
>
> Key: SOLR-6009
> URL: https://issues.apache.org/jira/browse/SOLR-6009
> Project: Solr
> Issue Type: Bug
> Components: query parsers
> Affects Versions: 4.7.2
> Reporter: Evan Sayer