Georg Sorst created SOLR-9968:
---------------------------------
Summary: Cannot use special characters in Suggester Context Query
Key: SOLR-9968
URL: https://issues.apache.org/jira/browse/SOLR-9968
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: Suggester
Affects Versions: 6.3, 6.0
Reporter: Georg Sorst
h4. Reproduce
1. Configure the Suggester to use a {{contextField}}, e.g. {{context}}
2. Add a document containing special characters in that field, e.g. '{{c#x}}'
3. Use a context query with the Suggester, e.g.
{noformat}suggest.cfq=context:c#x{noformat}
* Escaping the special character makes no difference, e.g.
{noformat}suggest.cfq=context:c\#x{noformat}
h4. What happens
The suggestions are not filtered correctly: the suggestion whose {{context}} is '{{c#x}}' is not returned.
h4. What should happen
The suggestions should be limited to documents where the field {{context}} is
'{{c#x}}'
----
What happens is this:
1. {{SolrSuggester.contextFilterQueryAnalyzer}} is hardwired to use
{{StandardTokenizer}}
2. The context query is parsed like this:
{code:title=SolrSuggester.parseContextFilterQuery}
query = new StandardQueryParser(contextFilterQueryAnalyzer)
    .parse(contextFilter, CONTEXTS_FIELD_NAME);
{code}
3. The {{StandardQueryParser}} together with {{StandardTokenizer}} will turn
the context query into '{{context:c context:x}}'
4. This is used for filtering the suggestions
5. Thus, the suggestion where {{context}} is '{{c#x}}' is not returned
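The steps above can be sketched in plain Java without any Lucene dependency. This is only an illustration: {{wordTokens}} is a hypothetical stand-in that splits on every non-letter, non-digit character, which is a rough approximation of what {{StandardTokenizer}} does to '{{c#x}}', not its real implementation. {{matches}} is likewise a hypothetical helper that assumes the stored context value is kept verbatim and that the query tokens are combined as optional clauses.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

class ContextFilterSketch {

    // Rough stand-in for word-style tokenization: split on every character
    // that is not a letter or digit. This is NOT the real StandardTokenizer.
    static List<String> wordTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Mimics a query like 'context:c context:x': a suggestion matches if any
    // query token equals one of its stored (verbatim) context values.
    static boolean matches(Set<String> storedContexts, List<String> queryTokens) {
        for (String token : queryTokens) {
            if (storedContexts.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> tokens = wordTokens("c#x");
        System.out.println(tokens); // prints [c, x]
        // The stored context is 'c#x', but the query only asks for 'c' or 'x'.
        System.out.println(matches(Collections.singleton("c#x"), tokens)); // prints false
    }
}
```

Under these assumptions the filter can never select the '{{c#x}}' suggestion, while it would select suggestions whose context is exactly '{{c}}' or '{{x}}'.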
So the question is how to get the parser and tokenizer to treat these special
characters verbatim. Two ways I can think of:
* Make {{contextFilterQueryAnalyzer}} configurable so {{KeywordTokenizer}} can
be used
* Use the analyzer defined for the context field in the schema
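To illustrate the first option, here is a minimal plain-Java sketch of a context filter with a pluggable tokenizer. All names ({{KEYWORD}}, {{WORD}}, {{passes}}) are hypothetical and only model the idea; they are not Solr or Lucene APIs. {{KEYWORD}} keeps the whole query value as one token, which is the behaviour I would expect from {{KeywordTokenizer}}, while {{WORD}} is a rough word-splitting stand-in for the current behaviour.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

class ConfigurableContextFilterSketch {

    // Keyword-style: the whole input becomes a single token, special characters kept.
    static final Function<String, List<String>> KEYWORD = Collections::singletonList;

    // Word-style stand-in: split on anything that is not a letter or digit.
    static final Function<String, List<String>> WORD =
        text -> Arrays.asList(text.split("[^\\p{L}\\p{N}]+"));

    // Hypothetical filter: a suggestion passes if any query token equals
    // its stored (verbatim) context value.
    static boolean passes(String storedContext, String contextQuery,
                          Function<String, List<String>> tokenizer) {
        return tokenizer.apply(contextQuery).contains(storedContext);
    }

    public static void main(String[] args) {
        System.out.println(passes("c#x", "c#x", KEYWORD)); // prints true
        System.out.println(passes("c#x", "c#x", WORD));    // prints false
    }
}
```

With the tokenizer passed in as a parameter, the caller decides whether the context value is matched as a whole or word by word, which is roughly what a configurable {{contextFilterQueryAnalyzer}} would allow.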