QueryReplacement token handling

Ryan Josal Fri, 15 Jan 2016 16:03:54 -0800

I ran into an issue that seems to be undesirable behavior to me, but
considering how long query elevation has been around, maybe there are use
cases for it as is.


I set the QE analyzer to WhitespaceTokenizer -> LowercaseFilter.

When QE loads the XML file and analyzes the queries into keys, it
appends the tokens with no separator.  As a result, "foo bar" and "foobar"
are both reduced to the same key, "foobar".  This is unexpected to me
because the query "foo bar" would not match "foobar" with that analyzer, so
things break if I put both in my XML file.  My thought would be to insert a
NUL char or something similar between the tokens to preserve the
difference.  This issue could potentially make two queries with very
different meanings ambiguous: "rights now" and "right snow".
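
To make the collision concrete, here is a minimal sketch in Python that only
simulates the behavior described above; it is not Solr's code. The function
names are hypothetical, and str.split() plus lower() stand in for
WhitespaceTokenizer -> LowercaseFilter.

```python
def analyze(query):
    """Stand-in for WhitespaceTokenizer -> LowercaseFilter (an approximation)."""
    return [token.lower() for token in query.split()]


def key_without_separator(query):
    """Build the key by appending tokens directly, as described in this post."""
    return "".join(analyze(query))


def key_with_separator(query, sep="\0"):
    """Proposed alternative: join tokens with a NUL char so boundaries survive."""
    return sep.join(analyze(query))


if __name__ == "__main__":
    for a, b in [("foo bar", "foobar"), ("rights now", "right snow")]:
        collide = key_without_separator(a) == key_without_separator(b)
        distinct = key_with_separator(a) != key_with_separator(b)
        print(f"{a!r} vs {b!r}: collide without separator={collide}, "
              f"distinct with separator={distinct}")
```

With a separator, the token boundaries stay part of the key, so the two
queries in each pair no longer map to the same entry.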

I think I may be able to work around this by using a char filter together
with KeywordTokenizer.
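
A rough Python simulation of that workaround idea, under stated assumptions:
the char filter is assumed to collapse runs of whitespace (the post does not
say which char filter would be used), the lowercasing step is assumed as
well, and the function names are hypothetical. This only models the idea; it
is not Solr configuration or Solr code.

```python
import re


def char_filter(query):
    """Assumed char filter: collapse whitespace runs to one space and trim."""
    return re.sub(r"\s+", " ", query).strip()


def keyword_key(query):
    """KeywordTokenizer stand-in: the whole filtered input is a single token.

    Lowercasing is an assumption carried over from the original analyzer.
    """
    return char_filter(query).lower()


if __name__ == "__main__":
    for q in ["foo bar", "foobar", "Foo   Bar", "rights now", "right snow"]:
        print(repr(q), "->", repr(keyword_key(q)))
```

Because the whole query stays a single token, the space remains in the key,
so "foo bar" and "foobar" would end up as different keys.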

Is this a bug, or desired?

Ryan
