[ https://issues.apache.org/jira/browse/SOLR-15407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17344695#comment-17344695 ]
Alessandro Benedetti edited comment on SOLR-15407 at 5/14/21, 4:22 PM:
-----------------------------------------------------------------------

I opened an initial Pull Request, mainly to discuss what to do with queries that combine numerical terms, numerical query fields and sow=false.
There were numerous tests in org.apache.solr.search.TestExtendedDismaxParser#testFocusQueryParser for those use cases.

e.g.
query=terminator 100
queryFields=field_text_general integer_i
sow=false

What do we want as a final query?

*option1* (we stay consistent with sow=false, so no splitting on whitespace happens for the numerical field either)
field_text_general:(terminator 100) | <integer_i:(terminator 100) -> this clause of the disjunction disappears because the query value is not a number>

OR

*option2* (we introduce an inconsistency: for numerical fields we basically ignore sow, split on whitespace and build boolean queries)
field_text_general:(terminator 100) | integer_i:(100)

[~mdrob] [~munendrasn] [~romseygeek] [~dsmiley] [~sarowe] what do you think?
Once agreed, I'll clean up the PR and add additional tests/changes if needed.

> eDismax sow=false doesn't work with string field types
> ------------------------------------------------------
>
>                 Key: SOLR-15407
>                 URL: https://issues.apache.org/jira/browse/SOLR-15407
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: query parsers
>    Affects Versions: 8.8.2
>            Reporter: Alessandro Benedetti
>            Priority: Major
>
> With sow=false, edismax should not tokenize the user query text itself; it
> should delegate to each field for query-time text analysis.
> But what happens if one of the query fields involved is not analyzed?
> For example, because it is a string field type?
> The terms are split anyway and the generated query is broken:
> {code:java}
> // Test document with a multi-term value in a string field
> assertU(adoc("id", "75", "trait_ss", "multi term"));
>
> public void testSplitOnWhitespace_stringField_shouldBuildSingleClause() throws Exception {
>   // With sow=false the whole input should match the string field as one term
>   assertJQ(req("qf", "trait_ss", "defType", "edismax", "q", "multi term", "sow", "false"),
>       "/response/numFound==1", "/response/docs/[0]/id=='75'");
>
>   // The parsed query should contain a single clause for the full value
>   String parsedquery = getParsedQuery(
>       req("qf", "trait_ss", "q", "multi term", "defType", "edismax",
>           "sow", "false", "debugQuery", "true"));
>   assertThat(parsedquery, anyOf(containsString("((trait_ss:multi term))")));
> }
> {code}
> This test currently fails.
> The current parsed query is wrong:
> (trait_ss:multi trait_ss:term)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
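
To make the two options from the comment concrete, here is a minimal, self-contained sketch of the query string each would yield for query=terminator 100. It is not Solr code: the class SowFalseSketch, the FieldKind enum and the option1/option2 helpers are hypothetical names used only for illustration, and the string output merely mimics the notation used in the comment.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: models the two proposed behaviours, not the real edismax parser. */
public class SowFalseSketch {

    /** Hypothetical field kinds (assumption, not Solr types). */
    enum FieldKind { TEXT, NUMERIC }

    static boolean isInteger(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Option 1: every field receives the whole input; a numeric field drops its clause if the input is not a number. */
    static String option1(String query, Map<String, FieldKind> fields) {
        List<String> clauses = new ArrayList<>();
        for (Map.Entry<String, FieldKind> f : fields.entrySet()) {
            if (f.getValue() == FieldKind.NUMERIC && !isInteger(query)) {
                continue; // "terminator 100" as a whole is not a number
            }
            clauses.add(f.getKey() + ":(" + query + ")");
        }
        return String.join(" | ", clauses);
    }

    /** Option 2: numeric fields ignore sow, split on whitespace and keep only the numeric tokens. */
    static String option2(String query, Map<String, FieldKind> fields) {
        List<String> clauses = new ArrayList<>();
        for (Map.Entry<String, FieldKind> f : fields.entrySet()) {
            if (f.getValue() == FieldKind.TEXT) {
                clauses.add(f.getKey() + ":(" + query + ")");
                continue;
            }
            List<String> numbers = new ArrayList<>();
            for (String token : query.trim().split("\\s+")) {
                if (isInteger(token)) {
                    numbers.add(token);
                }
            }
            if (!numbers.isEmpty()) {
                clauses.add(f.getKey() + ":(" + String.join(" ", numbers) + ")");
            }
        }
        return String.join(" | ", clauses);
    }

    public static void main(String[] args) {
        Map<String, FieldKind> qf = new LinkedHashMap<>();
        qf.put("field_text_general", FieldKind.TEXT);
        qf.put("integer_i", FieldKind.NUMERIC);
        System.out.println("option1: " + option1("terminator 100", qf));
        System.out.println("option2: " + option2("terminator 100", qf));
    }
}
```

Under these assumptions, option1 leaves only the text-field clause, while option2 keeps an extra integer_i clause for the numeric token, which is the inconsistency with sow=false that the comment asks about.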