alessandrobenedetti commented on pull request #129: URL: https://github.com/apache/solr/pull/129#issuecomment-847276421
In the meantime I kept thinking about this, and I still think it is a bug: if we configure a field type with keyword analysis (so that it produces the same single token as a String field), sow works correctly and we get the same behaviour I am introducing with the fix.

On Mon, 24 May 2021, 16:59 David Smiley, ***@***.***> wrote:

> ***@***.**** commented on this pull request.
>
> It seems we don't agree on this yet.
> ------------------------------
>
> In solr/core/src/test/org/apache/solr/search/TestExtendedDismaxParser.java
> <https://github.com/apache/solr/pull/129#discussion_r637963606>:
>
> > @@ -1771,6 +1787,35 @@ public void testSplitOnWhitespace_Basic() throws Exception {
>      assertThat(parsedquery, anyOf(containsString("((name:stigma | title:stigma))"), containsString("((title:stigma | name:stigma))")));
>    }
>
> +  @Test
> +  public void testSplitOnWhitespace_stringField_shouldBuildSingleClause() throws Exception
>
> Based on the test name, I'd expect sow=true each time. Maybe just drop
> this part of the method name.
> ------------------------------
>
> In solr/core/src/test/org/apache/solr/search/TestExtendedDismaxParser.java
> <https://github.com/apache/solr/pull/129#discussion_r637963928>:
>
> > @@ -1771,6 +1787,35 @@ public void testSplitOnWhitespace_Basic() throws Exception {
>      assertThat(parsedquery, anyOf(containsString("((name:stigma | title:stigma))"), containsString("((title:stigma | name:stigma))")));
>    }
>
> +  @Test
> +  public void testSplitOnWhitespace_stringField_shouldBuildSingleClause() throws Exception
> +  {
> +    assertJQ(req("qf", "trait_ss", "defType", "edismax", "q", "multi term", "sow", "false"),
> +        "/response/numFound==1", "/response/docs/[0]/id=='75'");
> +
> +    String parsedquery = getParsedQuery(
> +        req("qf", "trait_ss", "q", "multi term", "defType", "edismax", "sow", "false", "debugQuery", "true"));
> +    assertThat(parsedquery, anyOf(containsString("((trait_ss:multi term))")));
> +  }
> +
> +  @Test
> +  public void testSplitOnWhitespace_numericField_shouldBuildAlwaysMultiClause() throws Exception
>
> Again, just drop "testSplitOnWhitespace_" from the method name, I think.
> ------------------------------
>
> In solr/core/src/test/org/apache/solr/search/TestExtendedDismaxParser.java
> <https://github.com/apache/solr/pull/129#discussion_r638071458>:
>
> > @@ -1771,6 +1787,35 @@ public void testSplitOnWhitespace_Basic() throws Exception {
>      assertThat(parsedquery, anyOf(containsString("((name:stigma | title:stigma))"), containsString("((title:stigma | name:stigma))")));
>    }
>
> +  @Test
> +  public void testSplitOnWhitespace_stringField_shouldBuildSingleClause() throws Exception
> +  {
> +    assertJQ(req("qf", "trait_ss", "defType", "edismax", "q", "multi term", "sow", "false"),
>
> This is a change in behavior, and I think it's not a good change. For a
> non-tokenized field (StrField in this case), I think we should ignore
> whatever "sow" is and split on whitespace anyway, so here we would have two
> terms to match.
> It would be straightforward to document this (no differences
> between numbers and StrField).
>
> I think it could be reasonable to try both ways (both split and don't
> split) and then put a DisjunctionMaxQuery over the two, though I'd prefer
> not.
> ------------------------------
>
> In solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java
> <https://github.com/apache/solr/pull/129#discussion_r638074368>:
>
> >          } else {
>            List<Query> subqs = new ArrayList<>();
>            for (String queryTerm : queryTerms) {
>              try {
>                subqs.add(ft.getFieldQuery(parser, sf, queryTerm));
> -            } catch (Exception e) { // assumption: raw = false only when called from ExtendedDismaxQueryParser.getQuery()
> -              // for edismax: ignore parsing failures
> +            } catch (Exception e) {
> +              /*
> +              This happens when a field tries to parse a query term of incompatible type
> +              e.g.
> +              a numerical field trying to parse a textual query term
> +               */
> +              subqs.add(new MatchNoDocsQuery(queryTerm + " is not compatible with " + field));
>
> It appears this change (the addition of MatchNoDocsQuery here) has no
> effect but maybe I'm mistaken?
> ------------------------------
>
> In solr/core/src/java/org/apache/solr/parser/SolrQueryParserBase.java
> <https://github.com/apache/solr/pull/129#discussion_r638072636>:
>
> >        return new RawQuery(sf, queryTerms);
>      } else {
>        if (queryTerms.size() == 1) {
>          return ft.getFieldQuery(parser, sf, queryTerms.get(0));
> +      } else if(ft instanceof StrField){
>
> In essence, I think the behavior I see here was correct *before* -- no
> special case for either StrField or numerics. In the context of the logic
> that reaches this point, the field is already ft.isTokenized==false.
>
> --
> You are receiving this because you authored the thread.
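
To make the positions in this thread easier to compare, here is a minimal sketch in plain Java. It does not use any Solr or Lucene classes: `SowSketch`, `buildClauses` and the `Mode` enum are hypothetical names made up for illustration, and the `field:term` strings only imitate the shape of the debug `parsedquery` output. It models how the query `multi term` against a non-tokenized field such as `trait_ss` might be turned into clauses under each option discussed above: always splitting on whitespace, honouring `sow=false` with a single clause, or trying both.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SowSketch {

  /** Hypothetical modes modelling the three options raised in the review. */
  enum Mode { ALWAYS_SPLIT, HONOUR_SOW, BOTH }

  /**
   * Builds toy "field:term" clauses for a non-tokenized field.
   * This is an illustration only, not the logic in SolrQueryParserBase.
   */
  static List<String> buildClauses(String field, String q, boolean sow, Mode mode) {
    String trimmed = q.trim();

    // One clause per whitespace-separated term.
    List<String> split = new ArrayList<>();
    for (String term : trimmed.split("\\s+")) {
      split.add(field + ":" + term);
    }

    // One clause holding the whole query string as a single term.
    String whole = field + ":" + trimmed;

    switch (mode) {
      case ALWAYS_SPLIT:
        // Ignore sow entirely for non-tokenized fields.
        return split;
      case HONOUR_SOW:
        // Split only when sow=true; otherwise keep a single clause.
        return sow ? split : Collections.singletonList(whole);
      default:
        // BOTH: split clauses plus the whole-string clause (stand-in for a
        // disjunction over the two alternatives), without duplicating a
        // single-term query.
        List<String> both = new ArrayList<>(split);
        if (!both.contains(whole)) {
          both.add(whole);
        }
        return both;
    }
  }

  public static void main(String[] args) {
    for (Mode mode : Mode.values()) {
      System.out.println(mode + " -> " + buildClauses("trait_ss", "multi term", false, mode));
    }
  }
}
```

In this toy model, with sow=false, `HONOUR_SOW` yields the single clause `trait_ss:multi term` that the new test asserts on, while `ALWAYS_SPLIT` yields the two terms David argues for. `BOTH` only lists the alternatives side by side; how they would actually be combined and scored in a DisjunctionMaxQuery is outside the scope of this sketch.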