rseitz opened a new pull request, #1463:
URL: https://github.com/apache/solr/pull/1463

   https://issues.apache.org/jira/browse/SOLR-16594
   
   # Description
   
   When a query spans multiple fields, edismax first generates a clause for 
each field. Then it attempts to rewrite the clauses so there is one for each 
term. But sometimes it gives up on the rewrite.
   
   This PR introduces a better strategy for rewriting a field-centric query as 
a term-centric one, so that edismax's behavior will be more consistent. It will 
"give up" less often. The idea is to propagate the startOffsets from Tokens 
emitted by the field analyzers so they are stored on the TermQuery and 
SynonymQuery instances that are created in the parsing process. When rewriting 
a field-centric query as a term-centric one, edismax is now able to do 
the grouping based on startOffset.
   
   # Solution
   
   TermQueryWithOffset and SynonymQueryWithOffset have been created to store a 
query with its corresponding startOffset.
   
   QueryBuilder from the lucene repository has been provisionally copied here 
as SolrQueryBuilder and modified to use the new TermQueryWithOffset and 
SynonymQueryWithOffset classes.
   
   ExtendedDismaxQParser has been updated to use startOffsets as a basis for 
regrouping field-centric clauses into term-centric ones. This happens in 
getAliasedMultiTermQuery(). In the case where startOffsets are not available, 
the logic that had been in place previously is applied.
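
   As a rough illustration of the regrouping idea (not the code in this PR), 
the sketch below groups per-field clauses by start offset and signals a 
fallback when any offset is missing. The `Clause` class, the 
`groupByStartOffset` method, the use of `-1` for "no offset", and the example 
in which the `body` analyzer drops "the" are all hypothetical stand-ins; the 
real classes are TermQueryWithOffset and SynonymQueryWithOffset, whose API may 
differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

public class OffsetGroupingSketch {

  /** Hypothetical stand-in for a per-field clause that carries a start offset. */
  static final class Clause {
    final String field;
    final String text;
    final int startOffset; // assumption: -1 means no offset was recorded

    Clause(String field, String text, int startOffset) {
      this.field = field;
      this.text = text;
      this.startOffset = startOffset;
    }

    @Override
    public String toString() {
      return field + ":" + text;
    }
  }

  /**
   * Regroups field-centric clauses (one inner list per field) into
   * term-centric groups (one inner list per start offset, in offset order).
   * Returns Optional.empty() if any clause lacks an offset, so the caller
   * can fall back to whatever logic it used before.
   */
  static Optional<List<List<String>>> groupByStartOffset(List<List<Clause>> perField) {
    Map<Integer, List<String>> byOffset = new TreeMap<>();
    for (List<Clause> fieldClauses : perField) {
      for (Clause c : fieldClauses) {
        if (c.startOffset < 0) {
          return Optional.empty();
        }
        byOffset.computeIfAbsent(c.startOffset, k -> new ArrayList<>()).add(c.toString());
      }
    }
    List<List<String>> groups = new ArrayList<>(byOffset.values());
    return Optional.of(groups);
  }

  public static void main(String[] args) {
    // Query text "the cat": "the" starts at offset 0, "cat" at offset 4.
    // Assumption for illustration: the body field's analyzer drops "the".
    List<List<Clause>> perField = Arrays.asList(
        Arrays.asList(new Clause("title", "the", 0), new Clause("title", "cat", 4)),
        Arrays.asList(new Clause("body", "cat", 4)));

    Optional<List<List<String>>> groups = groupByStartOffset(perField);
    // prints [[title:the], [title:cat, body:cat]]
    System.out.println(groups.isPresent() ? groups.get() : "fall back to previous logic");
  }
}
```

   Keying on the offset rather than on clause position means the two fields 
in this example can still be lined up even though they produce different 
numbers of clauses.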
   
   # Tests
   
   TestExtendedDismaxParser has been updated to reflect the new behavior. 
   This is a provisional PR submitted in hopes of gathering feedback.
   More tests will be written.
   
   The following two tests are currently failing:
   org.apache.solr.search.TestExtendedDismaxParser.testAliasing
   org.apache.solr.search.QueryEqualityTest.testBlockJoin
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [x] I have reviewed the guidelines for [How to 
Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms 
to the standards described there to the best of my ability.
   - [x] I have created a Jira issue and added the issue ID to my pull request 
title.
   - [x] I have given Solr maintainers 
[access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork)
 to contribute to my PR branch. (optional but recommended)
   - [x] I have developed this patch against the `main` branch.
   - [x] I have run `./gradlew check`.
   - [x] I have added tests for my changes.
   - [ ] I have added documentation for the [Reference 
Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

