[
https://issues.apache.org/jira/browse/LUCENE-7848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Gibney updated LUCENE-7848:
-----------------------------------
Attachment: LUCENE-7848-branching-spanOr.patch
"Could be a bug somewhere in span queries."^ -- I think the remaining problem
here is that only one branch (the shortest) of a SpanOrQuery is evaluated, and
the "spanOr" is then designated a match (or not) with the width/positionEnd of
that shortest branch. When the branches of a "spanOr" differ in length (as they
will as a matter of course with GraphFilters, as in the above test), only the
shorter branch is evaluated. If a longer branch also matches, the tokens that
follow it sit at later positions, so the enclosing "spanNear", measuring from
the end of the shorter branch, sees a larger-than-expected slop and fails to
match.
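
To make the failure mode concrete, here is a minimal, self-contained sketch. It
does not use Lucene classes; Span, gap(), matchesShortestOnly() and
matchesAnyBranch() are hypothetical names invented for illustration, and the
positions are made up. It only models the arithmetic described above: a
"spanOr" with two branches starting at position 0, one ending at 1 and one
ending at 2, followed by a token at position 2, with slop 0.

```java
import java.util.Arrays;
import java.util.List;

public class SpanOrBranchDemo {
    /** Hypothetical stand-in for one matching branch: token positions [start, end). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Number of positions left unmatched between two spans that must appear in order. */
    static int gap(Span first, Span second) {
        return second.start - first.end;
    }

    /** Models the behavior described above: only the shortest branch's end is considered. */
    static boolean matchesShortestOnly(List<Span> branches, Span next, int slop) {
        Span shortest = branches.get(0);
        for (Span s : branches) {
            if (s.end < shortest.end) {
                shortest = s;
            }
        }
        int g = gap(shortest, next);
        return g >= 0 && g <= slop;
    }

    /** Models the desired behavior: any branch may supply the end position. */
    static boolean matchesAnyBranch(List<Span> branches, Span next, int slop) {
        for (Span s : branches) {
            int g = gap(s, next);
            if (g >= 0 && g <= slop) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two branches with the same start but different lengths (assumed positions).
        List<Span> spanOr = Arrays.asList(new Span(0, 1), new Span(0, 2));
        // The following token sits right after the longer branch.
        Span following = new Span(2, 3);

        System.out.println(matchesShortestOnly(spanOr, following, 0)); // false
        System.out.println(matchesAnyBranch(spanOr, following, 0));    // true
    }
}
```

With slop 0, measuring from the shorter branch leaves a gap of one position, so
the toy "spanNear" check fails even though the longer branch lines up exactly.
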
[^LUCENE-7848-branching-spanOr.patch] adjusts SpanOrQuery to support repeated
calls to nextStartPosition() which return the same startPosition, but different
endPositions. The subSpan clauses of the "spanOr" are popped off the
priorityQueue, retained, and restored upon exhaustion of subSpans (when it's
time to move on to the next potential match). Some corresponding changes were
necessary to make NearSpansOrdered aware of the new "spanOr" behavior, and
conditionally evaluate as many branches of "spanOr" clauses as necessary to
match (or not) on the full "spanNear".
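
The idea can be sketched roughly as follows. This is not the code from the
patch and does not use Lucene's Spans API; BranchingOr, Branch, restore() and
firstMatchingEnd() are hypothetical names, and each branch is simplified to a
single fixed (start, end) pair. The sketch only shows the shape of the
approach: repeated calls to nextStartPosition() pop branches off a priority
queue and may report the same start with different ends, the popped branches
are retained, and they are put back once the caller is done with that start.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

public class BranchingSpanOrSketch {
    static final int NO_MORE_POSITIONS = Integer.MAX_VALUE;

    /** Hypothetical branch of a "spanOr": one candidate match over [start, end). */
    static final class Branch {
        final int start;
        final int end;

        Branch(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Toy "spanOr" that can report one start position several times, once per branch. */
    static final class BranchingOr {
        private final PriorityQueue<Branch> queue = new PriorityQueue<>(
            Comparator.comparingInt((Branch b) -> b.start).thenComparingInt(b -> b.end));
        private final List<Branch> retained = new ArrayList<>();
        private Branch current;

        BranchingOr(List<Branch> branches) {
            queue.addAll(branches);
        }

        /** Pops the next branch; the popped branch is retained so it can be restored later. */
        int nextStartPosition() {
            current = queue.poll();
            if (current == null) {
                return NO_MORE_POSITIONS;
            }
            retained.add(current);
            return current.start;
        }

        int endPosition() {
            return current.end;
        }

        /** Puts every retained branch back on the queue before moving on. */
        void restore() {
            queue.addAll(retained);
            retained.clear();
            current = null;
        }
    }

    /**
     * Toy ordered-near check: tries each branch sharing the first start position and
     * returns the end position of the first one within slop of nextStart, or -1.
     */
    static int firstMatchingEnd(BranchingOr or, int nextStart, int slop) {
        int result = -1;
        int firstStart = or.nextStartPosition();
        int start = firstStart;
        while (start != NO_MORE_POSITIONS && start == firstStart) {
            int gap = nextStart - or.endPosition();
            if (gap >= 0 && gap <= slop) {
                result = or.endPosition();
                break;
            }
            start = or.nextStartPosition();
        }
        or.restore();
        return result;
    }

    public static void main(String[] args) {
        List<Branch> branches = new ArrayList<>();
        branches.add(new Branch(0, 1));
        branches.add(new Branch(0, 2));
        BranchingOr or = new BranchingOr(branches);
        System.out.println(firstMatchingEnd(or, 2, 0)); // 2
        System.out.println(firstMatchingEnd(or, 5, 0)); // -1
    }
}
```

In this toy version the shorter branch is tried first and rejected, then the
same start is reported again with the longer branch's end, which matches.
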
There may be other modifications needed in code that can call the modified
"spanOr" and would need to be aware of its new behavior, but with this patch
applied, all the tests in TestWordDelimiterGraphFilter pass (including the
new testLucene7848()).
> QueryBuilder.analyzeGraphPhrase does not handle gaps correctly
> --------------------------------------------------------------
>
> Key: LUCENE-7848
> URL: https://issues.apache.org/jira/browse/LUCENE-7848
> Project: Lucene - Core
> Issue Type: Bug
> Affects Versions: 6.5, 6.6
> Reporter: Jim Ferenczi
> Attachments: capture-3.png, LUCENE-7848-branching-spanOr.patch,
> LUCENE-7848.patch, LUCENE-7848.patch
>
>
> Position increments greater than 1 are ignored when the query builder creates
> a graph phrase query.
> Instead it should use SpanNearQuery.addGap for pos incr > 1.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)