[ 
https://issues.apache.org/jira/browse/LUCENE-6373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14491577#comment-14491577
 ] 

Paul Elschot edited comment on LUCENE-6373 at 4/12/15 6:27 PM:
---------------------------------------------------------------

Indeed, two phase doc id iteration for SpanOr is not simple. I think I'm 
getting there though. It needs two tests that I think I have seen before, but 
I could not find where:
One is to avoid calling approximation.match() again once a match has been 
accepted. This can be done by keeping the last doc for which 
approximation.match() returned true.
The other is to distinguish between approximation+acceptance and normal 
acceptance of a matching doc. This can be done by keeping the last doc for 
which twoPhaseCurrentDocMatches returned true.
This is still cooking, and I don't expect to finish this very soon.
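
A minimal sketch of the two "last doc" caches described above, under stated assumptions: SubSketch, OrSketch, expensiveCheck, checkCalls and acceptedViaTwoPhase are hypothetical names made up for illustration, not Lucene classes and not the code in the attached patch. It only shows the caching idea: remember the last doc for which the expensive check returned true, and separately the last doc accepted through the two phase path.

```java
import java.util.function.IntPredicate;

// Hypothetical sub-iterator: positioned on a doc, with an expensive confirmation step.
class SubSketch {
    final IntPredicate expensiveCheck; // stands in for the real position matching
    int doc = -1;                      // current doc of the approximation
    int lastMatchedDoc = -1;           // last doc for which the check returned true
    int checkCalls = 0;                // counts expensive calls, for illustration only

    SubSketch(IntPredicate expensiveCheck) {
        this.expensiveCheck = expensiveCheck;
    }

    boolean matches() {
        if (doc == lastMatchedDoc) {
            return true;               // already accepted: skip the expensive check
        }
        checkCalls++;
        if (expensiveCheck.test(doc)) {
            lastMatchedDoc = doc;
            return true;
        }
        return false;
    }
}

// Hypothetical disjunction over sub-iterators.
class OrSketch {
    final SubSketch[] subs;
    int lastTwoPhaseMatchedDoc = -1;   // last doc accepted via the two phase path

    OrSketch(SubSketch... subs) {
        this.subs = subs;
    }

    boolean twoPhaseCurrentDocMatches(int doc) {
        for (SubSketch s : subs) {
            if (s.doc == doc && s.matches()) {
                lastTwoPhaseMatchedDoc = doc;
                return true;
            }
        }
        return false;
    }

    // True only when this doc was accepted through twoPhaseCurrentDocMatches.
    boolean acceptedViaTwoPhase(int doc) {
        return doc == lastTwoPhaseMatchedDoc;
    }
}
```

In this sketch a doc that failed the check is not cached, so it would be checked again if asked twice; only accepted docs are remembered.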

bq. Can we still keep asTwoPhaseIterator on scorer/spans?

For the SpanOr patch I found there was a second place that needs an 
implementation with instanceof checks.
This second place is DisiWrapper for disjunctions; the existing one is in 
ConjunctionDISI.
So I thought about where to put this in a single place, and that place is 
DocIdSetIterator.
The patch implementation with instanceof checks is just writing out the 
inheritance; it would be simpler to just return null by default 
and leave the rest to inheritance, and/or leave the method abstract.
Since there are other implications, the question is: is there a better place to 
put this?
The situation has changed: it is now not only Scorer but also Spans that have 
this.
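
To make the "return null by default" option concrete, here is a rough sketch under stated assumptions. IteratorSketch, ConfirmSketch, PlainIterator, PhraseLikeIterator and Caller are hypothetical stand-ins invented for illustration; they are not DocIdSetIterator, Scorer or Spans, and this is not the patch. The point is only that a caller asks the base class and needs no instanceof checks.

```java
// Hypothetical confirmation step of a two phase iterator.
abstract class ConfirmSketch {
    abstract boolean matches();
}

// Hypothetical common base class for doc id iterators.
abstract class IteratorSketch {
    abstract int docID();

    // Default: no two phase support. Subclasses override this to opt in.
    ConfirmSketch asTwoPhaseIterator() {
        return null;
    }
}

// An iterator without a two phase view: inherits the null default.
class PlainIterator extends IteratorSketch {
    @Override
    int docID() {
        return 0;
    }
}

// An iterator that exposes a two phase view by overriding the default.
class PhraseLikeIterator extends IteratorSketch {
    @Override
    int docID() {
        return 0;
    }

    @Override
    ConfirmSketch asTwoPhaseIterator() {
        return new ConfirmSketch() {
            @Override
            boolean matches() {
                return true; // placeholder for the expensive check
            }
        };
    }
}

class Caller {
    // One call on the base type replaces per-class instanceof checks.
    static boolean needsConfirmation(IteratorSketch it) {
        return it.asTwoPhaseIterator() != null;
    }
}
```

Making the method abstract instead would force every subclass to decide explicitly, at the cost of touching all existing implementations.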


was (Author: [email protected]):
Indeed, two phase doc id iteration for SpanOr is not simple. I think I'm 
getting there though. It needs two tests that I think I have seen before, but 
I could not find where:
One is to avoid calling an approximation.match() again when match is accepted. 
This can be done by keeping the last doc for which approximation.match() 
returned true.
The other is to distinguish between approximation+acceptance and normal 
acceptance of a matching doc. This can be done by keeping the last doc for 
which twoPhaseCurrentDocMatches returned true.
This is still cooking, and I don't expect to finish this very soon.

bq. Can we still keep asTwoPhaseIterator on scorer/spans?

For the SpanOr patch I found there was a second place that needs an 
implementation with instanceof checks.
This second place is DisiWrapper for disjunctions, the existing one is in 
ConjunctionDISI.
So I thought about where to put this in a single place, and that place is 
DocIdSetIterator.
The patch implementation with instanceof checks is just writing out the 
inheritance, it would be simpler to just return null by default,
and leave the rest to inheritance.
Since there are other implications, the question is: is there a better place to 
put this?
The situation has changed, it is now not only Scorer but also Spans that have 
this.

> Complete two phase doc id iteration support for Spans
> -----------------------------------------------------
>
>                 Key: LUCENE-6373
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6373
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Paul Elschot
>         Attachments: LUCENE-6373-SpanOr.patch
>
>
> Spin off from LUCENE-6308, see comments there from about 23 March 2015.



