[jira] Commented: (LUCENE-2529) always apply position increment gap between values

David Smiley (JIRA) Wed, 06 Oct 2010 09:29:57 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918576#action_12918576
 ]


David Smiley commented on LUCENE-2529:
--------------------------------------

Yes, the check could move outside the loop.  I don't think so for the "correct 
< 0" but whatever.  And the Highlight test fails because the test is based off 
of a multi-valued field in which the first value is blank, thus triggering the 
subsequent positions to be off by one from the former behavior.
A small readability improvement might be to set fieldState.position to -1 
instead of decrementing it because that is exactly the intended effect, which 
is more clear.

FYI the observable change in behavior in this patch principally results from 
the "firstToken" boolean part of the condition in which the position is 
decremented.  Remove that part of the condition leaving simply i==0 and you get 
the former/existing behavior.

Ugh, I've thought about this issue way too much and I'm suddenly having a 
change of opinion.  Arguably the current effect of positions is more 
consistent... it is as if the starting position (prior to looking at the 
position increment) is -1, no matter whether the first value has tokens or not. 
 I'm going to achieve my objectives for my project via your suggestion of a 
single field with a value separator and a custom analyzer.  It's a little hacky 
but I'll then have full control over the positions and the field isn't stored 
any way.  I wish I had the ability to control positions in the indexer here a 
bit more but the support really isn't there and won't be until perhaps 
LUCENE-2450, the issue Mike referenced.

As far as my patch goes, with the small modification to get the existing 
behavior, I think it's an improvement to readability over the current code.  I 
can submit an update if desired. 

> always apply position increment gap between values
> --------------------------------------------------
>
>                 Key: LUCENE-2529
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2529
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.9.3, 3.0.2, 3.1, 4.0
>         Environment: (I don't know which version to say this affects since 
> it's some quasi trunk release and the new versioning scheme confuses me.)
>            Reporter: David Smiley
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: 
> LUCENE-2529_always_apply_position_increment_gap_between_values.patch, 
> LUCENE-2529_nonsenseIncrements.patch, 
> LUCENE-2529_skip_posIncr_for_1st_token.patch, 
> LUCENE-2529_skip_posIncr_for_1st_token.patch, 
> LUCENE-2529_skip_posIncr_for_1st_token.patch, LUCENE-2529_test.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I'm doing some fancy stuff with span queries that is very sensitive to term 
> positions.  I discovered that the position increment gap on indexing is only 
> applied between values when there are existing terms indexed for the 
> document.  I suspect this logic wasn't deliberate, it's just how its always 
> been for no particular reason.  I think it should always apply the gap 
> between fields.  Reference DocInverterPerField.java line 82:
> if (fieldState.length > 0)
>           fieldState.position += 
> docState.analyzer.getPositionIncrementGap(fieldInfo.name);
> This is checking fieldState.length.  I think the condition should simply be:  
> if (i > 0).
> I don't think this change will affect anyone at all but it will certainly 
> help me.  Presently, I can either change this line in Lucene, or I can put in 
> a hack so that the first value for the document is some dummy value which is 
> wasteful.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] Commented: (LUCENE-2529) always apply position increment gap between values

Reply via email to