[ 
https://issues.apache.org/jira/browse/LUCENE-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-2529:
---------------------------------

    Attachment: LUCENE-2529_skip_posIncr_for_1st_token.patch

I think you're right, Rob; I hadn't thought of that at all.  I altered the patch 
further and included your test as part of it.  I also extended the existing 
test I had patched so that it covers a leading stop word, which is really the 
gist of what your patch draws attention to.  The tests pass.  
This patch retains the spirit of the earlier patch, but it honors the first 
position increment, expecting it to be >= 1.  If it's 0, then 
fieldState.position would end up being -1, but there's a quick check that 
corrects it to 0.  That suits my needs, and I think the overall result is 
clearer and more consistent than the existing code and behavior.
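
To make the first-token handling above concrete, here is a minimal sketch in plain Java.  It is not the patch itself and uses no Lucene classes: the class name, the method name, and the starting value of -1 are assumptions chosen only to mirror the behavior described (the first increment is honored, and a first increment of 0 is corrected so the position never goes below 0).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; none of these names come from Lucene or the patch.
class FirstTokenPositionSketch {

    /** Returns the position assigned to each token, given each token's position increment. */
    public static List<Integer> assignPositions(int[] posIncrs) {
        List<Integer> positions = new ArrayList<>();
        int position = -1; // assumed "before the first token" state
        for (int posIncr : posIncrs) {
            position += posIncr; // the increment is honored, including for the first token
            if (position < 0) {
                position = 0; // a first increment of 0 would leave -1; correct it to 0
            }
            positions.add(position);
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(assignPositions(new int[] {1, 1})); // ordinary first token
        System.out.println(assignPositions(new int[] {2, 1})); // leading stop word was removed
        System.out.println(assignPositions(new int[] {0, 1})); // first increment of 0
    }
}
```

In this model a leading stop word (first increment of 2) shifts the first indexed token to position 1 instead of 0, which is the case the extended test is meant to cover.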

> always apply position increment gap between values
> --------------------------------------------------
>
>                 Key: LUCENE-2529
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2529
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.9.3, 3.0.2, 3.1, 4.0
>         Environment: (I don't know which version to say this affects since 
> it's some quasi trunk release and the new versioning scheme confuses me.)
>            Reporter: David Smiley
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.1, 4.0
>
>         Attachments: 
> LUCENE-2529_always_apply_position_increment_gap_between_values.patch, 
> LUCENE-2529_skip_posIncr_for_1st_token.patch, 
> LUCENE-2529_skip_posIncr_for_1st_token.patch, 
> LUCENE-2529_skip_posIncr_for_1st_token.patch, LUCENE-2529_test.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> I'm doing some fancy stuff with span queries that is very sensitive to term 
> positions.  I discovered that the position increment gap on indexing is only 
> applied between values when there are existing terms indexed for the 
> document.  I suspect this logic wasn't deliberate; it's just how it's always 
> been for no particular reason.  I think it should always apply the gap 
> between fields.  Reference DocInverterPerField.java line 82:
> if (fieldState.length > 0)
>   fieldState.position += docState.analyzer.getPositionIncrementGap(fieldInfo.name);
> This is checking fieldState.length.  I think the condition should simply be:  
> if (i > 0).
> I don't think this change will affect anyone at all, but it will certainly 
> help me.  Presently, I can either change this line in Lucene, or I can put in 
> a hack so that the first value for the document is some dummy value, which is 
> wasteful.
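
The difference between the two conditions quoted above only shows up when an earlier value contributes no terms.  The following is a rough sketch in plain Java, not Lucene code: the class, the method, and the simplification that every token has a position increment of 1 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; it only imitates the two conditions discussed in the issue.
class PositionGapSketch {

    /**
     * Returns the start position of each value of a multi-valued field.
     * tokenCounts[i] is the number of tokens value i produces (0 for an empty value).
     * When gapOnlyWhenIndexed is true, the gap is applied only if earlier tokens exist
     * (like the fieldState.length > 0 check); otherwise it is applied whenever i > 0.
     */
    public static List<Integer> valueStarts(int[] tokenCounts, int gap, boolean gapOnlyWhenIndexed) {
        List<Integer> starts = new ArrayList<>();
        int position = 0; // next position to assign
        int length = 0;   // number of tokens indexed so far
        for (int i = 0; i < tokenCounts.length; i++) {
            boolean applyGap = gapOnlyWhenIndexed ? length > 0 : i > 0;
            if (applyGap) {
                position += gap;
            }
            starts.add(position);
            position += tokenCounts[i]; // simplification: each token advances the position by 1
            length += tokenCounts[i];
        }
        return starts;
    }

    public static void main(String[] args) {
        int[] tokenCounts = {0, 2, 1}; // the first value yields no terms
        System.out.println(valueStarts(tokenCounts, 100, true));  // length-based condition
        System.out.println(valueStarts(tokenCounts, 100, false)); // index-based condition
    }
}
```

With a gap of 100 and an empty first value, the length-based check starts the second value at position 0, while the index-based check starts it at 100.  When every value produces at least one term, the two conditions give the same positions in this model.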

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

