[ 
https://issues.apache.org/jira/browse/LUCENE-3717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir resolved LUCENE-3717.
---------------------------------

    Resolution: Fixed

I committed this. I will go through the analyzers and try to make sure they are 
all using checkRandomData (I think most are), just to see if we have any other 
bugs sitting out there.

It would be nice to have these offsets all under control for the next release.
                
> Add fake charfilter to BaseTokenStreamTestCase to find offsets bugs
> -------------------------------------------------------------------
>
>                 Key: LUCENE-3717
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3717
>             Project: Lucene - Java
>          Issue Type: Task
>            Reporter: Robert Muir
>             Fix For: 3.6, 4.0
>
>         Attachments: LUCENE-3717.patch
>
>
> Recently lots of issues about broken offsets have been fixed, but it would be
> nice to improve the test coverage and check that offsets work across the board
> (especially with charfilters).
>
> In BaseTokenStreamTestCase.checkRandomData, we can sometimes pass the analyzer
> a reader wrapped in a "MockCharFilter" (the one in the patch sometimes doubles
> characters). If the analyzer does not call correctOffset() or does incorrect
> "offset math" (LUCENE-3642, etc.), then eventually this will produce invalid
> offsets and the test will fail.
>
> Other than test bugs, this found 2 real bugs: ICUTokenizer did not call
> correctOffset() in its end(), and ThaiWordFilter did incorrect offset math.
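
The idea in the description above can be sketched without any Lucene
dependency. The class below is a hypothetical stand-in, not the MockCharFilter
from the patch: it doubles one chosen character, records how filtered offsets
map back to the original text, and exposes a correctOffset method named after
the call mentioned in the issue. The tokenize and offsetsAreValid helpers are
also assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a character-doubling char filter; not Lucene code. */
final class DoublingFilterSketch {
    private final String filtered;
    // originalOffsets[i] is the original-text offset for filtered offset i.
    // One extra slot lets the end-of-text offset be mapped as well.
    private final int[] originalOffsets;

    DoublingFilterSketch(String original, char doubled) {
        StringBuilder out = new StringBuilder();
        List<Integer> map = new ArrayList<>();
        for (int i = 0; i < original.length(); i++) {
            char c = original.charAt(i);
            out.append(c);
            map.add(i);
            if (c == doubled) {
                out.append(c); // the inserted copy shifts every later offset by one
                map.add(i);    // and maps back to the character it duplicates
            }
        }
        map.add(original.length());
        filtered = out.toString();
        originalOffsets = new int[map.size()];
        for (int i = 0; i < originalOffsets.length; i++) {
            originalOffsets[i] = map.get(i);
        }
    }

    String filteredText() {
        return filtered;
    }

    /** Maps an offset in the filtered text back to the original text. */
    int correctOffset(int filteredOffset) {
        return originalOffsets[filteredOffset];
    }

    /** Splits the filtered text on spaces and returns {start, end} pairs. */
    static List<int[]> tokenize(DoublingFilterSketch f, boolean correct) {
        String text = f.filteredText();
        List<int[]> offsets = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            while (i < text.length() && text.charAt(i) == ' ') i++;
            int start = i;
            while (i < text.length() && text.charAt(i) != ' ') i++;
            if (i > start) {
                offsets.add(correct
                        ? new int[] {f.correctOffset(start), f.correctOffset(i)}
                        : new int[] {start, i});
            }
        }
        return offsets;
    }

    /** Offsets must be ordered, non-negative in length, and inside the original text. */
    static boolean offsetsAreValid(List<int[]> offsets, int originalLength) {
        int lastStart = 0;
        for (int[] o : offsets) {
            if (o[0] < lastStart || o[1] < o[0] || o[1] > originalLength) return false;
            lastStart = o[0];
        }
        return true;
    }

    public static void main(String[] args) {
        String original = "ab cd";
        DoublingFilterSketch f = new DoublingFilterSketch(original, 'b');
        System.out.println("uncorrected valid: "
                + offsetsAreValid(tokenize(f, false), original.length()));
        System.out.println("corrected valid:   "
                + offsetsAreValid(tokenize(f, true), original.length()));
    }
}
```

With the input "ab cd" and 'b' doubled, the uncorrected tokenizer reports an
end offset past the end of the original text, which is the kind of failure the
random test is meant to surface. Calling correctOffset on both ends brings the
offsets back inside the original text.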
