[ 
https://issues.apache.org/jira/browse/LUCENE-3911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Muir updated LUCENE-3911:
--------------------------------

    Attachment: LUCENE-3911_more.patch

Trivial patch: it forces us to pass minLength to randomRealistic as well, so in 
that case we get whole words in the same Unicode block (good for stemmers). 
It also sometimes uses randomRegexpIshString, so we get lots of punctuation 
(good for tokenizers, filters, etc.).
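
To illustrate the idea (this is not the code in the patch), here is a rough, self-contained sketch of the two kinds of input described above: words drawn from a single Unicode block with a minimum length, mixed with occasional punctuation-heavy tokens. The class RandomTestStrings, its methods, the three letter ranges, and the one-in-four mixing ratio are all hypothetical names and values chosen for this example. They are not the signatures or behavior of randomRealistic or randomRegexpIshString.

```java
import java.util.Random;

// Hypothetical helper for illustration only; not part of Lucene's test framework.
final class RandomTestStrings {

    // Assumption: three hand-picked letter ranges stand in for "Unicode blocks".
    private static final int[][] LETTER_RANGES = {
        {0x0061, 0x007A}, // Basic Latin, lowercase a-z
        {0x03B1, 0x03C9}, // Greek, lowercase alpha-omega
        {0x0430, 0x044F}, // Cyrillic, lowercase a-ya
    };

    // Assumption: a small set of punctuation and regexp-like characters, no whitespace.
    private static final String PUNCTUATION = ".,;:!?()[]{}*+|\\-\"'";

    // Returns a word of minLength..maxLength letters, all taken from one range,
    // so the whole word stays inside a single Unicode block.
    static String realisticWord(Random random, int minLength, int maxLength) {
        int[] range = LETTER_RANGES[random.nextInt(LETTER_RANGES.length)];
        int length = minLength + random.nextInt(maxLength - minLength + 1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            sb.appendCodePoint(range[0] + random.nextInt(range[1] - range[0] + 1));
        }
        return sb.toString();
    }

    // Returns 1..maxLength punctuation characters to stress tokenizers and filters.
    static String punctuationHeavy(Random random, int maxLength) {
        int length = 1 + random.nextInt(maxLength);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < length; i++) {
            sb.append(PUNCTUATION.charAt(random.nextInt(PUNCTUATION.length())));
        }
        return sb.toString();
    }

    // Joins tokenCount tokens with single spaces, so a whitespace tokenizer
    // sees exactly tokenCount tokens. Roughly one token in four is punctuation.
    static String manyTokens(Random random, int tokenCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokenCount; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            if (random.nextInt(4) == 0) {
                sb.append(punctuationHeavy(random, 5));
            } else {
                sb.append(realisticWord(random, 3, 10));
            }
        }
        return sb.toString();
    }
}
```

Calling RandomTestStrings.manyTokens(new Random(seed), 20) with a fixed seed gives a reproducible 20-token string. Taking the Random as a parameter is a design choice of this sketch, so that a failing input can be regenerated from its seed.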
                
> improve BaseTokenStreamTestCase random string generation
> --------------------------------------------------------
>
>                 Key: LUCENE-3911
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3911
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: general/test
>    Affects Versions: 3.6, 4.0
>            Reporter: Robert Muir
>         Attachments: LUCENE-3911.patch, LUCENE-3911.patch, 
> LUCENE-3911_more.patch
>
>
> Most analysis tests use MockTokenizer (which splits on whitespace), but
> it's rare that we generate a string with many tokens. So I think we should
> try to generate more realistic test strings.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

