[ 
https://issues.apache.org/jira/browse/LUCENE-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983179#comment-14983179
 ] 

Michael McCandless commented on LUCENE-6866:
--------------------------------------------

Thanks for confirming [~steve_rowe]!

> TestCheckIndex.testChecksumsOnlyVerbose() failure: Document contains at least 
> one immense term
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6866
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6866
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Steve Rowe
>            Assignee: Michael McCandless
>             Fix For: Trunk, 5.4
>
>
> My Jenkins found the following seed that reproduces for me 100% on branch_5x, 
> both Java7&8, and on trunk:
> {noformat}
>    [junit4] Suite: org.apache.lucene.index.TestCheckIndex
>    [junit4]   2> NOTE: download the large Jenkins line-docs file by running 
> 'ant get-jenkins-line-docs' in the lucene directory.
>    [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestCheckIndex 
> -Dtests.method=testChecksumsOnlyVerbose -Dtests.seed=1B39BC3F6E1634F 
> -Dtests.slow=true 
> -Dtests.linedocsfile=/home/jenkins/lucene-data/enwiki.random.lines.txt 
> -Dtests.locale=hu -Dtests.timezone=America/Havana -Dtests.asserts=true 
> -Dtests.file.encoding=UTF-8
>    [junit4] ERROR   0.22s | TestCheckIndex.testChecksumsOnlyVerbose <<<
>    [junit4]    > Throwable #1: java.lang.IllegalArgumentException: Document 
> contains at least one immense term in field="body" (whose UTF8 encoding is 
> longer than the max length 32766), all of which were skipped.  Please correct 
> the analyzer to not produce such terms.  The prefix of the first immense term 
> is: '[125, 125, 123, 123, 123, 123, 123, 115, 117, 98, 115, 116, 99, 124, 
> 125, 125, 125, 123, 123, 123, 49, 125, 125, 125, 124, 123, 123, 123, 112, 
> 49]...', original message: bytes can be at most 32766 in length; got 94384
>    [junit4]    >        at 
> __randomizedtesting.SeedInfo.seed([1B39BC3F6E1634F:C5FF0BD55AE404AD]:0)
>    [junit4]    >        at 
> org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:726)
>    [junit4]    >        at 
> org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:347)
>    [junit4]    >        at 
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:234)
>    [junit4]    >        at 
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:449)
>    [junit4]    >        at 
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1461)
>    [junit4]    >        at 
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1240)
>    [junit4]    >        at 
> org.apache.lucene.index.TestCheckIndex.testChecksumsOnlyVerbose(TestCheckIndex.java:156)
>    [junit4]    >        at java.lang.Thread.run(Thread.java:745)
>    [junit4]    >        Suppressed: java.lang.IllegalStateException: close() 
> called in wrong state: INCREMENT
>    [junit4]    >                at 
> org.apache.lucene.analysis.MockTokenizer.fail(MockTokenizer.java:126)
>    [junit4]    >                at 
> org.apache.lucene.analysis.MockTokenizer.close(MockTokenizer.java:293)
>    [junit4]    >                at 
> org.apache.lucene.analysis.TokenFilter.close(TokenFilter.java:58)
>    [junit4]    >                at 
> org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:742)
>    [junit4]    >                ... 42 more
>    [junit4]    > Caused by: 
> org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes 
> can be at most 32766 in length; got 94384
>    [junit4]    >        at 
> org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:284)
>    [junit4]    >        at 
> org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:150)
>    [junit4]    >        at 
> org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:716)
>    [junit4]    >        ... 42 more
>    [junit4]   2> NOTE: test params are: codec=Asserting(Lucene54): {}, 
> docValues:{}, sim=ClassicSimilarity, locale=hu, timezone=America/Havana
>    [junit4]   2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation 
> 1.8.0_45 (64-bit)/cpus=16,threads=1,free=412272632,total=514850816
>    [junit4]   2> NOTE: All tests run in this JVM: [TestCheckIndex]
>    [junit4] Completed [1/1] in 0.45s, 1 test, 1 error <<< FAILURES!
> {noformat}
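
For reference, the byte prefix printed in the exception above can be decoded, and the length check it reports can be approximated, with a small standalone sketch. This is not Lucene's code: the names ImmenseTermCheck, decodePrefix and isImmense are hypothetical, and the 32766 limit is copied from the exception message rather than read from any Lucene constant.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of Lucene.
final class ImmenseTermCheck {
    // Limit quoted in the exception message ("max length 32766").
    static final int MAX_TERM_BYTES = 32766;

    // The 30-byte prefix of the first immense term, as printed in the log.
    static final byte[] PREFIX = {
        125, 125, 123, 123, 123, 123, 123, 115, 117, 98, 115, 116, 99, 124,
        125, 125, 125, 123, 123, 123, 49, 125, 125, 125, 124, 123, 123, 123, 112,
        49
    };

    // Decode the logged prefix bytes as UTF-8 text.
    static String decodePrefix() {
        return new String(PREFIX, StandardCharsets.UTF_8);
    }

    // True if the term's UTF-8 encoding is longer than the quoted limit.
    static boolean isImmense(String term) {
        return term.getBytes(StandardCharsets.UTF_8).length > MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(decodePrefix());
        // Build an ASCII term of the length reported in the log (94384 bytes).
        char[] chars = new char[94384];
        java.util.Arrays.fill(chars, 'a');
        System.out.println(isImmense(new String(chars)));
    }
}
```

The prefix decodes to }}{{{{{substc|}}}{{{1}}}|{{{p1, which looks like wiki template markup. That suggests the line-docs file contained a long unbroken run of such markup that the analyzer emitted as a single token.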



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
