[
https://issues.apache.org/jira/browse/LUCENE-6866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983179#comment-14983179
]
Michael McCandless commented on LUCENE-6866:
--------------------------------------------
Thanks for confirming [~steve_rowe]!
> TestCheckIndex.testChecksumsOnlyVerbose() failure: Document contains at least
> one immense term
> ----------------------------------------------------------------------------------------------
>
> Key: LUCENE-6866
> URL: https://issues.apache.org/jira/browse/LUCENE-6866
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Steve Rowe
> Assignee: Michael McCandless
> Fix For: Trunk, 5.4
>
>
> My Jenkins found the following seed that reproduces for me 100% on branch_5x,
> both Java7&8, and on trunk:
> {noformat}
> [junit4] Suite: org.apache.lucene.index.TestCheckIndex
> [junit4] 2> NOTE: download the large Jenkins line-docs file by running
> 'ant get-jenkins-line-docs' in the lucene directory.
> [junit4] 2> NOTE: reproduce with: ant test -Dtestcase=TestCheckIndex
> -Dtests.method=testChecksumsOnlyVerbose -Dtests.seed=1B39BC3F6E1634F
> -Dtests.slow=true
> -Dtests.linedocsfile=/home/jenkins/lucene-data/enwiki.random.lines.txt
> -Dtests.locale=hu -Dtests.timezone=America/Havana -Dtests.asserts=true
> -Dtests.file.encoding=UTF-8
> [junit4] ERROR 0.22s | TestCheckIndex.testChecksumsOnlyVerbose <<<
> [junit4] > Throwable #1: java.lang.IllegalArgumentException: Document
> contains at least one immense term in field="body" (whose UTF8 encoding is
> longer than the max length 32766), all of which were skipped. Please correct
> the analyzer to not produce such terms. The prefix of the first immense term
> is: '[125, 125, 123, 123, 123, 123, 123, 115, 117, 98, 115, 116, 99, 124,
> 125, 125, 125, 123, 123, 123, 49, 125, 125, 125, 124, 123, 123, 123, 112,
> 49]...', original message: bytes can be at most 32766 in length; got 94384
> [junit4] > at
> __randomizedtesting.SeedInfo.seed([1B39BC3F6E1634F:C5FF0BD55AE404AD]:0)
> [junit4] > at
> org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:726)
> [junit4] > at
> org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:347)
> [junit4] > at
> org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:234)
> [junit4] > at
> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:449)
> [junit4] > at
> org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1461)
> [junit4] > at
> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1240)
> [junit4] > at
> org.apache.lucene.index.TestCheckIndex.testChecksumsOnlyVerbose(TestCheckIndex.java:156)
> [junit4] > at java.lang.Thread.run(Thread.java:745)
> [junit4] > Suppressed: java.lang.IllegalStateException: close()
> called in wrong state: INCREMENT
> [junit4] > at
> org.apache.lucene.analysis.MockTokenizer.fail(MockTokenizer.java:126)
> [junit4] > at
> org.apache.lucene.analysis.MockTokenizer.close(MockTokenizer.java:293)
> [junit4] > at
> org.apache.lucene.analysis.TokenFilter.close(TokenFilter.java:58)
> [junit4] > at
> org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:742)
> [junit4] > ... 42 more
> [junit4] > Caused by:
> org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes
> can be at most 32766 in length; got 94384
> [junit4] > at
> org.apache.lucene.util.BytesRefHash.add(BytesRefHash.java:284)
> [junit4] > at
> org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:150)
> [junit4] > at
> org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:716)
> [junit4] > ... 42 more
> [junit4] 2> NOTE: test params are: codec=Asserting(Lucene54): {},
> docValues:{}, sim=ClassicSimilarity, locale=hu, timezone=America/Havana
> [junit4] 2> NOTE: Linux 4.1.0-custom2-amd64 amd64/Oracle Corporation
> 1.8.0_45 (64-bit)/cpus=16,threads=1,free=412272632,total=514850816
> [junit4] 2> NOTE: All tests run in this JVM: [TestCheckIndex]
> [junit4] Completed [1/1] in 0.45s, 1 test, 1 error <<< FAILURES!
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]