Is the character processing here all done by the charfilter, or does
it use some encoding methods from the JDK?

When I looked at it, it looked like a JVM bug.
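
For reference, here is a minimal sketch of the surrogate arithmetic, using
only java.lang.Character. It is not the charfilter's code; the class and
method names are made up for illustration. The decimal references &#55404;
and &#57999; in the quoted test input are the code units 0xD86C and 0xE28F.
A strict decoder should refuse to pair them, because 0xE28F is not a low
surrogate. Character.toCodePoint does no validation, so applying it to the
same two units yields U+2B68F, which is the value in the Linux failure.
That would be consistent with the low-surrogate range check being skipped,
though it does not prove it.

```java
public class SurrogateCheck {
    static final char REPLACEMENT = '\uFFFD';

    /** Strict pairing: combine only a real high+low pair; replace lone surrogates. */
    static int[] decodeStrict(char first, char second) {
        if (Character.isHighSurrogate(first) && Character.isLowSurrogate(second)) {
            return new int[] { Character.toCodePoint(first, second) };
        }
        int a = Character.isSurrogate(first) ? REPLACEMENT : first;
        int b = Character.isSurrogate(second) ? REPLACEMENT : second;
        return new int[] { a, b };
    }

    /** Unchecked combination: Character.toCodePoint does not validate its arguments. */
    static int combineUnchecked(char first, char second) {
        return Character.toCodePoint(first, second);
    }

    public static void main(String[] args) {
        char first = (char) 55404;   // &#55404; = 0xD86C, a high surrogate
        char second = (char) 57999;  // &#57999; = 0xE28F, private use, not a surrogate

        StringBuilder strict = new StringBuilder();
        for (int cp : decodeStrict(first, second)) {
            strict.append(String.format("U+%04X ", cp));
        }
        System.out.println("strict:    " + strict.toString().trim());  // U+FFFD U+E28F
        System.out.println("unchecked: "
            + String.format("U+%04X", combineUnchecked(first, second)));  // U+2B68F
    }
}
```

The strict result matches what the test expects; the unchecked result
matches what the Linux build produced.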

On Sun, Nov 23, 2014 at 1:04 PM, Steve Rowe <[email protected]> wrote:
> This is the same line in the same test that failed on Windows under a
> 1.8.0_20 JVM five days ago
> <http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4439/>, but in a
> different way.
>
> This test's input is the string "&#55404;&#57999;" - HTML character
> references for U+D86C U+E28F - and the expected output is the char sequence
> U+FFFD U+E28F (the Unicode replacement character followed by the second
> input char).
>
> In the Windows failure, the output was U+D86D U+E28F (improperly paired high
> surrogate).
>
> In this Linux failure, the output is U+2B68F (properly paired UTF-16 U+D86D
> U+DE8F).
>
> Very weird.
>
> I'm beasting this suite now on Windows under Oracle JVM 1.8.0_20 to see if I
> can get it to fail.  No dice so far after 140 trials.
>
>
> On Sun, Nov 23, 2014 at 6:19 AM, Policeman Jenkins Server
> <[email protected]> wrote:
>>
>> Build: http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-Linux/11492/
>> Java: 32bit/jdk1.8.0_20 -server -XX:+UseParallelGC (asserts: false)
>>
>> 1 tests failed.
>> FAILED:
>> org.apache.lucene.analysis.charfilter.HTMLStripCharFilterTest.testUTF16Surrogates
>>
>> Error Message:
>> term 0 expected:<[�]> but was:<[𫚏]>
>>
>> Stack Trace:
>> org.junit.ComparisonFailure: term 0 expected:<[�]> but was:<[𫚏]>
>>         at __randomizedtesting.SeedInfo.seed([CF8F65E969B602B9:93CFDF3CEB58ED83]:0)
>>         at org.junit.Assert.assertEquals(Assert.java:125)
>>         at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:180)
>>         at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:295)
>>         at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:299)
>>         at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:303)
>>         at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:353)
>>         at org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:362)
>>         at org.apache.lucene.analysis.charfilter.HTMLStripCharFilterTest.testUTF16Surrogates(HTMLStripCharFilterTest.java:600)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:483)
>>         at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618)
>>         at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827)
>>         at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863)
>>         at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877)
>>         at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50)
>>         at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>>         at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>>         at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49)
>>         at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>>         at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>         at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
>>         at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798)
>>         at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458)
>>         at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836)
>>         at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738)
>>         at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772)
>>         at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783)
>>         at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46)
>>         at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42)
>>         at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55)
>>         at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>>         at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39)
>>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>         at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:54)
>>         at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48)
>>         at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65)
>>         at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:55)
>>         at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
>>         at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>>
>>
>>
>> Build Log:
>> [...truncated 5753 lines...]
>>    [junit4] Suite:
>> org.apache.lucene.analysis.charfilter.HTMLStripCharFilterTest
>>    [junit4]   2> NOTE: reproduce with: ant test
>> -Dtestcase=HTMLStripCharFilterTest -Dtests.method=testUTF16Surrogates
>> -Dtests.seed=CF8F65E969B602B9 -Dtests.multiplier=3 -Dtests.slow=true
>> -Dtests.locale=th_TH -Dtests.timezone=PLT -Dtests.asserts=false
>> -Dtests.file.encoding=UTF-8
>>    [junit4] FAILURE 0.07s J0 | HTMLStripCharFilterTest.testUTF16Surrogates
>> <<<
>>    [junit4]    > Throwable #1: org.junit.ComparisonFailure: term 0
>> expected:<[�]> but was:<[𫚏]>
>>    [junit4]    >        at
>> __randomizedtesting.SeedInfo.seed([CF8F65E969B602B9:93CFDF3CEB58ED83]:0)
>>    [junit4]    >        at
>> org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:180)
>>    [junit4]    >        at
>> org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:295)
>>    [junit4]    >        at
>> org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:299)
>>    [junit4]    >        at
>> org.apache.lucene.analysis.BaseTokenStreamTestCase.assertTokenStreamContents(BaseTokenStreamTestCase.java:303)
>>    [junit4]    >        at
>> org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:353)
>>    [junit4]    >        at
>> org.apache.lucene.analysis.BaseTokenStreamTestCase.assertAnalyzesTo(BaseTokenStreamTestCase.java:362)
>>    [junit4]    >        at
>> org.apache.lucene.analysis.charfilter.HTMLStripCharFilterTest.testUTF16Surrogates(HTMLStripCharFilterTest.java:600)
>>    [junit4]    >        at java.lang.Thread.run(Thread.java:745)
>>    [junit4]   2> NOTE: test params are: codec=Asserting(Lucene50):
>> {dummy=BlockTreeOrds(blocksize=128)}, docValues:{}, sim=DefaultSimilarity,
>> locale=th_TH, timezone=PLT
>>    [junit4]   2> NOTE: Linux 3.13.0-39-generic i386/Oracle Corporation
>> 1.8.0_20 (32-bit)/cpus=8,threads=1,free=88329216,total=222035968
>>    [junit4]   2> NOTE: All tests run in this JVM:
>> [TestPatternReplaceCharFilter, TestArabicNormalizationFilter,
>> TestPatternReplaceCharFilterFactory, TestWikipediaTokenizerFactory,
>> TestCondition2, TestIrishLowerCaseFilterFactory, TestGalicianStemFilter,
>> TestWordlistLoader, TestElisionFilterFactory, TestLengthFilter,
>> TestGermanLightStemFilterFactory, EdgeNGramTokenFilterTest,
>> TestSerbianNormalizationFilterFactory, TestPortugueseLightStemFilter,
>> TestSwedishLightStemFilterFactory, TestPatternReplaceFilterFactory,
>> TestElision, TestCzechStemFilterFactory, TestSpanishLightStemFilter,
>> TestSingleTokenTokenFilter, TestHindiStemmer, TestKeepWordFilter,
>> TestLimitTokenCountFilter, TestShingleFilterFactory, TestTrimFilter,
>> TestCapitalizationFilterFactory, TestFactories,
>> TestGalicianMinimalStemFilterFactory, TestFlagLong, TestIgnore,
>> TestGermanMinimalStemFilterFactory, TestUAX29URLEmailTokenizerFactory,
>> TestPatternCaptureGroupTokenFilter, TestAlternateCasing, TestCzechAnalyzer,
>> TestOnlyInCompound, TestPersianNormalizationFilter,
>> TestGermanNormalizationFilterFactory, WikipediaTokenizerTest,
>> TestMultiWordSynonyms, TestTruncateTokenFilter, TestPersianAnalyzer,
>> TestArabicAnalyzer, TestRemoveDuplicatesTokenFilter,
>> TestSoraniStemFilterFactory, TestPorterStemFilterFactory,
>> TestCodepointCountFilterFactory, TokenTypeSinkTokenizerTest,
>> TestSoraniAnalyzer, TestApostropheFilter, QueryAutoStopWordAnalyzerTest,
>> TestTwoSuffixes, TestScandinavianFoldingFilterFactory, TestArmenianAnalyzer,
>> TestFinnishAnalyzer, TestFlagNum, TestIndonesianStemmer,
>> TestLimitTokenCountAnalyzer, TestScandinavianNormalizationFilterFactory,
>> TestReversePathHierarchyTokenizer, TestGalicianMinimalStemFilter,
>> TestPersianNormalizationFilterFactory, TestNeedAffix,
>> TestGermanLightStemFilter, TestLimitTokenPositionFilterFactory,
>> TestStopFilterFactory, TestMappingCharFilter, HTMLStripCharFilterTest]
>>    [junit4] Completed on J0 in 2.12s, 31 tests, 1 failure <<< FAILURES!
>>
>> [...truncated 403 lines...]
>> BUILD FAILED
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/build.xml:525: The
>> following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/build.xml:473: The
>> following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/build.xml:61: The
>> following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/extra-targets.xml:39: The
>> following error occurred while executing this line:
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/build.xml:452: The
>> following error occurred while executing this line:
>>
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/common-build.xml:2141:
>> The following error occurred while executing this line:
>>
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/analysis/build.xml:106:
>> The following error occurred while executing this line:
>>
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/analysis/build.xml:38:
>> The following error occurred while executing this line:
>>
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/module-build.xml:58:
>> The following error occurred while executing this line:
>>
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/common-build.xml:1359:
>> The following error occurred while executing this line:
>>
>> /mnt/ssd/jenkins/workspace/Lucene-Solr-5.x-Linux/lucene/common-build.xml:966:
>> There were test failures: 270 suites, 1408 tests, 1 failure, 1 ignored
>>
>> Total time: 30 minutes 5 seconds
>> Build step 'Invoke Ant' marked build as failure
>> [description-setter] Description set: Java: 32bit/jdk1.8.0_20 -server
>> -XX:+UseParallelGC (asserts: false)
>> Archiving artifacts
>> Recording test results
>> Email was triggered for: Failure - Any
>> Sending email for trigger: Failure - Any
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>
>
