[ 
https://issues.apache.org/jira/browse/LUCENE-8692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784997#comment-16784997
 ] 

Hoss Man commented on LUCENE-8692:
----------------------------------

bq. (see also my nocommit comments about the existing tragicEvent() call in 
prepareCommitInternal() ... but that hadn't triggered any failures in the test 
so I hadn't touched it)

I spoke too soon -- beasting just turned up this interesting little situation...

{noformat}
hossman@tray:~/lucene/dev/lucene/core [master] $ ant beast -Dbeast.iters=100 
-Dtests.iters=100 -Dtestcase=TestStressIndexing2 
-Dtests.method=testRandomCorruptionIsTragic\*
...
 [beaster] Beast round 34 results: 
/home/hossman/lucene/dev/lucene/build/core/test/34
  [beaster] The following error occurred while executing this line:
  [beaster] /home/hossman/lucene/dev/lucene/common-build.xml:1572: The 
following error occurred while executing this line:
  [beaster] /home/hossman/lucene/dev/lucene/common-build.xml:1099: There were 
test failures: 1 suite, 100 tests, 1 failure [seed: CABE666E4674CFB2]
  [beaster] Executing 1 suite with 1 JVM.
  [beaster] 
  [beaster] Started J0 PID(10111@localhost).
  [beaster]   2> NOTE: reproduce with: ant test  -Dtestcase=TestStressIndexing2 
-Dtests.method=testRandomCorruptionIsTragic -Dtests.seed=CABE666E4674CFB2 
-Dtests.slow=true -Dtests.badapples=true -Dtests.locale=cs 
-Dtests.timezone=America/Nipigon -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
  [beaster] [15:50:16.736] FAILURE 0.02s | 
TestStressIndexing2.testRandomCorruptionIsTragic 
{seed=[CABE666E4674CFB2:682DC0F2BA2A235F]} <<<
  [beaster]    > Throwable #1: java.lang.AssertionError: index update 
encountered throwable, but no tragic event recorded: java.lang.AssertionError
  [beaster]    >        at 
__randomizedtesting.SeedInfo.seed([CABE666E4674CFB2:682DC0F2BA2A235F]:0)
  [beaster]    >        at org.junit.Assert.fail(Assert.java:88)
  [beaster]    >        at org.junit.Assert.assertTrue(Assert.java:41)
  [beaster]    >        at org.junit.Assert.assertNotNull(Assert.java:712)
  [beaster]    >        at 
org.apache.lucene.index.TestStressIndexing2$CorruptibleIndexingThread.run(TestStressIndexing2.java:1019)
  [beaster]    >        at 
org.apache.lucene.index.TestStressIndexing2.testRandomCorruptionIsTragic(TestStressIndexing2.java:144)
  [beaster]    >        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown 
Source)
  [beaster]    >        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  [beaster]    >        at java.lang.reflect.Method.invoke(Method.java:498)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1750)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:938)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:974)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:988)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:49)
  [beaster]    >        at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:48)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:817)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:468)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:947)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:832)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:883)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:894)
  [beaster]    >        at 
org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:45)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:41)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
  [beaster]    >        at 
org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  [beaster]    >        at 
com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
  [beaster]    >        at java.lang.Thread.run(Thread.java:748)
  [beaster]    >        Suppressed: java.lang.AssertionError
  [beaster]    >                at 
org.apache.lucene.codecs.simpletext.SimpleTextDocValuesReader.getNumericNonIterator(SimpleTextDocValuesReader.java:184)
  [beaster]    >                at 
org.apache.lucene.codecs.simpletext.SimpleTextDocValuesReader.getNumeric(SimpleTextDocValuesReader.java:142)
  [beaster]    >                at 
org.apache.lucene.index.CodecReader.getNumericDocValues(CodecReader.java:137)
  [beaster]    >                at 
org.apache.lucene.index.ReadersAndUpdates$2.getNumeric(ReadersAndUpdates.java:373)
  [beaster]    >                at 
org.apache.lucene.codecs.simpletext.SimpleTextDocValuesWriter.addNumericField(SimpleTextDocValuesWriter.java:88)
  [beaster]    >                at 
org.apache.lucene.index.ReadersAndUpdates.handleDVUpdates(ReadersAndUpdates.java:368)
  [beaster]    >                at 
org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:570)
  [beaster]    >                at 
org.apache.lucene.index.ReaderPool.commit(ReaderPool.java:325)
  [beaster]    >                at 
org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3313)
  [beaster]    >                at 
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3222)
  [beaster]    >                at 
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3451)
  [beaster]    >                at 
org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3416)
  [beaster]    >                at 
org.apache.lucene.index.TestStressIndexing2$CorruptibleIndexingThread.run(TestStressIndexing2.java:1008)
  [beaster]    >                ... 36 more
  [beaster]   2> NOTE: leaving temporary files on disk at: 
/home/hossman/lucene/dev/lucene/build/core/test/J0/temp/lucene.index.TestStressIndexing2_CABE666E4674CFB2-001
  [beaster]   2> NOTE: test params are: codec=SimpleText, 
sim=Asserting(org.apache.lucene.search.similarities.AssertingSimilarity@40b1325d),
 locale=cs, timezone=America/Nipigon
  [beaster]   2> NOTE: Linux 3.19.0-84-generic amd64/Oracle Corporation 
1.8.0_144 (64-bit)/cpus=4,threads=1,free=203261928,total=249561088
  [beaster]   2> NOTE: All tests run in this JVM: [TestStressIndexing2]
  [beaster] 
  [beaster] Tests with failures [seed: CABE666E4674CFB2]:
  [beaster]   - 
org.apache.lucene.index.TestStressIndexing2.testRandomCorruptionIsTragic 
{seed=[CABE666E4674CFB2:682DC0F2BA2A235F]}
{noformat}

...I'm not sure how or why that assertion in SimpleTextDocValuesReader would 
have tripped, let alone whether (or when) an AssertionError should be treated 
as tragic.
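
For anyone reading the trace above: the outer AssertionError is the test's own complaint, and the "Suppressed:" entry is the throwable that actually came out of commit(). Below is a minimal, stdlib-only Java sketch of that reporting shape. It is an assumption about how such a check could be written, not the code in TestStressIndexing2; failingCommit() and recordedTragedy() are hypothetical stand-ins for IndexWriter.commit() and IndexWriter.getTragicException().

```java
public class SuppressedDemo {

    /** Hypothetical stand-in for a commit that trips a bare assertion (no message). */
    static void failingCommit() {
        throw new AssertionError();
    }

    /** Hypothetical stand-in for getTragicException(); null models "nothing recorded". */
    static Throwable recordedTragedy() {
        return null;
    }

    /**
     * Runs the commit and returns a report if it threw without a tragic event
     * being recorded. Returns null when there is nothing to report.
     */
    static AssertionError check() {
        try {
            failingCommit();
            return null;
        } catch (Throwable original) {
            if (recordedTragedy() != null) {
                return null;
            }
            AssertionError report = new AssertionError(
                "index update encountered throwable, but no tragic event recorded: " + original);
            // Attach the real failure so its stack trace shows up under "Suppressed:".
            report.addSuppressed(original);
            return report;
        }
    }

    public static void main(String[] args) {
        AssertionError report = check();
        System.out.println(report.getMessage());
        for (Throwable suppressed : report.getSuppressed()) {
            System.out.println("Suppressed: " + suppressed);
        }
    }
}
```

Because the original throwable has no message, its toString() is just the class name, which is why the reported message ends in "java.lang.AssertionError".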

> IndexWriter.getTragicException() may not reflect all corrupting exceptions 
> (notably: NoSuchFileException)
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-8692
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8692
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Hoss Man
>            Priority: Major
>         Attachments: LUCENE-8692.patch, LUCENE-8692.patch, LUCENE-8692.patch, 
> LUCENE-8692_test.patch
>
>
> Backstory...
> Solr has a "LeaderTragicEventTest" which uses MockDirectoryWrapper's 
> {{corruptFiles}} to introduce corruption into the "leader" node's index and 
> then asserts that this Solr node gives up its leadership of the shard and 
> another replica takes over.
> This can currently fail sporadically (but usually reproducibly - 
> see SOLR-13237) due to the leader not giving up its leadership even after the 
> corruption causes an update/commit to fail.  Solr's leadership code makes 
> this decision after encountering an exception from the IndexWriter based on 
> whether {{IndexWriter.getTragicException()}} is (non-)null.
> ----
> While investigating this, I created an isolated Lucene-Core equivalent test 
> that demonstrates the same basic situation:
> * Gradually cause corruption on an index until (otherwise) valid execution 
> of IW.add() + IW.commit() calls throw an exception to the IW client.
> * Assert that if an exception is thrown to the IW client, 
> {{getTragicException()}} is now non-null.
> It's fairly easy to make my new test fail reproducibly -- in every situation 
> I've seen the underlying exception is a {{NoSuchFileException}} (i.e. the 
> randomly introduced corruption was to delete some file).
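
The invariant described in the two bullets above can be illustrated with a toy model. This is a stdlib-only Java sketch and deliberately not Lucene code: ToyWriter, its file names, and the recordMissingFileAsTragic switch are all hypothetical, introduced only to show what "an exception reached the client, so getTragicException() should be non-null" looks like when it holds and when it does not.

```java
import java.io.IOException;
import java.nio.file.NoSuchFileException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class TragicInvariantSketch {

    /** Hypothetical file names the toy writer needs in order to commit. */
    static final List<String> REQUIRED = Arrays.asList("_0.si", "_0.cfs", "segments_1");

    /** Toy writer (NOT Lucene's IndexWriter): commit fails if a required file is gone. */
    static final class ToyWriter {
        private final Set<String> files;
        private final boolean recordMissingFileAsTragic;
        private Throwable tragedy;

        ToyWriter(Set<String> files, boolean recordMissingFileAsTragic) {
            this.files = files;
            this.recordMissingFileAsTragic = recordMissingFileAsTragic;
        }

        void commit() throws IOException {
            for (String name : REQUIRED) {
                if (!files.contains(name)) {
                    NoSuchFileException e = new NoSuchFileException(name);
                    if (recordMissingFileAsTragic) {
                        tragedy = e; // remember the failure before surfacing it
                    }
                    throw e;
                }
            }
        }

        Throwable getTragicException() {
            return tragedy;
        }
    }

    /**
     * Deletes random files until commit() throws, then reports whether the
     * writer recorded a tragic exception for the failure the client saw.
     */
    static boolean invariantHolds(boolean recordMissingFileAsTragic, long seed) {
        Random random = new Random(seed);
        Set<String> files = new HashSet<>(REQUIRED);
        ToyWriter writer = new ToyWriter(files, recordMissingFileAsTragic);
        while (true) {
            try {
                writer.commit();
            } catch (IOException e) {
                return writer.getTragicException() != null;
            }
            // "Corruption" in this toy model: delete one required file at random.
            files.remove(REQUIRED.get(random.nextInt(REQUIRED.size())));
        }
    }

    public static void main(String[] args) {
        System.out.println("records missing file: " + invariantHolds(true, 42L));
        System.out.println("ignores missing file: " + invariantHolds(false, 42L));
    }
}
```

With the switch off, the client still sees the NoSuchFileException but the toy writer reports no tragedy, which is the shape of failure the real test is meant to catch.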



