[ https://issues.apache.org/jira/browse/IGNITE-24273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kirill Tkalenko updated IGNITE-24273:
-------------------------------------
    Description: 
We need to fix the flaky test
*org.apache.ignite.raft.jraft.core.ItNodeTest#testInstallSnapshot*.

Link to 
[TC|https://ci.ignite.apache.org/viewLog.html?buildId=8794190&buildTypeId=ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRaft&tab=buildResultsDiv].

{noformat}
org.opentest4j.AssertionFailedError: expected: <2> but was: <1>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145)
        at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531)
        at app//org.apache.ignite.raft.jraft.core.ItNodeTest.triggerLeaderSnapshot(ItNodeTest.java:4635)
        at app//org.apache.ignite.raft.jraft.core.ItNodeTest.testInstallSnapshot(ItNodeTest.java:2274)
        at java.base@17.0.6/java.lang.reflect.Method.invoke(Method.java:568)
        at java.base@17.0.6/java.util.ArrayList.forEach(ArrayList.java:1511)
        at java.base@17.0.6/java.util.ArrayList.forEach(ArrayList.java:1511)
{noformat}

The problem is that between snapshot creations we execute 10 more tasks that 
should increase the last applied index. Because of a race, the index may not 
have been updated yet when the second snapshot is requested, so the test 
needs to be fixed.

A little more detail on where the race occurs. 
Between snapshot creations, 
*org.apache.ignite.raft.jraft.core.ItNodeTest#sendTestTaskAndWait* is called, 
and the latch for all of its tasks is completed in the state machine. Since 
this happens in another thread, the test thread can resume and request the 
creation of a snapshot before the last applied index in 
*org.apache.ignite.raft.jraft.core.FSMCallerImpl* is updated. At that moment 
the request does not yet see that the state machine has changed and that a 
snapshot is needed, so the creation of the snapshot is skipped.
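
To make the ordering problem concrete, here is a minimal stand-alone sketch of the race and of one possible way to close it: after the latch is released, wait until the last applied index has caught up before requesting the next snapshot. Everything in it is hypothetical and only models the description above: *FakeFsmCaller*, *waitForCondition*, *runScenario* and the 50 ms delay are illustration names and values, not the real ItNodeTest or FSMCallerImpl code, and the actual fix may look different.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-alone model of the race; not the real ItNodeTest or FSMCallerImpl code. */
final class SnapshotRaceSketch {
    /** Stand-in for the state machine side: the latch is released before the index is published. */
    static final class FakeFsmCaller {
        final AtomicLong lastAppliedIndex = new AtomicLong();

        void applyAsync(long upToIndex, CountDownLatch tasksDone) {
            Thread applier = new Thread(() -> {
                tasksDone.countDown();           // the test thread may resume here...
                sleepQuietly(50);                // ...while the index update is still pending
                lastAppliedIndex.set(upToIndex);
            });
            applier.setDaemon(true);
            applier.start();
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Polls the condition until it holds or the timeout expires; returns the final result. */
    static boolean waitForCondition(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return condition.getAsBoolean();
            }
            sleepQuietly(10);
        }
        return true;
    }

    /** Applies 10 tasks, waits for the latch, then waits for the index before "snapshotting". */
    static long runScenario() {
        FakeFsmCaller fsm = new FakeFsmCaller();
        CountDownLatch tasksDone = new CountDownLatch(1);
        long expectedIndex = 10;

        fsm.applyAsync(expectedIndex, tasksDone);

        try {
            tasksDone.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        }

        // Without this wait the index may still be 0 here, and a snapshot request would be skipped.
        if (!waitForCondition(() -> fsm.lastAppliedIndex.get() >= expectedIndex, 5_000)) {
            throw new IllegalStateException("Last applied index did not catch up in time");
        }
        return fsm.lastAppliedIndex.get();
    }

    public static void main(String[] args) {
        System.out.println("last applied index before snapshot: " + runScenario());
    }
}
```

In this model the latch alone is not a sufficient synchronization point, because it is counted down before the index is published. Polling the index with a timeout is one simple option; making the latch fire only after the index update would be another.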


  was:
We need to fix the flaky test
*org.apache.ignite.raft.jraft.core.ItNodeTest#testInstallSnapshot*.

Link to 
[TC|https://ci.ignite.apache.org/viewLog.html?buildId=8794190&buildTypeId=ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRaft&tab=buildResultsDiv].

The problem is that between snapshot creations we execute 10 more tasks that 
should increase the last applied index. Because of a race, the index may not 
have been updated yet when the second snapshot is requested, so the test 
needs to be fixed.

A little more detail on where the race occurs. 
Between snapshot creations, 
*org.apache.ignite.raft.jraft.core.ItNodeTest#sendTestTaskAndWait* is called, 
and the latch for all of its tasks is completed in the state machine. Since 
this happens in another thread, the test thread can resume and request the 
creation of a snapshot before the last applied index in 
*org.apache.ignite.raft.jraft.core.FSMCallerImpl* is updated. At that moment 
the request does not yet see that the state machine has changed and that a 
snapshot is needed, so the creation of the snapshot is skipped.

{noformat}
org.opentest4j.AssertionFailedError: expected: <2> but was: <1>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
        at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145)
        at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531)
        at app//org.apache.ignite.raft.jraft.core.ItNodeTest.triggerLeaderSnapshot(ItNodeTest.java:4635)
        at app//org.apache.ignite.raft.jraft.core.ItNodeTest.testInstallSnapshot(ItNodeTest.java:2274)
        at java.base@17.0.6/java.lang.reflect.Method.invoke(Method.java:568)
        at java.base@17.0.6/java.util.ArrayList.forEach(ArrayList.java:1511)
        at java.base@17.0.6/java.util.ArrayList.forEach(ArrayList.java:1511)
{noformat}


> Fix flaky ItNodeTest#testInstallSnapshot
> ----------------------------------------
>
>                 Key: IGNITE-24273
>                 URL: https://issues.apache.org/jira/browse/IGNITE-24273
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Kirill Tkalenko
>            Assignee: Kirill Tkalenko
>            Priority: Major
>              Labels: ignite-3
>
> We need to fix the flaky test
> *org.apache.ignite.raft.jraft.core.ItNodeTest#testInstallSnapshot*.
> Link to 
> [TC|https://ci.ignite.apache.org/viewLog.html?buildId=8794190&buildTypeId=ApacheIgnite3xGradle_Test_IntegrationTests_ModuleRaft&tab=buildResultsDiv].
> {noformat}
> org.opentest4j.AssertionFailedError: expected: <2> but was: <1>
>       at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
>       at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
>       at app//org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
>       at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:150)
>       at app//org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:145)
>       at app//org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:531)
>       at app//org.apache.ignite.raft.jraft.core.ItNodeTest.triggerLeaderSnapshot(ItNodeTest.java:4635)
>       at app//org.apache.ignite.raft.jraft.core.ItNodeTest.testInstallSnapshot(ItNodeTest.java:2274)
>       at java.base@17.0.6/java.lang.reflect.Method.invoke(Method.java:568)
>       at java.base@17.0.6/java.util.ArrayList.forEach(ArrayList.java:1511)
>       at java.base@17.0.6/java.util.ArrayList.forEach(ArrayList.java:1511)
> {noformat}
> The problem is that between snapshot creations we execute 10 more tasks that 
> should increase the last applied index. Because of a race, the index may not 
> have been updated yet when the second snapshot is requested, so the test 
> needs to be fixed.
> A little more detail on where the race occurs. 
> Between snapshot creations, 
> *org.apache.ignite.raft.jraft.core.ItNodeTest#sendTestTaskAndWait* is called, 
> and the latch for all of its tasks is completed in the state machine. Since 
> this happens in another thread, the test thread can resume and request the 
> creation of a snapshot before the last applied index in 
> *org.apache.ignite.raft.jraft.core.FSMCallerImpl* is updated. At that moment 
> the request does not yet see that the state machine has changed and that a 
> snapshot is needed, so the creation of the snapshot is skipped.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
