[
https://issues.apache.org/jira/browse/IGNITE-8400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456600#comment-16456600
]
Aleksey Plekhanov edited comment on IGNITE-8400 at 5/3/18 11:32 AM:
--------------------------------------------------------------------
The node is dropped out of the topology because, in some cases, another node
(the previous one in the ring) sends a message to this node and can't get a
reply within the given failure detection timeout. To solve this, I set the
reconnect count to 2 (this change also disables the failure detection timeout
and sets separate timeouts for each IO method invocation). I also removed
{{sleep()}} in {{checkSegmented}}, since it doesn't affect the test logic but
adds extra delay to the test (with the failure detection timeout disabled, the
test runs longer).
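For readers unfamiliar with the setting, the configuration change described above can be sketched roughly as follows. This is not the actual patch: the class name {{ReconnectCountSketch}}, the method {{configure}} and the instance name are hypothetical, and the test itself uses its own {{SplitTcpDiscoverySpi}} subclass rather than a plain {{TcpDiscoverySpi}}. The sketch assumes Ignite 2.x on the classpath and only builds a configuration without starting a node.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;

/** Illustrative only: hypothetical helper, not code from the IGNITE-8400 patch. */
public class ReconnectCountSketch {
    /** Builds a configuration whose discovery SPI has an explicit reconnect count. */
    static IgniteConfiguration configure(String instanceName) {
        TcpDiscoverySpi spi = new TcpDiscoverySpi();

        // Setting the reconnect count explicitly makes the SPI ignore the
        // failure detection timeout and fall back to its own per-operation
        // timeouts (socket timeout, ack timeout and so on).
        spi.setReconnectCount(2);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setIgniteInstanceName(instanceName); // hypothetical name supplied by the caller
        cfg.setDiscoverySpi(spi);

        return cfg;
    }

    public static void main(String[] args) {
        // Only constructs the configuration; no Ignite node is started here.
        IgniteConfiguration cfg = configure("sketch-node");
        System.out.println(cfg.getDiscoverySpi());
    }
}
```

With the failure detection timeout out of the picture, each discovery IO call waits on its own timeout, which is consistent with the longer test run time mentioned above.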
Looped test runs on TC [1] after this fix no longer contain the {{Grid is in
invalid state}} error. The {{Test has been timed out}} error still occurs
sometimes (it also occurs with the current implementation). I think another
ticket should be filed for the {{Test has been timed out}} error after this
ticket is merged and new test failure statistics are collected.
[1]
https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Cache3&branch_IgniteTests24Java8=pull%2F3930%2Fhead&tab=buildTypeStatusDiv
was (Author: alex_pl):
The node is dropped out of the topology because, in some cases, another node
(the previous one in the ring) can't send a message to this node and get a
reply within the given failure detection timeout. To solve this, I set the
reconnect count to 2 (this change also disables the failure detection timeout
and sets separate timeouts for each IO method invocation). I also removed
{{sleep()}} in {{checkSegmented}}, since it doesn't affect the test logic but
adds extra delay to the test (with the failure detection timeout disabled, the
test runs longer).
Looped test runs on TC [1] after this fix no longer contain the {{Grid is in
invalid state}} error. The {{Test has been timed out}} error still occurs
sometimes (it also occurs with the current implementation). I think another
ticket should be filed for the {{Test has been timed out}} error after this
ticket is merged and new test failure statistics are collected.
[1]
https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Cache3&branch_IgniteTests24Java8=pull%2F3930%2Fhead&tab=buildTypeStatusDiv
> Flaky failure of
> IgniteTopologyValidatorGridSplitCacheTest.testTopologyValidatorWithCacheGroup
> (Grid is in invalid state)
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-8400
> URL: https://issues.apache.org/jira/browse/IGNITE-8400
> Project: Ignite
> Issue Type: Bug
> Reporter: Aleksey Plekhanov
> Assignee: Aleksey Plekhanov
> Priority: Major
> Labels: MakeTeamcityGreenAgain
> Fix For: 2.6
>
>
> The test sometimes fails on TeamCity with the following exception:
> {noformat}
> java.lang.IllegalStateException: Grid is in invalid state to perform this
> operation. It either not started yet or has already being or have stopped
> [igniteInstanceName=cache.IgniteTopologyValidatorGridSplitCacheTest6,
> state=STOPPED]
> {noformat}
> Before this exception, the node is dropped out of the topology by the coordinator:
> {noformat}
> [tcp-disco-msg-worker-#7831%cache.IgniteTopologyValidatorGridSplitCacheTest6%][IgniteCacheTopologySplitAbstractTest$SplitTcpDiscoverySpi]
> Node is out of topology (probably, due to short-time network problems).
> {noformat}