[ https://issues.apache.org/jira/browse/FLINK-18071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128338#comment-17128338 ]
Yun Gao commented on FLINK-18071:
---------------------------------

Hi [~sewen],

I also debugged this case today. It seems to come from a race between the timer thread that sends events and the thread that resets the buffered events. The timer thread may have scheduled an event for sending, but before the send finishes, a failover happens and another thread calls resetForTask, which clears the buffered events. The first thread is then woken up later and continues to send the outdated event, which is inserted into the valve buffers and causes the repeated element.

> CoordinatorEventsExactlyOnceITCase.checkListContainsSequence fails on CI
> ------------------------------------------------------------------------
>
> Key: FLINK-18071
> URL: https://issues.apache.org/jira/browse/FLINK-18071
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination, Tests
> Affects Versions: 1.11.0, 1.12.0
> Reporter: Robert Metzger
> Assignee: Yun Gao
> Priority: Critical
> Labels: test-stability
> Fix For: 1.11.0, 1.12.0
>
>
> CI:
> https://dev.azure.com/georgeryan1322/Flink/_build/results?buildId=330&view=logs&j=6e58d712-c5cc-52fb-0895-6ff7bd56c46b&t=f30a8e80-b2cf-535c-9952-7f521a4ae374
> {code}
> [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 8.795
> s <<< FAILURE! - in
> org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase
> [ERROR]
> test(org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase)
> Time elapsed: 4.647 s <<< FAILURE!
> java.lang.AssertionError: List did not contain expected sequence of 200
> elements, but was: [152, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
> 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
> 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
> 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
> 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
> 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107,
> 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122,
> 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137,
> 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152,
> 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167,
> 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182,
> 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197,
> 198, 199]
> at org.junit.Assert.fail(Assert.java:88)
> at
> org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase.failList(CoordinatorEventsExactlyOnceITCase.java:160)
> at
> org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase.checkListContainsSequence(CoordinatorEventsExactlyOnceITCase.java:148)
> at
> org.apache.flink.runtime.operators.coordination.CoordinatorEventsExactlyOnceITCase.test(CoordinatorEventsExactlyOnceITCase.java:143)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> {code}
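
The race described in the comment (a timer thread finishing a send after resetForTask has cleared the buffered events) may be easier to follow as a small model. The sketch below is not Flink code: `StaleSendSketch`, `EventBuffer`, `sendIfCurrent` and the epoch counter are hypothetical names introduced only for illustration, and the epoch check is one possible guard, not necessarily the fix applied for this issue. The interleaving is replayed step by step on a single thread so the result is deterministic, and it assumes that events are re-sent starting from 0 after the failover.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the race; none of these names are Flink classes. */
public class StaleSendSketch {

    /** Stand-in for the buffered events that a reset clears. */
    static final class EventBuffer {
        private final List<Integer> events = new ArrayList<>();
        private long epoch = 0; // incremented on every reset

        synchronized long currentEpoch() {
            return epoch;
        }

        /** Unguarded send: appends regardless of when the send was scheduled. */
        synchronized void send(int event) {
            events.add(event);
        }

        /** Guarded send: drops an event that was scheduled before the latest reset. */
        synchronized boolean sendIfCurrent(int event, long scheduledEpoch) {
            if (scheduledEpoch != epoch) {
                return false; // stale: scheduled before the failover
            }
            events.add(event);
            return true;
        }

        /** Models the reset on failover: clear the buffer and start a new epoch. */
        synchronized void reset() {
            events.clear();
            epoch++;
        }

        synchronized List<Integer> snapshot() {
            return new ArrayList<>(events);
        }
    }

    /** Replays the interleaving from the comment, one step at a time. */
    public static List<Integer> replay(boolean guarded) {
        EventBuffer buffer = new EventBuffer();

        // 1. Timer thread picks event 152 and is suspended before the send completes.
        int inFlight = 152;
        long scheduledEpoch = buffer.currentEpoch();

        // 2. Failover: another thread resets the buffered events.
        buffer.reset();

        // 3. Timer thread wakes up and finishes the send it had started.
        if (guarded) {
            buffer.sendIfCurrent(inFlight, scheduledEpoch);
        } else {
            buffer.send(inFlight);
        }

        // 4. After recovery, events are sent again from the start (assumed to be 0).
        for (int i = 0; i < 5; i++) {
            buffer.sendIfCurrent(i, buffer.currentEpoch());
        }
        return buffer.snapshot();
    }

    public static void main(String[] args) {
        System.out.println("unguarded: " + replay(false)); // [152, 0, 1, 2, 3, 4]
        System.out.println("guarded:   " + replay(true));  // [0, 1, 2, 3, 4]
    }
}
```

In the unguarded run the stale event lands ahead of the replayed sequence, which has the same shape as the `[152, 0, 1, 2, ...]` list in the assertion message. With the guard, the outdated send is dropped because it was scheduled in an earlier epoch.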