[ 
https://issues.apache.org/jira/browse/FLINK-19775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17236026#comment-17236026
 ] 

jiawen xiao edited comment on FLINK-19775 at 11/20/20, 9:41 AM:
----------------------------------------------------------------

[@tillrohrmann,|https://github.com/tillrohrmann] [~dian.fu]  ,Maybe I found the 
reason for the instability. It depends on when the thread A calling the await() 
method acquires the lock of the lock object in the jvm lock pool. According to 
the lock.wait() code in the await() method, we will know that it will cause the 
current thread A to enter the blocking state. At this point, trigger() will 
perform its work in the thread B to which it belongs. Suppose that when thread 
B executes to lock.notifyAll(), thread A will be awakened, but the lock of the 
lock object is not allocated in time and is still blocked. As long as the 
interrupt flag of thread A is true, InterruptedException will be thrown. So I 
thought of two solutions. The first point: Thread B changes the interrupt flag 
of thread A to false before executing lock.notifyAll(). This method can ensure 
that the A thread will not throw InterruptedException after being awakened, but 
how to restore the thread interruption flag for different threads still needs 
to be considered. The second point: Use catch code to catch the exception in 
the A thread, it will automatically restore the thread interruption flag, and 
make the compilation pass. But the second method cannot solve the problem of 
exception throwing.


was (Author: 873925...@qq.com):
[@tillrohrmann,|https://github.com/tillrohrmann] [~dian.fu] [ 
|https://github.com/tillrohrmann] ,Maybe I found the reason for the 
instability. It depends on when the thread A calling the await() method 
acquires the lock of the lock object in the jvm lock pool. According to the 
lock.wait() code in the await() method, we will know that it will cause the 
current thread A to enter the blocking state. At this point, trigger() will 
perform its work in the thread B to which it belongs. Suppose that when thread 
B executes to lock.notifyAll(), thread A will be awakened, but the lock of the 
lock object is not allocated in time and is still blocked. As long as the 
interrupt flag of thread A is true, InterruptedException will be thrown. So I 
thought of two solutions. The first point: Thread B changes the interrupt flag 
of thread A to false before executing lock.notifyAll(). This method can ensure 
that the A thread will not throw InterruptedException after being awakened, but 
how to restore the thread interruption flag for different threads still needs 
to be considered. The second point: Use catch code to catch the exception in 
the A thread, it will automatically restore the thread interruption flag, and 
make the compilation pass. But the second method cannot solve the problem of 
exception throwing.

> SystemProcessingTimeServiceTest.testImmediateShutdown is instable
> -----------------------------------------------------------------
>
>                 Key: FLINK-19775
>                 URL: https://issues.apache.org/jira/browse/FLINK-19775
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>    Affects Versions: 1.11.0
>            Reporter: Dian Fu
>            Assignee: jiawen xiao
>            Priority: Major
>              Labels: pull-request-available, test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=8131&view=logs&j=d89de3df-4600-5585-dadc-9bbc9a5e661c&t=66b5c59a-0094-561d-0e44-b149dfdd586d
> {code}
> 2020-10-22T21:12:54.9462382Z [ERROR] 
> testImmediateShutdown(org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeServiceTest)
>   Time elapsed: 0.009 s  <<< ERROR!
> 2020-10-22T21:12:54.9463024Z java.lang.InterruptedException
> 2020-10-22T21:12:54.9463331Z  at java.lang.Object.wait(Native Method)
> 2020-10-22T21:12:54.9463766Z  at java.lang.Object.wait(Object.java:502)
> 2020-10-22T21:12:54.9464140Z  at 
> org.apache.flink.core.testutils.OneShotLatch.await(OneShotLatch.java:63)
> 2020-10-22T21:12:54.9466014Z  at 
> org.apache.flink.streaming.runtime.tasks.SystemProcessingTimeServiceTest.testImmediateShutdown(SystemProcessingTimeServiceTest.java:154)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to