On Tue, Mar 16, 2021 at 02:32:54AM +0000, Alexey Trenikhun wrote:
> Hi Roman,
> I took thread dump:
> "Source: digital-itx-eastus2 -> Filter (6/6)#0" Id=200 BLOCKED on
> java.lang.Object@5366a0e2 owned by "Legacy Source Thread - Source:
> digital-itx-eastus2 -> Filter (6/6)#0" Id=202
>     at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:92)
>     - blocked on java.lang.Object@5366a0e2
>     at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
>     at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317)
>     at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:189)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:617)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:581)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:755)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:570)
>
> "Legacy Source Thread - Source: digital-itx-eastus2 -> Filter (6/6)#0" Id=202
> WAITING on java.util.concurrent.CompletableFuture$Signaller@6915c7ef
>     at sun.misc.Unsafe.park(Native Method)
>     - waiting on java.util.concurrent.CompletableFuture$Signaller@6915c7ef
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
>     at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>     at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>     at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegmentBlocking(LocalBufferPool.java:319)
>     at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBuilderBlocking(LocalBufferPool.java:291)
>
> Is it checkpoint lock? Is checkpoint lock per task or per TM? I see multiple
> threads in SynchronizedStreamTaskActionExecutor.runThrowing blocked on
> different Objects.

Hi,

This call stack is similar to our case as described in [0]. Maybe they are
the same issue?

[0] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/How-to-debug-checkpoint-savepoint-stuck-in-Flink-1-12-2-td42103.html

--
ChangZhuo Chen (陳昌倬) czchen@{czchen,debian}.org
http://czchen.info/
Key fingerprint = BA04 346D C2E1 FE63 C790 8793 CC65 B0CD EC27 5D5B