Robert Metzger created FLINK-19658:
--------------------------------------

             Summary: Local recovery and sticky scheduling end-to-end test 
hangs with "Expected to find info here."
                 Key: FLINK-19658
                 URL: https://issues.apache.org/jira/browse/FLINK-19658
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.12.0
            Reporter: Robert Metzger
            Assignee: Robert Metzger


The recent e2e test hangs all seem to be caused by the "Local recovery and 
sticky scheduling end-to-end test".

The job is stuck in a restart loop with this error:
{code}
2020-10-15T13:01:42.4079891Z 2020-10-15 12:54:06,099 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Flat Map -> Sink: Unnamed (1/4) (78a56f7797be1d41b0b1b31a75bd90e1_20ba6b65f97481d5570070de90e4e791_0_1) switched from RUNNING to FAILED on org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot@65b70d8d.
2020-10-15T13:01:42.4080637Z java.lang.NullPointerException: Expected to find info here.
2020-10-15T13:01:42.4081365Z    at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:78) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4082067Z    at org.apache.flink.streaming.tests.StickyAllocationAndLocalRecoveryTestJob$StateCreatingFlatMap.initializeState(StickyAllocationAndLocalRecoveryTestJob.java:343) ~[?:?]
2020-10-15T13:01:42.4083125Z    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:185) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4103820Z    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:167) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4104926Z    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4106020Z    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:107) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4107084Z    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:262) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4108295Z    at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:400) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4109432Z    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$beforeInvoke$0(StreamTask.java:505) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4110458Z    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4111428Z    at org.apache.flink.streaming.runtime.tasks.StreamTask.beforeInvoke(StreamTask.java:501) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4112328Z    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:533) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4113167Z    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4113962Z    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) ~[flink-dist_2.11-1.12-SNAPSHOT.jar:1.12-SNAPSHOT]
2020-10-15T13:01:42.4114434Z    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_265]
{code}
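
For readers unfamiliar with the exception text: the trace shows the NullPointerException being raised by Preconditions.checkNotNull, called from StateCreatingFlatMap.initializeState, so the message "Expected to find info here." is the error message passed to that null check. The sketch below only illustrates that pattern and is not the test job's actual code. The class ExpectedInfoSketch, the method restoreInfo, and the map of restored state are hypothetical names, and the local checkNotNull is a stand-in that mimics Flink's helper so the example runs without Flink on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

public class ExpectedInfoSketch {

    // Stand-in for org.apache.flink.util.Preconditions.checkNotNull(T, String):
    // throws a NullPointerException carrying the message when the reference is null.
    static <T> T checkNotNull(T reference, String errorMessage) {
        if (reference == null) {
            throw new NullPointerException(String.valueOf(errorMessage));
        }
        return reference;
    }

    // Hypothetical restore step: the info for this subtask is assumed to exist
    // in the restored state, so a missing entry is treated as a hard failure.
    static String restoreInfo(Map<Integer, String> restoredState, int subtaskIndex) {
        return checkNotNull(restoredState.get(subtaskIndex), "Expected to find info here.");
    }

    public static void main(String[] args) {
        Map<Integer, String> restoredState = new HashMap<>();
        restoredState.put(0, "allocation-a");

        // Entry present: the value is returned unchanged.
        System.out.println(restoreInfo(restoredState, 0));

        // Entry missing: the null check fails with the message seen in the log.
        try {
            restoreInfo(restoredState, 1);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException: " + e.getMessage());
        }
    }
}
```

If the real job follows a similar pattern, every restart that restores without the expected entry would fail in the same place, which would be consistent with the restart loop described above.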



