[ 
https://issues.apache.org/jira/browse/FLINK-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063501#comment-17063501
 ] 

Bashar Abdul Jawad commented on FLINK-16638:
--------------------------------------------

Hi [~roman_khachatryan], sorry I didn't post the full log message. The error is:
{code:java}
java.lang.IllegalStateException: There is no operator for the state 6463bd1ad519d1e0c283c83f761989c1
        at org.apache.flink.runtime.checkpoint.StateAssignmentOperation.checkStateMappingCompleteness(StateAssignmentOperation.java:567)
        at org.apache.flink.runtime.checkpoint.StateAssignmentOperation.assignStates(StateAssignmentOperation.java:79)
        at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreLatestCheckpointedState(CheckpointCoordinator.java:1078)
        at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.restoreSavepoint(CheckpointCoordinator.java:1143)
        at org.apache.flink.runtime.scheduler.LegacyScheduler.tryRestoreExecutionGraphFromSavepoint(LegacyScheduler.java:237)
        at org.apache.flink.runtime.scheduler.LegacyScheduler.createAndRestoreExecutionGraph(LegacyScheduler.java:196)
        at org.apache.flink.runtime.scheduler.LegacyScheduler.<init>(LegacyScheduler.java:176)
        at org.apache.flink.runtime.scheduler.LegacySchedulerFactory.createInstance(LegacySchedulerFactory.java:70)
        at org.apache.flink.runtime.jobmaster.JobMaster.createScheduler(JobMaster.java:275)
        at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:265)
        at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:98)
        at org.apache.flink.runtime.jobmaster.factories.DefaultJobMasterServiceFactory.createJobMasterService(DefaultJobMasterServiceFactory.java:40)
        at org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:146)
{code}
I still don't see how the code is right. The operator hash
6463bd1ad519d1e0c283c83f761989c1 is in the savepoint, so it is in
_operatorStates_. When I set that hash directly on the operator using
setUidHash, the code does not add it (through a call to
getUserDefinedOperatorIDs) to `allOperatorIDs`, so this condition will always
be true:
{code:java}
if (!allOperatorIDs.contains(operatorGroupStateEntry.getKey())) {
{code}
As a result, the exception above is thrown.
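
To make the behavior concrete, here is a minimal standalone sketch of the check as I understand it. It is not the actual Flink code: `Vertex`, `checkCompleteness`, the `includeUserDefined` flag, and the `generated-hash` value are hypothetical stand-ins, and operator IDs are modeled as plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StateMappingSketch {

    // Hypothetical stand-in for ExecutionJobVertex: holds the generated
    // operator IDs and the IDs set explicitly (e.g. via setUidHash).
    static final class Vertex {
        private final List<String> operatorIDs;
        private final List<String> userDefinedOperatorIDs;

        Vertex(List<String> operatorIDs, List<String> userDefinedOperatorIDs) {
            this.operatorIDs = operatorIDs;
            this.userDefinedOperatorIDs = userDefinedOperatorIDs;
        }

        List<String> getOperatorIDs() { return operatorIDs; }
        List<String> getUserDefinedOperatorIDs() { return userDefinedOperatorIDs; }
    }

    // Simplified model of the completeness check: every state ID found in the
    // savepoint must match some operator ID collected from the job's vertices.
    static void checkCompleteness(
            Set<String> operatorStates, List<Vertex> tasks, boolean includeUserDefined) {
        Set<String> allOperatorIDs = new HashSet<>();
        for (Vertex vertex : tasks) {
            allOperatorIDs.addAll(vertex.getOperatorIDs());
            if (includeUserDefined) {
                // The extension proposed in this issue.
                allOperatorIDs.addAll(vertex.getUserDefinedOperatorIDs());
            }
        }
        for (String stateId : operatorStates) {
            if (!allOperatorIDs.contains(stateId)) {
                throw new IllegalStateException(
                        "There is no operator for the state " + stateId);
            }
        }
    }

    public static void main(String[] args) {
        String savepointHash = "6463bd1ad519d1e0c283c83f761989c1";
        Set<String> operatorStates = Collections.singleton(savepointHash);
        // The operator has a generated ID plus the hash set explicitly.
        List<Vertex> tasks = Collections.singletonList(
                new Vertex(Arrays.asList("generated-hash"), Arrays.asList(savepointHash)));

        try {
            checkCompleteness(operatorStates, tasks, false);
        } catch (IllegalStateException e) {
            System.out.println("Without user-defined IDs: " + e.getMessage());
        }

        checkCompleteness(operatorStates, tasks, true);
        System.out.println("With user-defined IDs: check passes");
    }
}
```

In this model the first call fails because only the generated IDs are collected, while the second call passes once the user-defined IDs are included as well.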

> Flink checkStateMappingCompleteness doesn't include UserDefinedOperatorIDs
> --------------------------------------------------------------------------
>
>                 Key: FLINK-16638
>                 URL: https://issues.apache.org/jira/browse/FLINK-16638
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.10.0
>            Reporter: Bashar Abdul Jawad
>            Priority: Critical
>
> [StateAssignmentOperation.checkStateMappingCompleteness|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/StateAssignmentOperation.java#L555]
>  doesn't check for UserDefinedOperatorIDs (specified using setUidHash), 
> causing the exception:
> {code}
>  java.lang.IllegalStateException: There is no operator for the state {}
> {code}
> to be thrown when a savepoint can't be mapped to an ExecutionJobVertex, even 
> when the operator hash is explicitly specified.
> I believe this logic should be extended to also include 
> UserDefinedOperatorIDs, like so:
> {code:java}
> for (ExecutionJobVertex executionJobVertex : tasks) {
>   allOperatorIDs.addAll(executionJobVertex.getOperatorIDs());
>   allOperatorIDs.addAll(executionJobVertex.getUserDefinedOperatorIDs());
> }
> {code}



