Re: Weird operator ID check restore from checkpoint fails

2021-12-07 Thread Dan Hill
Thanks!
On Tue, Dec 7, 2021, 22:55 Robert Metzger wrote:
> 811d3b279c8b26ed99ff0883b7630242 is the operator id.
> If I'm not mistaken, running the job graph generation (e.g. the main
> method) in DEBUG log level will show you all the IDs generated. This should
> help you map this ID to your code

Re: Weird operator ID check restore from checkpoint fails

2021-12-07 Thread Robert Metzger
811d3b279c8b26ed99ff0883b7630242 is the operator id. If I'm not mistaken, running the job graph generation (e.g. the main method) in DEBUG log level will show you all the IDs generated. This should help you map this ID to your code.
On Wed, Dec 8, 2021 at 7:52 AM Dan Hill wrote:
> Nothing change

Re: Weird operator ID check restore from checkpoint fails

2021-12-07 Thread Dan Hill
Nothing changed (as far as I know). It's the same binary and the same args. It's Flink v1.12.3. I'm going to switch away from auto-gen uids and see if that helps. The job randomly started failing to checkpoint. I cancelled the job and started it from the last successful checkpoint. I'm confus

Re: Weird operator ID check restore from checkpoint fails

2021-12-07 Thread Robert Metzger
Hi Dan, When restoring a savepoint/checkpoint, Flink matches the state to the operators based on the ID (uid) of each operator. The exception says that there is some state that doesn't match any operator. So from Flink's perspective, the operator is gone. Here is more information: https://nightlie
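
The matching rule described above can be illustrated with a minimal sketch in plain Java. This is a simplified model for illustration only, not Flink's actual restore code: the class `StateMatcher`, the method `unmatchedState`, and the second operator ID are hypothetical. It assumes checkpointed state is keyed by operator ID and reports any state whose ID no operator in the new job graph claims.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of ID-based state matching; not Flink's implementation.
class StateMatcher {

    /** Returns the IDs of checkpointed state that no operator in the job graph claims. */
    static Set<String> unmatchedState(Map<String, byte[]> checkpointState,
                                      Set<String> jobOperatorIds) {
        // Start from every ID that has state in the checkpoint...
        Set<String> orphaned = new LinkedHashSet<>(checkpointState.keySet());
        // ...and drop the ones that an operator in the new job graph still carries.
        orphaned.removeAll(jobOperatorIds);
        return orphaned;
    }

    public static void main(String[] args) {
        // State recorded under the ID from the warning in this thread.
        Map<String, byte[]> checkpointState =
                Map.of("811d3b279c8b26ed99ff0883b7630242", new byte[0]);
        // Made-up ID standing in for an operator whose generated ID differs.
        Set<String> jobOperatorIds = Set.of("0123456789abcdef0123456789abcdef");

        Set<String> orphaned = unmatchedState(checkpointState, jobOperatorIds);
        if (!orphaned.isEmpty()) {
            System.out.println("State with no matching operator: " + orphaned);
        }
    }
}
```

In this model, a restore only succeeds cleanly when the set returned is empty. In the DataStream API, an explicit ID can be assigned by calling `uid(String)` on an operator, which keeps the ID stable instead of relying on an auto-generated one.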

Re: Weird operator ID check restore from checkpoint fails

2021-12-07 Thread Dan Hill
I'm restoring the job with the same binary and same flags/args.
On Tue, Dec 7, 2021 at 8:48 PM Dan Hill wrote:
> Hi. I noticed this warning has "operator
> 811d3b279c8b26ed99ff0883b7630242" in it. I assume this should be an
> operator uid or name. It looks like something else. What is it? I

Weird operator ID check restore from checkpoint fails

2021-12-07 Thread Dan Hill
Hi. I noticed this warning has "operator 811d3b279c8b26ed99ff0883b7630242" in it. I assume this should be an operator uid or name. It looks like something else. What is it? Is something corrupted?
org.apache.flink.runtime.client.JobInitializationException: Could not instantiate JobManager.