Hi Julio,

This might be a bug in the job stats. Could you please create an issue in Jira describing the steps you took, and attach the complete logs?

Best,
Andrey

> On 2 Oct 2018, at 21:11, Julio Biason <julio.bia...@azion.com> wrote:
>
> Oh, another piece of information:
>
> Because the job was failing and restarting, I did a cancel via the CLI tool
> during one of the restarts.
>
> On Tue, Oct 2, 2018 at 4:03 PM, Julio Biason <julio.bia...@azion.com> wrote:
> Hello,
>
> I had a job that was failing -- a bug in our code -- so I decided to cancel
> it and deploy the fix. Because I couldn't create a savepoint due to the job
> restarting, I decided to kill it anyway and use the web interface to get the
> last successful checkpoint.
>
> The problem is: the interface is not showing anything for the job. The
> details page shows nothing, not even the pipeline.
>
> The only thing that seems related in the JobManager logs is this:
>
> 2018-10-02 19:03:14,214 [flink-akka.actor.default-dispatcher-4158] ERROR org.apache.flink.runtime.rest.handler.job.JobDetailsHandler - Implementation error: Unhandled exception.
> java.lang.IllegalArgumentException: Negative number of in progress checkpoints
>     at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:139)
>     at org.apache.flink.runtime.checkpoint.CheckpointStatsCounts.<init>(CheckpointStatsCounts.java:72)
>     at org.apache.flink.runtime.checkpoint.CheckpointStatsCounts.createSnapshot(CheckpointStatsCounts.java:177)
>     at org.apache.flink.runtime.checkpoint.CheckpointStatsTracker.createSnapshot(CheckpointStatsTracker.java:166)
>     at org.apache.flink.runtime.executiongraph.ExecutionGraph.getCheckpointStatsSnapshot(ExecutionGraph.java:553)
>     at org.apache.flink.runtime.executiongraph.ArchivedExecutionGraph.createFrom(ArchivedExecutionGraph.java:340)
>     at org.apache.flink.runtime.jobmaster.JobMaster.requestJob(JobMaster.java:923)
>     at sun.reflect.GeneratedMethodAccessor101.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162)
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
>     at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>     at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:495)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:224)
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> --
> Julio Biason, Software Engineer
> AZION | Deliver. Accelerate. Protect.
> Office: +55 51 3083 8101 | Mobile: +55 51 99907 0554
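
For reference, the top frames of the trace show the exception coming from an argument check in the `CheckpointStatsCounts` constructor, reached while a snapshot of the checkpoint stats is created for the job details request. The following is only a minimal sketch of that kind of failure, not Flink's actual code: the class names `Counts` and `CountsSnapshot`, the helper `trySnapshot`, and the idea that the counter was decremented once too often are all assumptions made for illustration.

```java
public class InProgressCounterSketch {

    // Hypothetical stand-in for a precondition helper: throws if the condition is false.
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    // Hypothetical immutable snapshot that validates its value on construction.
    static final class CountsSnapshot {
        final int inProgress;

        CountsSnapshot(int inProgress) {
            checkArgument(inProgress >= 0, "Negative number of in progress checkpoints");
            this.inProgress = inProgress;
        }
    }

    // Hypothetical mutable counter; nothing here prevents unbalanced updates.
    static final class Counts {
        private int inProgress;

        void started() {
            inProgress++;
        }

        void finished() {
            inProgress--;
        }

        CountsSnapshot createSnapshot() {
            return new CountsSnapshot(inProgress);
        }
    }

    // Attempts a snapshot and reports the outcome as a string instead of throwing.
    static String trySnapshot(Counts counts) {
        try {
            return "ok: " + counts.createSnapshot().inProgress;
        } catch (IllegalArgumentException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        Counts counts = new Counts();
        counts.started();
        counts.finished();
        System.out.println(trySnapshot(counts)); // balanced updates, snapshot succeeds

        counts.finished(); // assumed unbalanced update, e.g. reported twice
        System.out.println(trySnapshot(counts)); // snapshot creation now fails
    }
}
```

In this sketch every later snapshot attempt fails once the counter is negative, which would be consistent with a details page that stays empty, but whether that is what happened here needs the complete logs to confirm.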