771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:52:13.294 [Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 1872 @ 1574092333218 for job 5ec264a20bb5005cdbd8e23a5e59f136.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:52:37.260 [flink-akka.actor.default-dispatcher-30] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 1872 for job 5ec264a20bb5005cdbd8e23a5e59f136 (568048140 bytes in 23541 ms).
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:13.314 [Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 1873 @ 1574092393218 for job 5ec264a20bb5005cdbd8e23a5e59f136.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.279 [flink-akka.actor.default-dispatcher-40] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job bureau-user-offers-statistics-AUTORU-USERS_AUTORU (5ec264a20bb5005cdbd8e23a5e59f136) switched from state RUNNING to CANCELLING.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.279 [flink-akka.actor.default-dispatcher-40] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom File Source (1/1) (934d89cf3d7999b40225dd8009b5493c) switched from RUNNING to CANCELING.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.280 [flink-akka.actor.default-dispatcher-40] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: kafka-source-moderation-update-journal-autoru -> Filter -> Flat Map (1/2) (47656a3c4fc70e19622acca31267e41f) switched from RUNNING to CANCELING.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.280 [flink-akka.actor.default-dispatcher-40] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: kafka-source-moderation-update-journal-autoru -> Filter -> Flat Map (2/2) (be3c4562e65d3d6bdfda4f1632017c6c) switched from RUNNING to CANCELING.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.280 [flink-akka.actor.default-dispatcher-40] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - user-offers-statistics-init-from-file -> Map (1/2) (4a45ed43b05e4d444e190a44b33514ac) switched from RUNNING to CANCELING.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.280 [flink-akka.actor.default-dispatcher-40] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - user-offers-statistics-init-from-file -> Map (2/2) (bb3be311c5e53abaedb06b4d0148c23f) switched from RUNNING to CANCELING.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.280 [flink-akka.actor.default-dispatcher-40] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Keyed Reduce -> Map -> Sink: user-offers-statistics-autoru (1/2) (cfb291033df3f19c9745a6f2fd25e037) switched from RUNNING to CANCELING.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.280 [flink-akka.actor.default-dispatcher-40] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Keyed Reduce -> Map -> Sink: user-offers-statistics-autoru (2/2) (9ce7cd66199513fa97ac44d7617f0c83) switched from RUNNING to CANCELING.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.299 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom File Source (1/1) (934d89cf3d7999b40225dd8009b5493c) switched from CANCELING to CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.300 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: kafka-source-moderation-update-journal-autoru -> Filter -> Flat Map (1/2) (47656a3c4fc70e19622acca31267e41f) switched from CANCELING to CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.300 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: kafka-source-moderation-update-journal-autoru -> Filter -> Flat Map (2/2) (be3c4562e65d3d6bdfda4f1632017c6c) switched from CANCELING to CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.344 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - user-offers-statistics-init-from-file -> Map (2/2) (bb3be311c5e53abaedb06b4d0148c23f) switched from CANCELING to CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.345 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - user-offers-statistics-init-from-file -> Map (1/2) (4a45ed43b05e4d444e190a44b33514ac) switched from CANCELING to CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.706 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Keyed Reduce -> Map -> Sink: user-offers-statistics-autoru (1/2) (cfb291033df3f19c9745a6f2fd25e037) switched from CANCELING to CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.714 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Keyed Reduce -> Map -> Sink: user-offers-statistics-autoru (2/2) (9ce7cd66199513fa97ac44d7617f0c83) switched from CANCELING to CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.714 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job bureau-user-offers-statistics-AUTORU-USERS_AUTORU (5ec264a20bb5005cdbd8e23a5e59f136) switched from state CANCELLING to CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.714 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping checkpoint coordinator for job 5ec264a20bb5005cdbd8e23a5e59f136.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.714 [flink-akka.actor.default-dispatcher-2] INFO o.a.f.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Shutting down
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.966 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.zookeeper.ZooKeeperStateHandleStore - Removing /moderation-flink/testing/checkpoints/5ec264a20bb5005cdbd8e23a5e59f136 from ZooKeeper
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:19.966 [cluster-io-thread-6] INFO org.apache.flink.runtime.checkpoint.CompletedCheckpoint - Checkpoint with ID 1872 at 's3://misc/moderation-flink/flink-checkpoints/5ec264a20bb5005cdbd8e23a5e59f136/chk-1872' not discarded.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:20.044 [flink-akka.actor.default-dispatcher-2] INFO o.a.flink.runtime.checkpoint.ZooKeeperCheckpointIDCounter - Shutting down.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:20.045 [flink-akka.actor.default-dispatcher-2] INFO o.a.flink.runtime.checkpoint.ZooKeeperCheckpointIDCounter - Removing /checkpoint-counter/5ec264a20bb5005cdbd8e23a5e59f136 from ZooKeeper
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:53:20.259 [flink-akka.actor.default-dispatcher-2] INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Job 5ec264a20bb5005cdbd8e23a5e59f136 reached globally terminal state CANCELED.
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:54:24.085 [flink-akka.actor.default-dispatcher-31] INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Releasing idle slot [0553df66161f5d78f4b41d8c8c32c21f].
771a4992-d694-d2a4-b49a-d4eb382086e5 2019-11-18 18:54:24.085 [flink-akka.actor.default-dispatcher-31] INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Releasing idle slot [498b9bf0c0f2188ff739d72e6df288dc].
If everything is OK (that is, your config options for the archive directory and the history server are correct), Flink should archive the completed job. You said you did not find any exceptions in the log about failing to archive, but are there any other exceptions? Can you share the logs for your scenario?

Best,
Vino

Pavel Potseluev <potsel...@yandex-team.ru> wrote on Thu, Nov 21, 2019 at 2:25 AM:

Hi all,

We occasionally see that Flink doesn't save information about a canceled job to the archive directory (configured by the jobmanager.archive.fs.dir property), and there are no exceptions in the log about archiving failing. This is a problem in our use case because our script for deploying jobs relies on the Flink history server to find the latest checkpoint for a job. Does Flink guarantee saving data to the archive? If so, any ideas why it sometimes doesn't work? The Flink version is 1.8.0.

--
Best regards,
Pavel Potseluev
Software developer, Yandex.Classifieds LLC
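
As a possible stopgap while the archiving issue is investigated, the deploy script could read the retained checkpoint path straight from the JobManager log instead of the history server. The sketch below is only an illustration: it assumes the "Checkpoint with ID ... not discarded" message format seen in the log in this thread (Flink 1.8.0), which may differ in other versions, and the function name latest_retained_checkpoint is hypothetical, not part of any Flink API.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumption: message format copied from the CompletedCheckpoint line in the
# log in this thread (Flink 1.8.0); other versions may word it differently.
RETAINED = re.compile(
    r"Checkpoint with ID (?P<id>\d+) at '(?P<path>[^']+)' not discarded"
)


def latest_retained_checkpoint(
    lines: Iterable[str], job_id: str
) -> Optional[Tuple[int, str]]:
    """Return (checkpoint_id, path) of the highest-numbered retained
    checkpoint whose path contains job_id, or None if there is none."""
    best: Optional[Tuple[int, str]] = None
    for line in lines:
        match = RETAINED.search(line)
        if match is None or job_id not in match.group("path"):
            continue
        candidate = (int(match.group("id")), match.group("path"))
        if best is None or candidate[0] > best[0]:
            best = candidate
    return best


# Two lines taken from the log in this thread (prefixes shortened).
sample_log = [
    "INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - "
    "Completed checkpoint 1872 for job 5ec264a20bb5005cdbd8e23a5e59f136 "
    "(568048140 bytes in 23541 ms).",
    "INFO org.apache.flink.runtime.checkpoint.CompletedCheckpoint - "
    "Checkpoint with ID 1872 at 's3://misc/moderation-flink/flink-checkpoints/"
    "5ec264a20bb5005cdbd8e23a5e59f136/chk-1872' not discarded.",
]

result = latest_retained_checkpoint(
    sample_log, "5ec264a20bb5005cdbd8e23a5e59f136"
)
print(result)
```

This only works if the JobManager log is available to the deploy script and the checkpoint was retained on cancellation, so it is a workaround rather than a replacement for the archive.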