I enabled INFO logging, and it seems to be stuck.
==> /mnt/ephemeral/logs/flink-flink-jobmanager-0-vpc2w2-rep-stage-flink1.log <==
2017-06-01 12:45:18,229 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 1 @ 1496321118221

==> /mnt/ephemeral/logs/flink-flink-taskmanager-0-vpc2w2-rep-stage-flink1.log <==
2017-06-01 12:45:18,237 INFO org.apache.flink.core.fs.FileSystem - Created new CloseableRegistry org.apache.flink.core.fs.SafetyNetCloseableRegistry@79e68dd3 for Async calls on Source: Custom Source (2/12)
2017-06-01 12:45:18,237 INFO org.apache.flink.core.fs.FileSystem - Created new CloseableRegistry org.apache.flink.core.fs.SafetyNetCloseableRegistry@78da1e82 for Async calls on Source: Custom Source (5/12)
2017-06-01 12:45:18,238 INFO org.apache.flink.core.fs.FileSystem - Created new CloseableRegistry org.apache.flink.core.fs.SafetyNetCloseableRegistry@68bff79e for Async calls on Source: Custom Source (8/12)
2017-06-01 12:45:18,238 INFO org.apache.flink.core.fs.FileSystem - Created new CloseableRegistry org.apache.flink.core.fs.SafetyNetCloseableRegistry@600bdc29 for Async calls on Source: Custom Source (11/12)
2017-06-01 12:45:24,853 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumertest ::
2017-06-01 12:45:24,853 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumertest ::
2017-06-01 12:45:24,853 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumertest ::
2017-06-01 12:45:24,854 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumertest ::
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, host
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 512
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 1024
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 20
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.preallocate, false
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 4
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.web.port, 8081
2017-06-01 12:45:24,859 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: state.backend, filesystem
2017-06-01 12:45:24,860 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.network.numberOfBuffers, 2048
2017-06-01 12:45:24,860 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.tmp.dirs, /mnt/ephemeral/tmp
2017-06-01 12:45:24,860 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: fs.hdfs.hadoopconf, /opt/hadoop-config
2017-06-01 12:45:24,862 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: yarn.application-attempts, 10
2017-06-01 12:45:24,863 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability, zookeeper
2017-06-01 12:45:24,863 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.quorum, host1:2181,host2:2181
2017-06-01 12:45:24,863 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.storageDir, s3://somelocation/ha-recovery/
2017-06-01 12:45:24,863 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.path.root, /flink-y
2017-06-01 12:45:24,863 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: zookeeper.sasl.disable, true
2017-06-01 12:45:24,863 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 12288
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, host
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 512
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 1024
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 20
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.preallocate, false
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 4
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.web.port, 8081
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: state.backend, filesystem
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.network.numberOfBuffers, 2048
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.tmp.dirs, /mnt/ephemeral/tmp
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: fs.hdfs.hadoopconf, /opt/hadoop-config
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: yarn.application-attempts, 10
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability, zookeeper
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.quorum, host1:2181,host2:2181
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.storageDir, s3://somelocation/ha-recovery/
2017-06-01 12:45:24,895 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.path.root, /flink-y
2017-06-01 12:45:24,896 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: zookeeper.sasl.disable, true
2017-06-01 12:45:24,896 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 12288
2017-06-01 12:45:24,902 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumer ::
2017-06-01 12:45:24,905 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumer ::
2017-06-01 12:45:24,909 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, host
2017-06-01 12:45:24,909 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 512
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 1024
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 20
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.preallocate, false
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 4
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.web.port, 8081
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: state.backend, filesystem
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.network.numberOfBuffers, 2048
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.tmp.dirs, /mnt/ephemeral/tmp
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: fs.hdfs.hadoopconf, /opt/hadoop-config
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: yarn.application-attempts, 10
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability, zookeeper
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.quorum, host1:2181,host2:2181
2017-06-01 12:45:24,910 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.storageDir, s3://somelocation/ha-recovery/
2017-06-01 12:45:24,911 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.path.root, /flink-y
2017-06-01 12:45:24,911 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: zookeeper.sasl.disable, true
2017-06-01 12:45:24,911 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 12288
2017-06-01 12:45:24,915 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumer ::
2017-06-01 12:45:24,916 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumer ::
2017-06-01 12:45:24,923 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, host
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 512
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 1024
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 20
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.preallocate, false
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 4
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.web.port, 8081
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: state.backend, filesystem
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.network.numberOfBuffers, 2048
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.tmp.dirs, /mnt/ephemeral/tmp
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: fs.hdfs.hadoopconf, /opt/hadoop-config
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: yarn.application-attempts, 10
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability, zookeeper
2017-06-01 12:45:24,924 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.quorum, host1:2181,host2:2181
2017-06-01 12:45:24,925 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.storageDir, s3://somelocation/ha-recovery/
2017-06-01 12:45:24,925 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: high-availability.zookeeper.path.root, /flink-y
2017-06-01 12:45:24,925 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: zookeeper.sasl.disable, true
2017-06-01 12:45:24,925 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 12288
2017-06-01 12:45:25,187 INFO org.apache.flink.core.fs.FileSystem - Ensuring all FileSystem streams are closed for Async calls on Source: Custom Source (11/12)
2017-06-01 12:45:25,188 INFO org.apache.flink.core.fs.FileSystem - Ensuring all FileSystem streams are closed for Async calls on Source: Custom Source (2/12)
2017-06-01 12:45:25,196 INFO org.apache.flink.core.fs.FileSystem - Ensuring all FileSystem streams are closed for Async calls on Source: Custom Source (5/12)
2017-06-01 12:45:25,197 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumertest ::
2017-06-01 12:45:25,203 INFO org.apache.flink.core.fs.FileSystem - Ensuring all FileSystem streams are closed for Async calls on Source: Custom Source (8/12)
2017-06-01 12:45:25,227 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumer ::
2017-06-01 12:45:25,257 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumertest ::
2017-06-01 12:45:25,277 INFO com.company.deserializer.EventDeserializer - ======> KafkaConsumer ::

==> /mnt/ephemeral/logs/flink-flink-client-vpc2w2-rep-stage-flink1.log <==
2017-06-01 12:45:45,350 WARN org.apache.flink.runtime.client.JobSubmissionClientActor - Discard message LeaderSessionMessage(null,ConnectionTimeout) because the expected leader session ID 2d2a8eac-b837-4605-93cc-81720247f247 did not equal the received leader session ID null.

==> /mnt/ephemeral/logs/flink-flink-jobmanager-0-vpc2w2-rep-stage-flink1.log <==
2017-06-01 12:55:18,229 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Checkpoint 1 expired before completing.
2017-06-01 12:55:18,233 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 2 @ 1496321718230

==> /mnt/ephemeral/logs/flink-flink-taskmanager-0-vpc2w2-rep-stage-flink1.log <==
2017-06-01 12:55:18,235 INFO org.apache.flink.core.fs.FileSystem - Created new CloseableRegistry org.apache.flink.core.fs.SafetyNetCloseableRegistry@44074ae6 for Async calls on Source: Custom Source (2/12)
2017-06-01 12:55:18,235 INFO org.apache.flink.core.fs.FileSystem - Created new CloseableRegistry org.apache.flink.core.fs.SafetyNetCloseableRegistry@463dc5a1 for Async calls on Source: Custom Source (5/12)
2017-06-01 12:55:18,236 INFO org.apache.flink.core.fs.FileSystem - Created new CloseableRegistry org.apache.flink.core.fs.SafetyNetCloseableRegistry@7871a1bb for Async calls on Source: Custom Source (8/12)
2017-06-01 12:55:18,237 INFO org.apache.flink.core.fs.FileSystem - Created new CloseableRegistry org.apache.flink.core.fs.SafetyNetCloseableRegistry@57df8c1d for Async calls on Source: Custom Source (11/12)

==> /mnt/ephemeral/logs/flink-flink-jobmanager-0-vpc2w2-rep-stage-flink1.log <==
2017-06-01 12:58:30,764 WARN org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Received late message for now expired checkpoint attempt 1 from c601dd04affa7da13a226daa222062e7 of job 303656ace348131ed7a38bb02b4fe374.

--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoints-very-slow-with-high-backpressure-tp12762p13422.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.