Hi Ufuk, hi Stefan,

Thanks a lot for your replies.

Ufuk, we are using the HDFS state backend.

Stefan, I installed 1.1.5 on our machines and built our software against
the Flink 1.1.5 dependency, but the problem remains. Below are the logs
for savepoint creation [1] and savepoint disposal [2], as well as the
logs from the start of the job [3]. Setting org.apache.flink.client to
DEBUG produced hardly any additional log lines, so I set the whole
package org.apache.flink to DEBUG in the hope of finding something.
Even so, I couldn't find anything suspicious.
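
For reference, the logging change amounts to roughly the following line
in the client's log4j configuration (conf/log4j-cli.properties, the file
named in the JVM options in the logs below). This is only a minimal
sketch in log4j 1.x properties syntax; the rest of the file is assumed
to be the stock one shipped with Flink.

```properties
# Log all Flink classes at DEBUG (before, only org.apache.flink.client was set)
log4j.logger.org.apache.flink=DEBUG
```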

Again, thanks a lot for your help!

Best regards,

Konstantin


[1]

2017-03-28 12:21:32,033 INFO  org.apache.flink.client.CliFrontend               
            - 
--------------------------------------------------------------------------------
2017-03-28 12:21:32,034 INFO  org.apache.flink.client.CliFrontend               
            -  Starting Command Line Client (Version: 1.1.3, Rev:a56d810, 
Date:10.11.2016 @ 13:25:34 CET)
2017-03-28 12:21:32,035 INFO  org.apache.flink.client.CliFrontend               
            -  Current user: our_user
2017-03-28 12:21:32,035 INFO  org.apache.flink.client.CliFrontend               
            -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 
1.7/24.51-b03
2017-03-28 12:21:32,035 INFO  org.apache.flink.client.CliFrontend               
            -  Maximum heap size: 1749 MiBytes
2017-03-28 12:21:32,035 INFO  org.apache.flink.client.CliFrontend               
            -  JAVA_HOME: /usr/java/default
2017-03-28 12:21:32,037 INFO  org.apache.flink.client.CliFrontend               
            -  Hadoop version: 2.3.0
2017-03-28 12:21:32,038 INFO  org.apache.flink.client.CliFrontend               
            -  JVM Options:
2017-03-28 12:21:32,038 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlog.file=/path/to/our/lib/flink-1.1.3/log/flink-our_user-client-ourserver.log
2017-03-28 12:21:32,038 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlog4j.configuration=file:/path/to/our/lib/flink-1.1.3/conf/log4j-cli.properties
2017-03-28 12:21:32,038 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlogback.configurationFile=file:/path/to/our/lib/flink-1.1.3/conf/logback.xml
2017-03-28 12:21:32,038 INFO  org.apache.flink.client.CliFrontend               
            -  Program Arguments:
2017-03-28 12:21:32,038 INFO  org.apache.flink.client.CliFrontend               
            -     savepoint
2017-03-28 12:21:32,038 INFO  org.apache.flink.client.CliFrontend               
            -     7e865198e220bea8a2203ebdb0827b6f
2017-03-28 12:21:32,039 INFO  org.apache.flink.client.CliFrontend               
            -     -j
2017-03-28 12:21:32,039 INFO  org.apache.flink.client.CliFrontend               
            -     
/path/to/our/lib/our_program/lib/our_program-6.2.6-SNAPSHOT-all.jar
2017-03-28 12:21:32,039 INFO  org.apache.flink.client.CliFrontend               
            -  Classpath: 
/path/to/our/lib/flink-1.1.3/lib/flink-dist_2.10-1.1.3.1.jar:/path/to/our/lib/flink-1.1.3/lib/flink-python_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/flink-reporter-1.0.2-20161206.140111-118.jar:/path/to/our/lib/flink-1.1.3/lib/flink-table_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/log4j-1.2.17.jar:/path/to/our/lib/flink-1.1.3/lib/ojdbc6-11.2.0.3.jar:/path/to/our/lib/flink-1.1.3/lib/slf4j-log4j12-1.7.7.jar::/etc/hadoop/conf:
2017-03-28 12:21:32,039 INFO  org.apache.flink.client.CliFrontend               
            - 
--------------------------------------------------------------------------------
2017-03-28 12:21:32,039 INFO  org.apache.flink.client.CliFrontend               
            - Using configuration directory /path/to/our/lib/flink-1.1.3/conf
2017-03-28 12:21:32,039 INFO  org.apache.flink.client.CliFrontend               
            - Trying to load configuration file
2017-03-28 12:21:32,050 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: env.java.opts, 
-Djavax.net.ssl.trustStore=/path/to/our/cacerts 
-XX:HeapDumpPath=/path/to/our/hadoop/yarn/log -XX:+HeapDumpOnOutOfMemoryError 
-XX:MaxPermSize=192m
2017-03-28 12:21:32,050 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.rpc.address, localhost
2017-03-28 12:21:32,050 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.rpc.port, 6123
2017-03-28 12:21:32,051 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.heap.mb, 256
2017-03-28 12:21:32,051 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.heap.mb, 512
2017-03-28 12:21:32,051 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.numberOfTaskSlots, 4
2017-03-28 12:21:32,051 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.memory.preallocate, false
2017-03-28 12:21:32,051 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: parallelism.default, 1
2017-03-28 12:21:32,051 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.web.port, 8081
2017-03-28 12:21:32,051 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.web.submit.enable, false
2017-03-28 12:21:32,052 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: state.backend, filesystem
2017-03-28 12:21:32,052 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: state.backend.fs.checkpointdir, 
hdfs://ourserver:8020/our_user/flink/state
2017-03-28 12:21:32,052 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.network.numberOfBuffers, 4096
2017-03-28 12:21:32,052 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf/
2017-03-28 12:21:32,052 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.mode, zookeeper
2017-03-28 12:21:32,052 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.quorum, 
ourserver:2181,ourserver2:2181,ourserver3:2181
2017-03-28 12:21:32,053 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.storageDir, 
hdfs:///our_user/flink/recovery
2017-03-28 12:21:32,053 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.path.root, flink
2017-03-28 12:21:32,053 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.path.namespace, yarn_session
2017-03-28 12:21:32,053 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: reocvery.zookeeper.client.connection-timeout, 30000
2017-03-28 12:21:32,053 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.session-timeout, 120000
2017-03-28 12:21:32,053 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.retry-wait, 5000
2017-03-28 12:21:32,053 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.max-retry-attempts, 5
2017-03-28 12:21:32,054 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: yarn.application-attempts, 10
2017-03-28 12:21:32,054 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: yarn.maximum-failed-containers, 80
2017-03-28 12:21:32,054 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.watch.heartbeat.interval, 50s
2017-03-28 12:21:32,055 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.log.lifecycle.events, true
2017-03-28 12:21:32,055 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.ask.timeout, 20s
2017-03-28 12:21:32,055 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: savepoints.state.backend, filesystem
2017-03-28 12:21:32,055 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: savepoints.state.backend.fs.dir, 
hdfs:///our_user/flink/savepoints
2017-03-28 12:21:32,281 INFO  org.apache.flink.client.CliFrontend               
            - Running 'savepoint' command.
2017-03-28 12:21:32,287 INFO  org.apache.flink.client.CliFrontend               
            - Retrieving JobManager.
2017-03-28 12:21:32,288 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Found YARN properties file /tmp/.yarn-properties-our_user
2017-03-28 12:21:32,372 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Using Yarn application id from YARN properties 
application_1488884688139_2648
2017-03-28 12:21:32,372 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - YARN properties set default parallelism to 12
2017-03-28 12:21:32,372 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Found YARN properties file /tmp/.yarn-properties-our_user
2017-03-28 12:21:32,373 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Using Yarn application id from YARN properties 
application_1488884688139_2648
2017-03-28 12:21:32,373 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - YARN properties set default parallelism to 12
2017-03-28 12:21:32,440 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: env.java.opts, 
-Djavax.net.ssl.trustStore=/path/to/our/cacerts 
-XX:HeapDumpPath=/path/to/our/hadoop/yarn/log -XX:+HeapDumpOnOutOfMemoryError 
-XX:MaxPermSize=192m
2017-03-28 12:21:32,440 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.rpc.address, localhost
2017-03-28 12:21:32,440 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.rpc.port, 6123
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.heap.mb, 256
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.heap.mb, 512
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.numberOfTaskSlots, 4
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.memory.preallocate, false
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: parallelism.default, 1
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.web.port, 8081
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.web.submit.enable, false
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: state.backend, filesystem
2017-03-28 12:21:32,441 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: state.backend.fs.checkpointdir, 
hdfs://ourserver:8020/our_user/flink/state
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.network.numberOfBuffers, 4096
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf/
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.mode, zookeeper
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.quorum, 
ourserver:2181,ourserver:2181,ourserver:2181
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.storageDir, 
hdfs:///our_user/flink/recovery
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.path.root, flink
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.path.namespace, yarn_session
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: reocvery.zookeeper.client.connection-timeout, 30000
2017-03-28 12:21:32,442 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.session-timeout, 120000
2017-03-28 12:21:32,443 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.retry-wait, 5000
2017-03-28 12:21:32,443 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.max-retry-attempts, 5
2017-03-28 12:21:32,443 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: yarn.application-attempts, 10
2017-03-28 12:21:32,443 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: yarn.maximum-failed-containers, 80
2017-03-28 12:21:32,443 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.watch.heartbeat.interval, 50s
2017-03-28 12:21:32,444 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.log.lifecycle.events, true
2017-03-28 12:21:32,444 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.ask.timeout, 20s
2017-03-28 12:21:32,444 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: savepoints.state.backend, filesystem
2017-03-28 12:21:32,444 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: savepoints.state.backend.fs.dir, 
hdfs:///our_user/flink/savepoints
2017-03-28 12:21:32,541 INFO  org.apache.hadoop.yarn.client.RMProxy             
            - Connecting to ResourceManager at ourserver/ourserver_ip:8050
2017-03-28 12:21:32,718 INFO  org.apache.flink.yarn.YarnClusterDescriptor       
            - Found application JobManager host name 'ourserver' and port 
'36901' from supplied application id 'application_1488884688139_2648'
2017-03-28 12:21:32,732 INFO  org.apache.flink.runtime.util.ZooKeeperUtils      
            - Using 'flink/yarn_session' as zookeeper namespace.
2017-03-28 12:21:32,831 INFO  
org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl  
- Starting
2017-03-28 12:21:32,832 DEBUG 
org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient  - Starting
2017-03-28 12:21:32,832 DEBUG 
org.apache.flink.shaded.org.apache.curator.ConnectionState    - Starting
2017-03-28 12:21:32,833 DEBUG 
org.apache.flink.shaded.org.apache.curator.ConnectionState    - reset
2017-03-28 12:21:32,874 INFO  
org.apache.flink.shaded.org.apache.curator.framework.state.ConnectionStateManager
  - State change: CONNECTED
2017-03-28 12:21:33,891 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Starting ZooKeeperLeaderRetrievalService.
2017-03-28 12:21:33,906 DEBUG 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Leader node has changed.
2017-03-28 12:21:33,912 DEBUG 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - New 
leader information: Leader=akka.tcp://flink@ourserver_ip:36901/user/jobmanager, 
session ID=a3c337e5-1749-4c42-9949-0203bbae58d5.
2017-03-28 12:21:33,914 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Stopping ZooKeeperLeaderRetrievalService.
2017-03-28 12:21:33,914 DEBUG 
org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl  
- Closing
2017-03-28 12:21:33,915 DEBUG 
org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient  - Closing
2017-03-28 12:21:33,915 DEBUG 
org.apache.flink.shaded.org.apache.curator.ConnectionState    - Closing
2017-03-28 12:21:33,920 INFO  org.apache.flink.client.CliFrontend               
            - Using address /ourserver_ip:36901 to connect to JobManager.
2017-03-28 12:21:33,926 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Starting client actor system.
2017-03-28 12:21:33,928 DEBUG org.apache.flink.runtime.net.ConnectionUtils      
            - Trying to connect to (ourserver/ourserver_ip:36901) from local 
address ourserver/ourserver_ip with timeout 200
2017-03-28 12:21:33,931 DEBUG org.apache.flink.runtime.net.ConnectionUtils      
            - Using InetAddress.getLocalHost() immediately for the connecting 
address
2017-03-28 12:21:34,673 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Starting ZooKeeperLeaderRetrievalService.
2017-03-28 12:21:34,677 DEBUG 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Leader node has changed.
2017-03-28 12:21:34,677 DEBUG 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - New 
leader information: Leader=akka.tcp://flink@ourserver_ip:36901/user/jobmanager, 
session ID=a3c337e5-1749-4c42-9949-0203bbae58d5.
2017-03-28 12:21:34,823 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Stopping ZooKeeperLeaderRetrievalService.
2017-03-28 12:21:34,826 INFO  org.apache.flink.client.CliFrontend               
            - Triggering savepoint for job 7e865198e220bea8a2203ebdb0827b6f.
2017-03-28 12:21:34,828 INFO  org.apache.flink.client.CliFrontend               
            - Waiting for response...
2017-03-28 12:21:34,993 INFO  org.apache.flink.client.CliFrontend               
            - Savepoint completed. Path: 
hdfs:/our_user/flink/savepoints/savepoint-77214a0f9902
2017-03-28 12:21:34,994 INFO  org.apache.flink.client.CliFrontend               
            - You can resume your program from this savepoint with the run 
command.
2017-03-28 12:21:34,994 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Shutting down YarnClusterClient from the client shutdown hook
2017-03-28 12:21:34,994 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Disconnecting YarnClusterClient from ApplicationMaster


[2]

2017-03-28 12:19:58,063 INFO  org.apache.flink.client.CliFrontend               
            - 
--------------------------------------------------------------------------------
2017-03-28 12:19:58,064 INFO  org.apache.flink.client.CliFrontend               
            -  Starting Command Line Client (Version: 1.1.3, Rev:a56d810, 
Date:10.11.2016 @ 13:25:34 CET)
2017-03-28 12:19:58,064 INFO  org.apache.flink.client.CliFrontend               
            -  Current user: our_user
2017-03-28 12:19:58,064 INFO  org.apache.flink.client.CliFrontend               
            -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 
1.7/24.51-b03
2017-03-28 12:19:58,065 INFO  org.apache.flink.client.CliFrontend               
            -  Maximum heap size: 1749 MiBytes
2017-03-28 12:19:58,065 INFO  org.apache.flink.client.CliFrontend               
            -  JAVA_HOME: /usr/java/default
2017-03-28 12:19:58,067 INFO  org.apache.flink.client.CliFrontend               
            -  Hadoop version: 2.3.0
2017-03-28 12:19:58,067 INFO  org.apache.flink.client.CliFrontend               
            -  JVM Options:
2017-03-28 12:19:58,068 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlog.file=/path/to/our/lib/flink-1.1.3/log/flink-our_user-client-ourserver.log
2017-03-28 12:19:58,068 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlog4j.configuration=file:/path/to/our/lib/flink-1.1.3/conf/log4j-cli.properties
2017-03-28 12:19:58,068 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlogback.configurationFile=file:/path/to/our/lib/flink-1.1.3/conf/logback.xml
2017-03-28 12:19:58,068 INFO  org.apache.flink.client.CliFrontend               
            -  Program Arguments:
2017-03-28 12:19:58,069 INFO  org.apache.flink.client.CliFrontend               
            -     savepoint
2017-03-28 12:19:58,069 INFO  org.apache.flink.client.CliFrontend               
            -     -d
2017-03-28 12:19:58,069 INFO  org.apache.flink.client.CliFrontend               
            -     hdfs:/our_user/flink/savepoints/savepoint-d16441420a87
2017-03-28 12:19:58,069 INFO  org.apache.flink.client.CliFrontend               
            -     -j
2017-03-28 12:19:58,069 INFO  org.apache.flink.client.CliFrontend               
            -     
/path/to/our/lib/our_program/lib/our_program-6.2.6-SNAPSHOT-all.jar
2017-03-28 12:19:58,069 INFO  org.apache.flink.client.CliFrontend               
            -  Classpath: 
/path/to/our/lib/flink-1.1.3/lib/flink-dist_2.10-1.1.3.1.jar:/path/to/our/lib/flink-1.1.3/lib/flink-python_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/flink-reporter-1.0.2-20161206.140111-118.jar:/path/to/our/lib/flink-1.1.3/lib/flink-table_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/log4j-1.2.17.jar:/path/to/our/lib/flink-1.1.3/lib/ojdbc6-11.2.0.3.jar:/path/to/our/lib/flink-1.1.3/lib/slf4j-log4j12-1.7.7.jar::/etc/hadoop/conf:
2017-03-28 12:19:58,070 INFO  org.apache.flink.client.CliFrontend               
            - 
--------------------------------------------------------------------------------
2017-03-28 12:19:58,070 INFO  org.apache.flink.client.CliFrontend               
            - Using configuration directory /path/to/our/lib/flink-1.1.3/conf
2017-03-28 12:19:58,070 INFO  org.apache.flink.client.CliFrontend               
            - Trying to load configuration file
2017-03-28 12:19:58,085 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: env.java.opts, 
-Djavax.net.ssl.trustStore=/path/to/our/cacerts 
-XX:HeapDumpPath=/path/to/our/hadoop/yarn/log -XX:+HeapDumpOnOutOfMemoryError 
-XX:MaxPermSize=192m
2017-03-28 12:19:58,085 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.rpc.address, localhost
2017-03-28 12:19:58,086 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.rpc.port, 6123
2017-03-28 12:19:58,086 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.heap.mb, 256
2017-03-28 12:19:58,086 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.heap.mb, 512
2017-03-28 12:19:58,086 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.numberOfTaskSlots, 4
2017-03-28 12:19:58,086 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.memory.preallocate, false
2017-03-28 12:19:58,086 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: parallelism.default, 1
2017-03-28 12:19:58,087 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.web.port, 8081
2017-03-28 12:19:58,087 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.web.submit.enable, false
2017-03-28 12:19:58,087 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: state.backend, filesystem
2017-03-28 12:19:58,087 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: state.backend.fs.checkpointdir, 
hdfs://ourserver:8020/our_user/flink/state
2017-03-28 12:19:58,087 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.network.numberOfBuffers, 4096
2017-03-28 12:19:58,087 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf/
2017-03-28 12:19:58,088 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.mode, zookeeper
2017-03-28 12:19:58,088 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.quorum, 
ourserver:2181,ourserver2:2181,ourserver3:2181
2017-03-28 12:19:58,088 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.storageDir, 
hdfs:///our_user/flink/recovery
2017-03-28 12:19:58,088 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.path.root, flink
2017-03-28 12:19:58,088 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.path.namespace, yarn_session
2017-03-28 12:19:58,088 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: reocvery.zookeeper.client.connection-timeout, 30000
2017-03-28 12:19:58,089 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.session-timeout, 120000
2017-03-28 12:19:58,089 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.retry-wait, 5000
2017-03-28 12:19:58,089 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.max-retry-attempts, 5
2017-03-28 12:19:58,089 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: yarn.application-attempts, 10
2017-03-28 12:19:58,089 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: yarn.maximum-failed-containers, 80
2017-03-28 12:19:58,090 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.watch.heartbeat.interval, 50s
2017-03-28 12:19:58,090 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.log.lifecycle.events, true
2017-03-28 12:19:58,090 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.ask.timeout, 20s
2017-03-28 12:19:58,090 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: savepoints.state.backend, filesystem
2017-03-28 12:19:58,090 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: savepoints.state.backend.fs.dir, 
hdfs:///our_user/flink/savepoints
2017-03-28 12:19:58,367 INFO  org.apache.flink.client.CliFrontend               
            - Running 'savepoint' command.
2017-03-28 12:19:58,372 INFO  org.apache.flink.client.CliFrontend               
            - Retrieving JobManager.
2017-03-28 12:19:58,373 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Found YARN properties file /tmp/.yarn-properties-our_user
2017-03-28 12:19:58,484 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Using Yarn application id from YARN properties 
application_1488884688139_2648
2017-03-28 12:19:58,485 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - YARN properties set default parallelism to 12
2017-03-28 12:19:58,485 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Found YARN properties file /tmp/.yarn-properties-our_user
2017-03-28 12:19:58,485 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Using Yarn application id from YARN properties 
application_1488884688139_2648
2017-03-28 12:19:58,485 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - YARN properties set default parallelism to 12
2017-03-28 12:19:58,604 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: env.java.opts, 
-Djavax.net.ssl.trustStore=/path/to/our/cacerts 
-XX:HeapDumpPath=/path/to/our/hadoop/yarn/log -XX:+HeapDumpOnOutOfMemoryError 
-XX:MaxPermSize=192m
2017-03-28 12:19:58,604 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.rpc.address, localhost
2017-03-28 12:19:58,604 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.rpc.port, 6123
2017-03-28 12:19:58,604 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.heap.mb, 256
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.heap.mb, 512
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.numberOfTaskSlots, 4
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.memory.preallocate, false
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: parallelism.default, 1
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.web.port, 8081
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: jobmanager.web.submit.enable, false
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: state.backend, filesystem
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: state.backend.fs.checkpointdir, 
hdfs://ourserver:8020/our_user/flink/state
2017-03-28 12:19:58,605 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: taskmanager.network.numberOfBuffers, 4096
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf/
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.mode, zookeeper
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.quorum, 
ourserver:2181,ourserver:2181,ourserver:2181
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.storageDir, 
hdfs:///our_user/flink/recovery
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.path.root, flink
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.path.namespace, yarn_session
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: reocvery.zookeeper.client.connection-timeout, 30000
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.session-timeout, 120000
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.retry-wait, 5000
2017-03-28 12:19:58,606 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: recovery.zookeeper.client.max-retry-attempts, 5
2017-03-28 12:19:58,607 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: yarn.application-attempts, 10
2017-03-28 12:19:58,607 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: yarn.maximum-failed-containers, 80
2017-03-28 12:19:58,607 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.watch.heartbeat.interval, 50s
2017-03-28 12:19:58,607 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.log.lifecycle.events, true
2017-03-28 12:19:58,607 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: akka.ask.timeout, 20s
2017-03-28 12:19:58,608 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: savepoints.state.backend, filesystem
2017-03-28 12:19:58,608 DEBUG 
org.apache.flink.configuration.GlobalConfiguration            - Loading 
configuration property: savepoints.state.backend.fs.dir, 
hdfs:///our_user/flink/savepoints
2017-03-28 12:19:58,685 INFO  org.apache.hadoop.yarn.client.RMProxy             
            - Connecting to ResourceManager at ourserver/ourserver_ip:8050
2017-03-28 12:19:58,969 INFO  org.apache.flink.yarn.YarnClusterDescriptor       
            - Found application JobManager host name 'ourserver' and port 
'36901' from supplied application id 'application_1488884688139_2648'
2017-03-28 12:19:58,989 INFO  org.apache.flink.runtime.util.ZooKeeperUtils      
            - Using 'flink/yarn_session' as zookeeper namespace.
2017-03-28 12:19:59,114 INFO  
org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl  
- Starting
2017-03-28 12:19:59,115 DEBUG 
org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient  - Starting
2017-03-28 12:19:59,115 DEBUG 
org.apache.flink.shaded.org.apache.curator.ConnectionState    - Starting
2017-03-28 12:19:59,115 DEBUG 
org.apache.flink.shaded.org.apache.curator.ConnectionState    - reset
2017-03-28 12:19:59,172 INFO  
org.apache.flink.shaded.org.apache.curator.framework.state.ConnectionStateManager
  - State change: CONNECTED
2017-03-28 12:20:00,212 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Starting ZooKeeperLeaderRetrievalService.
2017-03-28 12:20:00,229 DEBUG 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Leader node has changed.
2017-03-28 12:20:00,235 DEBUG 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - New 
leader information: Leader=akka.tcp://flink@ourserver_ip:36901/user/jobmanager, 
session ID=a3c337e5-1749-4c42-9949-0203bbae58d5.
2017-03-28 12:20:00,238 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Stopping ZooKeeperLeaderRetrievalService.
2017-03-28 12:20:00,238 DEBUG 
org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl  
- Closing
2017-03-28 12:20:00,238 DEBUG 
org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient  - Closing
2017-03-28 12:20:00,239 DEBUG 
org.apache.flink.shaded.org.apache.curator.ConnectionState    - Closing
2017-03-28 12:20:00,245 INFO  org.apache.flink.client.CliFrontend               
            - Using address /ourserver_ip:36901 to connect to JobManager.
2017-03-28 12:20:00,245 INFO  org.apache.flink.runtime.util.ZooKeeperUtils      
            - Using 'flink/yarn_session' as zookeeper namespace.
2017-03-28 12:20:00,252 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Starting client actor system.
2017-03-28 12:20:00,254 DEBUG org.apache.flink.runtime.net.ConnectionUtils      
            - Trying to connect to (ourserver/ourserver_ip:36901) from local 
address ourserver/ourserver_ip with timeout 200
2017-03-28 12:20:00,259 DEBUG org.apache.flink.runtime.net.ConnectionUtils      
            - Using InetAddress.getLocalHost() immediately for the connecting 
address
2017-03-28 12:20:01,209 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Starting ZooKeeperLeaderRetrievalService.
2017-03-28 12:20:01,213 DEBUG 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Leader node has changed.
2017-03-28 12:20:01,213 DEBUG 
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - New 
leader information: Leader=akka.tcp://flink@ourserver_ip:36901/user/jobmanager, 
session ID=a3c337e5-1749-4c42-9949-0203bbae58d5.
2017-03-28 12:20:01,442 INFO  
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService  - 
Stopping ZooKeeperLeaderRetrievalService.
2017-03-28 12:20:01,446 INFO  org.apache.flink.client.CliFrontend               
            - Disposing savepoint 
'hdfs:/our_user/flink/savepoints/savepoint-d16441420a87' with JAR 
/path/to/our/lib/our_program/lib/our_program-6.2.6-SNAPSHOT-all.jar.
2017-03-28 12:20:01,590 INFO  org.apache.flink.client.CliFrontend               
            - Waiting for response...
2017-03-28 12:20:01,636 ERROR org.apache.flink.client.CliFrontend               
            - Error while running the command.
java.io.IOException: Failed to dispose savepoint 
hdfs:/our_user/flink/savepoints/savepoint-d16441420a87.
        at 
org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.disposeSavepoint(FsSavepointStore.java:163)
        at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:745)
        at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:727)
        at 
org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:727)
        at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at 
scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
        at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
        at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
        at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.lang.ClassNotFoundException: 
our.company.eventdata.EventDataRecord
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:270)
        at 
org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:65)
        at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
        at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
        at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1483)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1333)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at 
java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
        at 
org.apache.flink.api.java.typeutils.runtime.PojoSerializer.readObject(PojoSerializer.java:131)
        at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
        at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at 
java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
        at 
org.apache.flink.api.common.state.StateDescriptor.readObject(StateDescriptor.java:268)
        at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
        at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        at java.util.HashMap.readObject(HashMap.java:1184)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
        at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
        at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        at 
org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:291)
        at 
org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58)
        at 
org.apache.flink.runtime.checkpoint.SubtaskState.discard(SubtaskState.java:85)
        at 
org.apache.flink.runtime.checkpoint.TaskState.discard(TaskState.java:147)
        at 
org.apache.flink.runtime.checkpoint.savepoint.SavepointV0.dispose(SavepointV0.java:66)
        at 
org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.disposeSavepoint(FsSavepointStore.java:151)
        ... 12 more
2017-03-28 12:20:01,652 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Shutting down YarnClusterClient from the client shutdown hook
2017-03-28 12:20:01,653 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Disconnecting YarnClusterClient from ApplicationMaster
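
A note on the "Caused by" part of the trace above: the class lookup ends in sun.misc.Launcher$AppClassLoader, which suggests that the state is deserialized through a class loader that cannot see the classes from our job JAR. The following is only a minimal sketch of that mechanism in plain Java, not Flink's actual code. The names ClassLoaderSketch, EventRecord, LoaderAwareInputStream and roundTripsWith are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.net.URL;
import java.net.URLClassLoader;

public class ClassLoaderSketch {

    /** Hypothetical stand-in for a user class that is part of the job state. */
    public static class EventRecord implements Serializable {
        private static final long serialVersionUID = 1L;
        public final String id;

        public EventRecord(String id) {
            this.id = id;
        }
    }

    /** ObjectInputStream that resolves classes through an explicitly given class loader. */
    static class LoaderAwareInputStream extends ObjectInputStream {
        private final ClassLoader loader;

        LoaderAwareInputStream(InputStream in, ClassLoader loader) throws IOException {
            super(in);
            this.loader = loader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc) throws ClassNotFoundException {
            // Fails with ClassNotFoundException if 'loader' cannot see the class.
            return Class.forName(desc.getName(), false, loader);
        }
    }

    /** Serializes an EventRecord and reports whether it can be read back via 'loader'. */
    public static boolean roundTripsWith(ClassLoader loader) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new EventRecord("event-1"));
            }
            try (ObjectInputStream in = new LoaderAwareInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()), loader)) {
                Object restored = in.readObject();
                return "event-1".equals(((EventRecord) restored).id);
            }
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Loader that knows the user class (comparable to a loader that includes the job JAR).
        ClassLoader withUserCode = ClassLoaderSketch.class.getClassLoader();
        // Loader with no URLs and no parent: only JDK classes are visible.
        ClassLoader withoutUserCode = new URLClassLoader(new URL[0], null);

        System.out.println("loader with user code:    " + roundTripsWith(withUserCode));
        System.out.println("loader without user code: " + roundTripsWith(withoutUserCode));
    }
}
```

If this reading of the trace is right, the JAR passed with -j does not end up in the class loader that is used when the savepoint state is discarded, but that is an assumption on my side.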


[3]

2017-03-28 10:43:57,361 INFO  org.apache.flink.client.CliFrontend               
            - 
--------------------------------------------------------------------------------
2017-03-28 10:43:57,362 INFO  org.apache.flink.client.CliFrontend               
            -  Starting Command Line Client (Version: 1.1.3, Rev:a56d810, 
Date:10.11.2016 @ 13:25:34 CET)
2017-03-28 10:43:57,362 INFO  org.apache.flink.client.CliFrontend               
            -  Current user: our_user
2017-03-28 10:43:57,362 INFO  org.apache.flink.client.CliFrontend               
            -  JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 
1.7/24.51-b03
2017-03-28 10:43:57,362 INFO  org.apache.flink.client.CliFrontend               
            -  Maximum heap size: 1749 MiBytes
2017-03-28 10:43:57,363 INFO  org.apache.flink.client.CliFrontend               
            -  JAVA_HOME: /usr/java/default
2017-03-28 10:43:57,365 INFO  org.apache.flink.client.CliFrontend               
            -  Hadoop version: 2.3.0
2017-03-28 10:43:57,365 INFO  org.apache.flink.client.CliFrontend               
            -  JVM Options:
2017-03-28 10:43:57,365 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlog.file=/path/to/our/lib/flink-1.1.3/log/flink-our_user-client-ourserver.log
2017-03-28 10:43:57,365 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlog4j.configuration=file:/path/to/our/lib/flink-1.1.3/conf/log4j-cli.properties
2017-03-28 10:43:57,365 INFO  org.apache.flink.client.CliFrontend               
            -     
-Dlogback.configurationFile=file:/path/to/our/lib/flink-1.1.3/conf/logback.xml
2017-03-28 10:43:57,365 INFO  org.apache.flink.client.CliFrontend               
            -  Program Arguments:
2017-03-28 10:43:57,366 INFO  org.apache.flink.client.CliFrontend               
            -     run
2017-03-28 10:43:57,366 INFO  org.apache.flink.client.CliFrontend               
            -     -p
2017-03-28 10:43:57,366 INFO  org.apache.flink.client.CliFrontend               
            -     5
2017-03-28 10:43:57,366 INFO  org.apache.flink.client.CliFrontend               
            -     -c
2017-03-28 10:43:57,366 INFO  org.apache.flink.client.CliFrontend               
            -     our.company.package.OurProgramClass
2017-03-28 10:43:57,366 INFO  org.apache.flink.client.CliFrontend               
            -     /path/to/our/lib/our_program/lib/our_program.jar
2017-03-28 10:43:57,366 INFO  org.apache.flink.client.CliFrontend               
            -     /path/to/our/lib/our_program/conf/our_program.properties
2017-03-28 10:43:57,367 INFO  org.apache.flink.client.CliFrontend               
            -  Classpath: 
/path/to/our/lib/flink-1.1.3/lib/flink-dist_2.10-1.1.3.1.jar:/path/to/our/lib/flink-1.1.3/lib/flink-python_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/flink-reporter-1.0.2-20161206.140111-118.jar:/path/to/our/lib/flink-1.1.3/lib/flink-table_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/log4j-1.2.17.jar:/path/to/our/lib/flink-1.1.3/lib/ojdbc6-11.2.0.3.jar:/path/to/our/lib/flink-1.1.3/lib/slf4j-log4j12-1.7.7.jar::/etc/hadoop/conf:
2017-03-28 10:43:57,367 INFO  org.apache.flink.client.CliFrontend               
            - 
--------------------------------------------------------------------------------
2017-03-28 10:43:57,367 INFO  org.apache.flink.client.CliFrontend               
            - Using configuration directory /path/to/our/lib/flink-1.1.3/conf
2017-03-28 10:43:57,367 INFO  org.apache.flink.client.CliFrontend               
            - Trying to load configuration file
2017-03-28 10:43:57,664 INFO  org.apache.flink.client.CliFrontend               
            - Running 'run' command.
2017-03-28 10:43:57,671 INFO  org.apache.flink.client.CliFrontend               
            - Building program from JAR file
2017-03-28 10:43:57,827 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Found YARN properties file /tmp/.yarn-properties-our_user
2017-03-28 10:43:57,921 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Using Yarn application id from YARN properties 
application_1488884688139_2648
2017-03-28 10:43:57,921 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - YARN properties set default parallelism to 12
2017-03-28 10:43:57,921 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Found YARN properties file /tmp/.yarn-properties-our_user
2017-03-28 10:43:57,922 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - Using Yarn application id from YARN properties 
application_1488884688139_2648
2017-03-28 10:43:57,922 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli     
            - YARN properties set default parallelism to 12
2017-03-28 10:43:58,046 INFO  org.apache.hadoop.yarn.client.RMProxy             
            - Connecting to ResourceManager at ourserver/ourserver_ip:8050
2017-03-28 10:43:58,237 INFO  org.apache.flink.yarn.YarnClusterDescriptor       
            - Found application JobManager host name 'ourserver' and port 
'36901' from supplied application id 'application_1488884688139_2648'
2017-03-28 10:43:58,246 INFO  org.apache.flink.client.CliFrontend               
            - Cluster configuration: Yarn cluster with application id 
application_1488884688139_2648
2017-03-28 10:43:59,439 INFO  org.apache.flink.client.CliFrontend               
            - Using address ourserver_ip:36901 to connect to JobManager.
2017-03-28 10:43:59,439 INFO  org.apache.flink.client.CliFrontend               
            - JobManager web interface address 
http://ourserver:8088/proxy/application_1488884688139_2648/
2017-03-28 10:43:59,439 DEBUG org.apache.flink.client.CliFrontend               
            - Client slots is set to -1
2017-03-28 10:43:59,440 DEBUG org.apache.flink.client.CliFrontend               
            - Savepoint path is set to null
2017-03-28 10:43:59,440 DEBUG org.apache.flink.client.CliFrontend               
            - User parallelism is set to 5
2017-03-28 10:43:59,440 INFO  org.apache.flink.client.CliFrontend               
            - Starting execution of program
2017-03-28 10:43:59,440 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Starting program in interactive mode
2017-03-28 10:44:00,593 WARN  org.apache.hadoop.hdfs.BlockReaderLocal           
            - The short-circuit local reads feature cannot be used because 
libhadoop cannot be loaded.
2017-03-28 10:44:01,672 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Waiting until all TaskManagers have connected
2017-03-28 10:44:02,702 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Starting client actor system.
2017-03-28 10:44:03,717 INFO  org.apache.flink.yarn.YarnClusterClient           
            - TaskManager status (3/1)
2017-03-28 10:44:03,720 INFO  org.apache.flink.yarn.YarnClusterClient           
            - All TaskManagers are connected
2017-03-28 10:44:04,736 INFO  org.apache.flink.yarn.YarnClusterClient           
            - Submitting job with JobID: d33a2835c9c25881a0765c250bbceb7e. 
Waiting for job completion.
Connected to JobManager at 
Actor[akka.tcp://flink@ourserver_ip:36901/user/jobmanager#-429328340]
03/28/2017 10:44:06     Job execution switched to status RUNNING.


On 27.03.2017 15:24, Ufuk Celebi wrote:
> What kind of state backend were you using for the checkpoints?
>
> If there is a bug that prevents us from deleting the savepoint files
> automatically, we can work around it by deleting the checkpoint
> files manually. With Flink 1.3 this becomes very straightforward,
> as all savepoint data goes to a self-contained directory that
> can be deleted manually.
>
> On Mon, Mar 27, 2017 at 12:46 PM, Stefan Richter
> <s.rich...@data-artisans.com> wrote:
>> Hi,
>>
>> could you provide us with the log from the job client, with logging on debug
>> level for package org.apache.flink.client? Also, did you check if this
>> problem also exists in the latest bugfix release for your version (1.1.5)?
>>
>> Best,
>> Stefan
>>
>>
>> On 27.03.2017 at 11:41, Konstantin Gregor
>> <konstantin.gre...@tngtech.com> wrote:
>>
>> Hey everyone,
>>
>> we are experiencing an issue in the disposal of savepoints in
>> Flink-1.1.3. We have a streaming job that has custom state (user objects
>> are part of the state). We create a savepoint:
>>
>> $ flink savepoint <JOBID>
>> [...]
>> Savepoint completed. Path:
>> hdfs:/bigdata/flink/savepoints/savepoint-20f064fb9f50
>> [...]
>>
>> Then we simply want to dispose of that savepoint, passing the JAR of
>> the job from which the savepoint was made:
>> $ flink savepoint -d
>> hdfs:/bigdata/flink/savepoints/savepoint-20f064fb9f50 -j
>> /path/to/jar/application.jar
>>
>> This gives us a ClassNotFoundException for one of our custom classes [1].
>>
>> Adding our JAR to the flink/lib directory is not an option for us,
>> because things would break.
>> Does anyone have an idea on how to proceed here?
>>
>> Thanks and best regards,
>>
>> Konstantin
>>
>> [1]
>> java.io.IOException: Failed to dispose savepoint
>> hdfs:///bigdata/flink/savepoints/savepoint-20f064fb9f50.
>>         at
>> org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.disposeSavepoint(FsSavepointStore.java:163)
>>         at
>> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:745)
>>         at
>> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:727)
>>         at
>> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:727)
>>         at
>> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
>>         at
>> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
>>         at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
>>         at
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401)
>>         at
>> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>         at
>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
>>         at
>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
>>         at
>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>         at
>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>> Caused by: java.lang.ClassNotFoundException:
>> our.company.application.eventdata.EventDataRecord
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>         at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>         at java.lang.Class.forName0(Native Method)
>>         at java.lang.Class.forName(Class.java:270)
>>         at
>> org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:65)
>>         at
>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
>>         at
>> java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>>         at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1483)
>>         at
>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1333)
>>         at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>         at
>> java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
>>         at
>> org.apache.flink.api.java.typeutils.runtime.PojoSerializer.readObject(PojoSerializer.java:131)
>>         at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at
>> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>>         at
>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>>         at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>         at
>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>         at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>         at
>> java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
>>         at
>> org.apache.flink.api.common.state.StateDescriptor.readObject(StateDescriptor.java:268)
>>         at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at
>> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>>         at
>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>>         at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>         at
>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>         at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>         at
>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>         at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>         at
>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>         at java.util.HashMap.readObject(HashMap.java:1184)
>>         at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
>>         at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>         at java.lang.reflect.Method.invoke(Method.java:606)
>>         at
>> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>>         at
>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>>         at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>         at
>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>         at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>         at
>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>         at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>         at
>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>         at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
>>         at
>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
>>         at
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>>         at
>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>>         at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>>         at
>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>>         at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>>         at
>> org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:291)
>>         at
>> org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58)
>>         at
>> org.apache.flink.runtime.checkpoint.SubtaskState.discard(SubtaskState.java:85)
>>         at
>> org.apache.flink.runtime.checkpoint.TaskState.discard(TaskState.java:147)
>>         at
>> org.apache.flink.runtime.checkpoint.savepoint.SavepointV0.dispose(SavepointV0.java:66)
>>         at
>> org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.disposeSavepoint(FsSavepointStore.java:151)
>>
>>
>> --
>> Konstantin Gregor * konstantin.gre...@tngtech.com
>> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
>> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
>> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>>
>>

-- 
Konstantin Gregor * konstantin.gre...@tngtech.com
TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
Sitz: Unterföhring * Amtsgericht München * HRB 135082
