Thanks for providing the logs. It looks like the JARs are uploaded (modulo any bugs ;-))... could you double-check that the class is actually part of the JAR and has not been moved around, e.g. via:
jar -tf <jarfile> | grep EventDataRecord If everything looks good in the JAR, I could write a short tool that prints the pointed file paths and you could manually delete the files as a work around. Sorry for all the trouble with this. In version >1.2 we don't need the user code any more to dispose savepoints. – Ufuk On Wed, Mar 29, 2017 at 11:50 AM, Konstantin Gregor <konstantin.gre...@tngtech.com> wrote: > Hi Ufuk, hi Stefan, > > > thanks a lot for your replies. > > Ufuk, we are using the HDFS state backend. > > Stefan, I installed 1.1.5 on our machines and built our software with the > Flink 1.1.5 dependency, but the problem remains. Below are the logs for > savepoint creation [1] and savepoint disposal [2] as well as the logs from > the start of the job [3]. There were not many more log lines when I set > org.apache.flink.client to DEBUG, so I set the whole package > org.apache.flink to DEBUG in the hope of some findings. But I couldn't > really find anything suspicious. > > Again, thanks a lot for your help! > > Best regards > > > Konstantin > > > [1] > > 2017-03-28 12:21:32,033 INFO org.apache.flink.client.CliFrontend > - > -------------------------------------------------------------------------------- > 2017-03-28 12:21:32,034 INFO org.apache.flink.client.CliFrontend > - Starting Command Line Client (Version: 1.1.3, Rev:a56d810, > Date:10.11.2016 @ 13:25:34 CET) > 2017-03-28 12:21:32,035 INFO org.apache.flink.client.CliFrontend > - Current user: our_user > 2017-03-28 12:21:32,035 INFO org.apache.flink.client.CliFrontend > - JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - > 1.7/24.51-b03 > 2017-03-28 12:21:32,035 INFO org.apache.flink.client.CliFrontend > - Maximum heap size: 1749 MiBytes > 2017-03-28 12:21:32,035 INFO org.apache.flink.client.CliFrontend > - JAVA_HOME: /usr/java/default > 2017-03-28 12:21:32,037 INFO org.apache.flink.client.CliFrontend > - Hadoop version: 2.3.0 > 2017-03-28 12:21:32,038 INFO org.apache.flink.client.CliFrontend > - JVM Options: > 2017-03-28 12:21:32,038 INFO org.apache.flink.client.CliFrontend > - > -Dlog.file=/path/to/our/lib/flink-1.1.3/log/flink-our_user-client-ourserver.log > 2017-03-28 12:21:32,038 INFO org.apache.flink.client.CliFrontend > - > -Dlog4j.configuration=file:/path/to/our/lib/flink-1.1.3/conf/log4j-cli.properties > 2017-03-28 12:21:32,038 INFO org.apache.flink.client.CliFrontend > - > -Dlogback.configurationFile=file:/path/to/our/lib/flink-1.1.3/conf/logback.xml > 2017-03-28 12:21:32,038 INFO org.apache.flink.client.CliFrontend > - Program Arguments: > 2017-03-28 12:21:32,038 INFO org.apache.flink.client.CliFrontend > - savepoint > 2017-03-28 12:21:32,038 INFO org.apache.flink.client.CliFrontend > - 7e865198e220bea8a2203ebdb0827b6f > 2017-03-28 12:21:32,039 INFO org.apache.flink.client.CliFrontend > - -j > 2017-03-28 12:21:32,039 INFO org.apache.flink.client.CliFrontend > - /path/to/our/lib/our_program/lib/our_program-6.2.6-SNAPSHOT-all.jar > 2017-03-28 12:21:32,039 INFO org.apache.flink.client.CliFrontend > - Classpath: > /path/to/our/lib/flink-1.1.3/lib/flink-dist_2.10-1.1.3.1.jar:/path/to/our/lib/flink-1.1.3/lib/flink-python_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/flink-reporter-1.0.2-20161206.140111-118.jar:/path/to/our/lib/flink-1.1.3/lib/flink-table_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/log4j-1.2.17.jar:/path/to/our/lib/flink-1.1.3/lib/ojdbc6-11.2.0.3.jar:/path/to/our/lib/flink-1.1.3/lib/slf4j-log4j12-1.7.7.jar::/etc/hadoop/conf: > 2017-03-28 12:21:32,039 INFO org.apache.flink.client.CliFrontend > 
- > -------------------------------------------------------------------------------- > 2017-03-28 12:21:32,039 INFO org.apache.flink.client.CliFrontend > - Using configuration directory /path/to/our/lib/flink-1.1.3/conf > 2017-03-28 12:21:32,039 INFO org.apache.flink.client.CliFrontend > - Trying to load configuration file > 2017-03-28 12:21:32,050 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: env.java.opts, > -Djavax.net.ssl.trustStore=/path/to/our/cacerts > -XX:HeapDumpPath=/path/to/our/hadoop/yarn/log > -XX:+HeapDumpOnOutOfMemoryError -XX:MaxPermSize=192m > 2017-03-28 12:21:32,050 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.rpc.address, localhost > 2017-03-28 12:21:32,050 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.rpc.port, 6123 > 2017-03-28 12:21:32,051 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.heap.mb, 256 > 2017-03-28 12:21:32,051 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.heap.mb, 512 > 2017-03-28 12:21:32,051 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.numberOfTaskSlots, 4 > 2017-03-28 12:21:32,051 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.memory.preallocate, false > 2017-03-28 12:21:32,051 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: parallelism.default, 1 > 2017-03-28 12:21:32,051 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.web.port, 8081 > 2017-03-28 12:21:32,051 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.web.submit.enable, false > 2017-03-28 12:21:32,052 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: state.backend, filesystem > 2017-03-28 12:21:32,052 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: state.backend.fs.checkpointdir, > hdfs://ourserver:8020/our_user/flink/state > 2017-03-28 12:21:32,052 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.network.numberOfBuffers, 4096 > 2017-03-28 12:21:32,052 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf/ > 2017-03-28 12:21:32,052 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.mode, zookeeper > 2017-03-28 12:21:32,052 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.quorum, > ourserver:2181,ourserver2:2181,ourserver3:2181 > 2017-03-28 12:21:32,053 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.storageDir, > hdfs:///our_user/flink/recovery > 2017-03-28 12:21:32,053 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.path.root, flink > 2017-03-28 12:21:32,053 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.path.namespace, yarn_session > 2017-03-28 12:21:32,053 DEBUG > 
org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: reocvery.zookeeper.client.connection-timeout, 30000 > 2017-03-28 12:21:32,053 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.session-timeout, 120000 > 2017-03-28 12:21:32,053 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.retry-wait, 5000 > 2017-03-28 12:21:32,053 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.max-retry-attempts, 5 > 2017-03-28 12:21:32,054 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: yarn.application-attempts, 10 > 2017-03-28 12:21:32,054 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: yarn.maximum-failed-containers, 80 > 2017-03-28 12:21:32,054 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.watch.heartbeat.interval, 50s > 2017-03-28 12:21:32,055 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.log.lifecycle.events, true > 2017-03-28 12:21:32,055 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.ask.timeout, 20s > 2017-03-28 12:21:32,055 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: savepoints.state.backend, filesystem > 2017-03-28 12:21:32,055 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: savepoints.state.backend.fs.dir, > hdfs:///our_user/flink/savepoints > 2017-03-28 12:21:32,281 INFO org.apache.flink.client.CliFrontend > - Running 'savepoint' command. > 2017-03-28 12:21:32,287 INFO org.apache.flink.client.CliFrontend > - Retrieving JobManager. 
> 2017-03-28 12:21:32,288 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Found YARN properties file /tmp/.yarn-properties-our_user > 2017-03-28 12:21:32,372 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Using Yarn application id from YARN properties > application_1488884688139_2648 > 2017-03-28 12:21:32,372 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - YARN properties set default parallelism to 12 > 2017-03-28 12:21:32,372 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Found YARN properties file /tmp/.yarn-properties-our_user > 2017-03-28 12:21:32,373 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Using Yarn application id from YARN properties > application_1488884688139_2648 > 2017-03-28 12:21:32,373 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - YARN properties set default parallelism to 12 > 2017-03-28 12:21:32,440 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: env.java.opts, > -Djavax.net.ssl.trustStore=/path/to/our/cacerts > -XX:HeapDumpPath=/path/to/our/hadoop/yarn/log > -XX:+HeapDumpOnOutOfMemoryError -XX:MaxPermSize=192m > 2017-03-28 12:21:32,440 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.rpc.address, localhost > 2017-03-28 12:21:32,440 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.rpc.port, 6123 > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.heap.mb, 256 > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.heap.mb, 512 > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.numberOfTaskSlots, 4 > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.memory.preallocate, false > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: parallelism.default, 1 > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.web.port, 8081 > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.web.submit.enable, false > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: state.backend, filesystem > 2017-03-28 12:21:32,441 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: state.backend.fs.checkpointdir, > hdfs://ourserver:8020/our_user/flink/state > 2017-03-28 12:21:32,442 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.network.numberOfBuffers, 4096 > 2017-03-28 12:21:32,442 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf/ > 2017-03-28 12:21:32,442 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.mode, zookeeper > 2017-03-28 12:21:32,442 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.quorum, > ourserver:2181,ourserver:2181,ourserver:2181 > 2017-03-28 12:21:32,442 DEBUG > 
org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.storageDir, > hdfs:///our_user/flink/recovery > 2017-03-28 12:21:32,442 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.path.root, flink > 2017-03-28 12:21:32,442 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.path.namespace, yarn_session > 2017-03-28 12:21:32,442 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: reocvery.zookeeper.client.connection-timeout, 30000 > 2017-03-28 12:21:32,442 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.session-timeout, 120000 > 2017-03-28 12:21:32,443 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.retry-wait, 5000 > 2017-03-28 12:21:32,443 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.max-retry-attempts, 5 > 2017-03-28 12:21:32,443 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: yarn.application-attempts, 10 > 2017-03-28 12:21:32,443 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: yarn.maximum-failed-containers, 80 > 2017-03-28 12:21:32,443 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.watch.heartbeat.interval, 50s > 2017-03-28 12:21:32,444 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.log.lifecycle.events, true > 2017-03-28 12:21:32,444 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.ask.timeout, 20s > 2017-03-28 12:21:32,444 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: savepoints.state.backend, filesystem > 2017-03-28 12:21:32,444 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: savepoints.state.backend.fs.dir, > hdfs:///our_user/flink/savepoints > 2017-03-28 12:21:32,541 INFO org.apache.hadoop.yarn.client.RMProxy > - Connecting to ResourceManager at ourserver/ourserver_ip:8050 > 2017-03-28 12:21:32,718 INFO org.apache.flink.yarn.YarnClusterDescriptor > - Found application JobManager host name 'ourserver' and port '36901' from > supplied application id 'application_1488884688139_2648' > 2017-03-28 12:21:32,732 INFO org.apache.flink.runtime.util.ZooKeeperUtils > - Using 'flink/yarn_session' as zookeeper namespace. > 2017-03-28 12:21:32,831 INFO > org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl > - Starting > 2017-03-28 12:21:32,832 DEBUG > org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient - > Starting > 2017-03-28 12:21:32,832 DEBUG > org.apache.flink.shaded.org.apache.curator.ConnectionState - Starting > 2017-03-28 12:21:32,833 DEBUG > org.apache.flink.shaded.org.apache.curator.ConnectionState - reset > 2017-03-28 12:21:32,874 INFO > org.apache.flink.shaded.org.apache.curator.framework.state.ConnectionStateManager > - State change: CONNECTED > 2017-03-28 12:21:33,891 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Starting ZooKeeperLeaderRetrievalService. 
> 2017-03-28 12:21:33,906 DEBUG > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Leader node has changed. > 2017-03-28 12:21:33,912 DEBUG > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > New leader information: > Leader=akka.tcp://flink@ourserver_ip:36901/user/jobmanager, session > ID=a3c337e5-1749-4c42-9949-0203bbae58d5. > 2017-03-28 12:21:33,914 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Stopping ZooKeeperLeaderRetrievalService. > 2017-03-28 12:21:33,914 DEBUG > org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl > - Closing > 2017-03-28 12:21:33,915 DEBUG > org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient - Closing > 2017-03-28 12:21:33,915 DEBUG > org.apache.flink.shaded.org.apache.curator.ConnectionState - Closing > 2017-03-28 12:21:33,920 INFO org.apache.flink.client.CliFrontend > - Using address /ourserver_ip:36901 to connect to JobManager. > 2017-03-28 12:21:33,926 INFO org.apache.flink.yarn.YarnClusterClient > - Starting client actor system. > 2017-03-28 12:21:33,928 DEBUG org.apache.flink.runtime.net.ConnectionUtils > - Trying to connect to (ourserver/ourserver_ip:36901) from local address > ourserver/ourserver_ip with timeout 200 > 2017-03-28 12:21:33,931 DEBUG org.apache.flink.runtime.net.ConnectionUtils > - Using InetAddress.getLocalHost() immediately for the connecting address > 2017-03-28 12:21:34,673 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Starting ZooKeeperLeaderRetrievalService. > 2017-03-28 12:21:34,677 DEBUG > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Leader node has changed. > 2017-03-28 12:21:34,677 DEBUG > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > New leader information: > Leader=akka.tcp://flink@ourserver_ip:36901/user/jobmanager, session > ID=a3c337e5-1749-4c42-9949-0203bbae58d5. > 2017-03-28 12:21:34,823 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Stopping ZooKeeperLeaderRetrievalService. > 2017-03-28 12:21:34,826 INFO org.apache.flink.client.CliFrontend > - Triggering savepoint for job 7e865198e220bea8a2203ebdb0827b6f. > 2017-03-28 12:21:34,828 INFO org.apache.flink.client.CliFrontend > - Waiting for response... > 2017-03-28 12:21:34,993 INFO org.apache.flink.client.CliFrontend > - Savepoint completed. Path: > hdfs:/our_user/flink/savepoints/savepoint-77214a0f9902 > 2017-03-28 12:21:34,994 INFO org.apache.flink.client.CliFrontend > - You can resume your program from this savepoint with the run command. 
> 2017-03-28 12:21:34,994 INFO org.apache.flink.yarn.YarnClusterClient > - Shutting down YarnClusterClient from the client shutdown hook > 2017-03-28 12:21:34,994 INFO org.apache.flink.yarn.YarnClusterClient > - Disconnecting YarnClusterClient from ApplicationMaster > > > [2] > > 2017-03-28 12:19:58,063 INFO org.apache.flink.client.CliFrontend > - > -------------------------------------------------------------------------------- > 2017-03-28 12:19:58,064 INFO org.apache.flink.client.CliFrontend > - Starting Command Line Client (Version: 1.1.3, Rev:a56d810, > Date:10.11.2016 @ 13:25:34 CET) > 2017-03-28 12:19:58,064 INFO org.apache.flink.client.CliFrontend > - Current user: our_user > 2017-03-28 12:19:58,064 INFO org.apache.flink.client.CliFrontend > - JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - > 1.7/24.51-b03 > 2017-03-28 12:19:58,065 INFO org.apache.flink.client.CliFrontend > - Maximum heap size: 1749 MiBytes > 2017-03-28 12:19:58,065 INFO org.apache.flink.client.CliFrontend > - JAVA_HOME: /usr/java/default > 2017-03-28 12:19:58,067 INFO org.apache.flink.client.CliFrontend > - Hadoop version: 2.3.0 > 2017-03-28 12:19:58,067 INFO org.apache.flink.client.CliFrontend > - JVM Options: > 2017-03-28 12:19:58,068 INFO org.apache.flink.client.CliFrontend > - > -Dlog.file=/path/to/our/lib/flink-1.1.3/log/flink-our_user-client-ourserver.log > 2017-03-28 12:19:58,068 INFO org.apache.flink.client.CliFrontend > - > -Dlog4j.configuration=file:/path/to/our/lib/flink-1.1.3/conf/log4j-cli.properties > 2017-03-28 12:19:58,068 INFO org.apache.flink.client.CliFrontend > - > -Dlogback.configurationFile=file:/path/to/our/lib/flink-1.1.3/conf/logback.xml > 2017-03-28 12:19:58,068 INFO org.apache.flink.client.CliFrontend > - Program Arguments: > 2017-03-28 12:19:58,069 INFO org.apache.flink.client.CliFrontend > - savepoint > 2017-03-28 12:19:58,069 INFO org.apache.flink.client.CliFrontend > - -d > 2017-03-28 12:19:58,069 INFO org.apache.flink.client.CliFrontend > - hdfs:/our_user/flink/savepoints/savepoint-d16441420a87 > 2017-03-28 12:19:58,069 INFO org.apache.flink.client.CliFrontend > - -j > 2017-03-28 12:19:58,069 INFO org.apache.flink.client.CliFrontend > - /path/to/our/lib/our_program/lib/our_program-6.2.6-SNAPSHOT-all.jar > 2017-03-28 12:19:58,069 INFO org.apache.flink.client.CliFrontend > - Classpath: > /path/to/our/lib/flink-1.1.3/lib/flink-dist_2.10-1.1.3.1.jar:/path/to/our/lib/flink-1.1.3/lib/flink-python_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/flink-reporter-1.0.2-20161206.140111-118.jar:/path/to/our/lib/flink-1.1.3/lib/flink-table_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/log4j-1.2.17.jar:/path/to/our/lib/flink-1.1.3/lib/ojdbc6-11.2.0.3.jar:/path/to/our/lib/flink-1.1.3/lib/slf4j-log4j12-1.7.7.jar::/etc/hadoop/conf: > 2017-03-28 12:19:58,070 INFO org.apache.flink.client.CliFrontend > - > -------------------------------------------------------------------------------- > 2017-03-28 12:19:58,070 INFO org.apache.flink.client.CliFrontend > - Using configuration directory /path/to/our/lib/flink-1.1.3/conf > 2017-03-28 12:19:58,070 INFO org.apache.flink.client.CliFrontend > - Trying to load configuration file > 2017-03-28 12:19:58,085 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: env.java.opts, > -Djavax.net.ssl.trustStore=/path/to/our/cacerts > -XX:HeapDumpPath=/path/to/our/hadoop/yarn/log > -XX:+HeapDumpOnOutOfMemoryError -XX:MaxPermSize=192m > 2017-03-28 12:19:58,085 DEBUG > org.apache.flink.configuration.GlobalConfiguration 
- Loading > configuration property: jobmanager.rpc.address, localhost > 2017-03-28 12:19:58,086 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.rpc.port, 6123 > 2017-03-28 12:19:58,086 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.heap.mb, 256 > 2017-03-28 12:19:58,086 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.heap.mb, 512 > 2017-03-28 12:19:58,086 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.numberOfTaskSlots, 4 > 2017-03-28 12:19:58,086 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.memory.preallocate, false > 2017-03-28 12:19:58,086 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: parallelism.default, 1 > 2017-03-28 12:19:58,087 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.web.port, 8081 > 2017-03-28 12:19:58,087 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.web.submit.enable, false > 2017-03-28 12:19:58,087 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: state.backend, filesystem > 2017-03-28 12:19:58,087 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: state.backend.fs.checkpointdir, > hdfs://ourserver:8020/our_user/flink/state > 2017-03-28 12:19:58,087 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.network.numberOfBuffers, 4096 > 2017-03-28 12:19:58,087 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf/ > 2017-03-28 12:19:58,088 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.mode, zookeeper > 2017-03-28 12:19:58,088 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.quorum, > ourserver:2181,ourserver2:2181,ourserver3:2181 > 2017-03-28 12:19:58,088 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.storageDir, > hdfs:///our_user/flink/recovery > 2017-03-28 12:19:58,088 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.path.root, flink > 2017-03-28 12:19:58,088 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.path.namespace, yarn_session > 2017-03-28 12:19:58,088 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: reocvery.zookeeper.client.connection-timeout, 30000 > 2017-03-28 12:19:58,089 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.session-timeout, 120000 > 2017-03-28 12:19:58,089 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.retry-wait, 5000 > 2017-03-28 12:19:58,089 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.max-retry-attempts, 5 > 2017-03-28 12:19:58,089 DEBUG > org.apache.flink.configuration.GlobalConfiguration - 
Loading > configuration property: yarn.application-attempts, 10 > 2017-03-28 12:19:58,089 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: yarn.maximum-failed-containers, 80 > 2017-03-28 12:19:58,090 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.watch.heartbeat.interval, 50s > 2017-03-28 12:19:58,090 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.log.lifecycle.events, true > 2017-03-28 12:19:58,090 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.ask.timeout, 20s > 2017-03-28 12:19:58,090 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: savepoints.state.backend, filesystem > 2017-03-28 12:19:58,090 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: savepoints.state.backend.fs.dir, > hdfs:///our_user/flink/savepoints > 2017-03-28 12:19:58,367 INFO org.apache.flink.client.CliFrontend > - Running 'savepoint' command. > 2017-03-28 12:19:58,372 INFO org.apache.flink.client.CliFrontend > - Retrieving JobManager. > 2017-03-28 12:19:58,373 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Found YARN properties file /tmp/.yarn-properties-our_user > 2017-03-28 12:19:58,484 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Using Yarn application id from YARN properties > application_1488884688139_2648 > 2017-03-28 12:19:58,485 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - YARN properties set default parallelism to 12 > 2017-03-28 12:19:58,485 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Found YARN properties file /tmp/.yarn-properties-our_user > 2017-03-28 12:19:58,485 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Using Yarn application id from YARN properties > application_1488884688139_2648 > 2017-03-28 12:19:58,485 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - YARN properties set default parallelism to 12 > 2017-03-28 12:19:58,604 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: env.java.opts, > -Djavax.net.ssl.trustStore=/path/to/our/cacerts > -XX:HeapDumpPath=/path/to/our/hadoop/yarn/log > -XX:+HeapDumpOnOutOfMemoryError -XX:MaxPermSize=192m > 2017-03-28 12:19:58,604 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.rpc.address, localhost > 2017-03-28 12:19:58,604 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.rpc.port, 6123 > 2017-03-28 12:19:58,604 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.heap.mb, 256 > 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.heap.mb, 512 > 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.numberOfTaskSlots, 4 > 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.memory.preallocate, false > 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: parallelism.default, 1 > 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.web.port, 8081 
> 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: jobmanager.web.submit.enable, false > 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: state.backend, filesystem > 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: state.backend.fs.checkpointdir, > hdfs://ourserver:8020/our_user/flink/state > 2017-03-28 12:19:58,605 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: taskmanager.network.numberOfBuffers, 4096 > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf/ > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.mode, zookeeper > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.quorum, > ourserver:2181,ourserver:2181,ourserver:2181 > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.storageDir, > hdfs:///our_user/flink/recovery > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.path.root, flink > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.path.namespace, yarn_session > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: reocvery.zookeeper.client.connection-timeout, 30000 > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.session-timeout, 120000 > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.retry-wait, 5000 > 2017-03-28 12:19:58,606 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: recovery.zookeeper.client.max-retry-attempts, 5 > 2017-03-28 12:19:58,607 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: yarn.application-attempts, 10 > 2017-03-28 12:19:58,607 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: yarn.maximum-failed-containers, 80 > 2017-03-28 12:19:58,607 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.watch.heartbeat.interval, 50s > 2017-03-28 12:19:58,607 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.log.lifecycle.events, true > 2017-03-28 12:19:58,607 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: akka.ask.timeout, 20s > 2017-03-28 12:19:58,608 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: savepoints.state.backend, filesystem > 2017-03-28 12:19:58,608 DEBUG > org.apache.flink.configuration.GlobalConfiguration - Loading > configuration property: savepoints.state.backend.fs.dir, > hdfs:///our_user/flink/savepoints > 2017-03-28 12:19:58,685 INFO org.apache.hadoop.yarn.client.RMProxy > - Connecting to 
ResourceManager at ourserver/ourserver_ip:8050 > 2017-03-28 12:19:58,969 INFO org.apache.flink.yarn.YarnClusterDescriptor > - Found application JobManager host name 'ourserver' and port '36901' from > supplied application id 'application_1488884688139_2648' > 2017-03-28 12:19:58,989 INFO org.apache.flink.runtime.util.ZooKeeperUtils > - Using 'flink/yarn_session' as zookeeper namespace. > 2017-03-28 12:19:59,114 INFO > org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl > - Starting > 2017-03-28 12:19:59,115 DEBUG > org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient - > Starting > 2017-03-28 12:19:59,115 DEBUG > org.apache.flink.shaded.org.apache.curator.ConnectionState - Starting > 2017-03-28 12:19:59,115 DEBUG > org.apache.flink.shaded.org.apache.curator.ConnectionState - reset > 2017-03-28 12:19:59,172 INFO > org.apache.flink.shaded.org.apache.curator.framework.state.ConnectionStateManager > - State change: CONNECTED > 2017-03-28 12:20:00,212 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Starting ZooKeeperLeaderRetrievalService. > 2017-03-28 12:20:00,229 DEBUG > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Leader node has changed. > 2017-03-28 12:20:00,235 DEBUG > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > New leader information: > Leader=akka.tcp://flink@ourserver_ip:36901/user/jobmanager, session > ID=a3c337e5-1749-4c42-9949-0203bbae58d5. > 2017-03-28 12:20:00,238 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Stopping ZooKeeperLeaderRetrievalService. > 2017-03-28 12:20:00,238 DEBUG > org.apache.flink.shaded.org.apache.curator.framework.imps.CuratorFrameworkImpl > - Closing > 2017-03-28 12:20:00,238 DEBUG > org.apache.flink.shaded.org.apache.curator.CuratorZookeeperClient - Closing > 2017-03-28 12:20:00,239 DEBUG > org.apache.flink.shaded.org.apache.curator.ConnectionState - Closing > 2017-03-28 12:20:00,245 INFO org.apache.flink.client.CliFrontend > - Using address /ourserver_ip:36901 to connect to JobManager. > 2017-03-28 12:20:00,245 INFO org.apache.flink.runtime.util.ZooKeeperUtils > - Using 'flink/yarn_session' as zookeeper namespace. > 2017-03-28 12:20:00,252 INFO org.apache.flink.yarn.YarnClusterClient > - Starting client actor system. > 2017-03-28 12:20:00,254 DEBUG org.apache.flink.runtime.net.ConnectionUtils > - Trying to connect to (ourserver/ourserver_ip:36901) from local address > ourserver/ourserver_ip with timeout 200 > 2017-03-28 12:20:00,259 DEBUG org.apache.flink.runtime.net.ConnectionUtils > - Using InetAddress.getLocalHost() immediately for the connecting address > 2017-03-28 12:20:01,209 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Starting ZooKeeperLeaderRetrievalService. > 2017-03-28 12:20:01,213 DEBUG > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Leader node has changed. > 2017-03-28 12:20:01,213 DEBUG > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > New leader information: > Leader=akka.tcp://flink@ourserver_ip:36901/user/jobmanager, session > ID=a3c337e5-1749-4c42-9949-0203bbae58d5. > 2017-03-28 12:20:01,442 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Stopping ZooKeeperLeaderRetrievalService. 
> 2017-03-28 12:20:01,446 INFO org.apache.flink.client.CliFrontend > - Disposing savepoint > 'hdfs:/our_user/flink/savepoints/savepoint-d16441420a87' with JAR > /path/to/our/lib/our_program/lib/our_program-6.2.6-SNAPSHOT-all.jar. > 2017-03-28 12:20:01,590 INFO org.apache.flink.client.CliFrontend > - Waiting for response... > 2017-03-28 12:20:01,636 ERROR org.apache.flink.client.CliFrontend > - Error while running the command. > java.io.IOException: Failed to dispose savepoint > hdfs:/our_user/flink/savepoints/savepoint-d16441420a87. > at > org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.disposeSavepoint(FsSavepointStore.java:163) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:745) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:727) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:727) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.lang.ClassNotFoundException: > our.company.eventdata.EventDataRecord > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:425) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:358) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:270) > at > org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:65) > at > java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612) > at > java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517) > at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1483) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1333) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500) > at > org.apache.flink.api.java.typeutils.runtime.PojoSerializer.readObject(PojoSerializer.java:131) > at sun.reflect.GeneratedMethodAccessor51.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) > at > 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500) > at > org.apache.flink.api.common.state.StateDescriptor.readObject(StateDescriptor.java:268) > at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) > at java.util.HashMap.readObject(HashMap.java:1184) > at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) > at > org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:291) > at > org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58) > at > org.apache.flink.runtime.checkpoint.SubtaskState.discard(SubtaskState.java:85) > at > org.apache.flink.runtime.checkpoint.TaskState.discard(TaskState.java:147) > at > org.apache.flink.runtime.checkpoint.savepoint.SavepointV0.dispose(SavepointV0.java:66) > at > org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.disposeSavepoint(FsSavepointStore.java:151) > ... 
12 more > 2017-03-28 12:20:01,652 INFO org.apache.flink.yarn.YarnClusterClient > - Shutting down YarnClusterClient from the client shutdown hook > 2017-03-28 12:20:01,653 INFO org.apache.flink.yarn.YarnClusterClient > - Disconnecting YarnClusterClient from ApplicationMaster > > > [3] > > 2017-03-28 10:43:57,361 INFO org.apache.flink.client.CliFrontend > - > -------------------------------------------------------------------------------- > 2017-03-28 10:43:57,362 INFO org.apache.flink.client.CliFrontend > - Starting Command Line Client (Version: 1.1.3, Rev:a56d810, > Date:10.11.2016 @ 13:25:34 CET) > 2017-03-28 10:43:57,362 INFO org.apache.flink.client.CliFrontend > - Current user: our_user > 2017-03-28 10:43:57,362 INFO org.apache.flink.client.CliFrontend > - JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - > 1.7/24.51-b03 > 2017-03-28 10:43:57,362 INFO org.apache.flink.client.CliFrontend > - Maximum heap size: 1749 MiBytes > 2017-03-28 10:43:57,363 INFO org.apache.flink.client.CliFrontend > - JAVA_HOME: /usr/java/default > 2017-03-28 10:43:57,365 INFO org.apache.flink.client.CliFrontend > - Hadoop version: 2.3.0 > 2017-03-28 10:43:57,365 INFO org.apache.flink.client.CliFrontend > - JVM Options: > 2017-03-28 10:43:57,365 INFO org.apache.flink.client.CliFrontend > - > -Dlog.file=/path/to/our/lib/flink-1.1.3/log/flink-our_user-client-ourserver.log > 2017-03-28 10:43:57,365 INFO org.apache.flink.client.CliFrontend > - > -Dlog4j.configuration=file:/path/to/our/lib/flink-1.1.3/conf/log4j-cli.properties > 2017-03-28 10:43:57,365 INFO org.apache.flink.client.CliFrontend > - > -Dlogback.configurationFile=file:/path/to/our/lib/flink-1.1.3/conf/logback.xml > 2017-03-28 10:43:57,365 INFO org.apache.flink.client.CliFrontend > - Program Arguments: > 2017-03-28 10:43:57,366 INFO org.apache.flink.client.CliFrontend > - run > 2017-03-28 10:43:57,366 INFO org.apache.flink.client.CliFrontend > - -p > 2017-03-28 10:43:57,366 INFO org.apache.flink.client.CliFrontend > - 5 > 2017-03-28 10:43:57,366 INFO org.apache.flink.client.CliFrontend > - -c > 2017-03-28 10:43:57,366 INFO org.apache.flink.client.CliFrontend > - our.company.package.OurProgramClass > 2017-03-28 10:43:57,366 INFO org.apache.flink.client.CliFrontend > - /path/to/our/lib/our_program/lib/our_program.jar > 2017-03-28 10:43:57,366 INFO org.apache.flink.client.CliFrontend > - /path/to/our/lib/our_program/conf/our_program.properties > 2017-03-28 10:43:57,367 INFO org.apache.flink.client.CliFrontend > - Classpath: > /path/to/our/lib/flink-1.1.3/lib/flink-dist_2.10-1.1.3.1.jar:/path/to/our/lib/flink-1.1.3/lib/flink-python_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/flink-reporter-1.0.2-20161206.140111-118.jar:/path/to/our/lib/flink-1.1.3/lib/flink-table_2.10-1.1.3.jar:/path/to/our/lib/flink-1.1.3/lib/log4j-1.2.17.jar:/path/to/our/lib/flink-1.1.3/lib/ojdbc6-11.2.0.3.jar:/path/to/our/lib/flink-1.1.3/lib/slf4j-log4j12-1.7.7.jar::/etc/hadoop/conf: > 2017-03-28 10:43:57,367 INFO org.apache.flink.client.CliFrontend > - > -------------------------------------------------------------------------------- > 2017-03-28 10:43:57,367 INFO org.apache.flink.client.CliFrontend > - Using configuration directory /path/to/our/lib/flink-1.1.3/conf > 2017-03-28 10:43:57,367 INFO org.apache.flink.client.CliFrontend > - Trying to load configuration file > 2017-03-28 10:43:57,664 INFO org.apache.flink.client.CliFrontend > - Running 'run' command. 
> 2017-03-28 10:43:57,671 INFO org.apache.flink.client.CliFrontend > - Building program from JAR file > 2017-03-28 10:43:57,827 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Found YARN properties file /tmp/.yarn-properties-our_user > 2017-03-28 10:43:57,921 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Using Yarn application id from YARN properties > application_1488884688139_2648 > 2017-03-28 10:43:57,921 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - YARN properties set default parallelism to 12 > 2017-03-28 10:43:57,921 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Found YARN properties file /tmp/.yarn-properties-our_user > 2017-03-28 10:43:57,922 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - Using Yarn application id from YARN properties > application_1488884688139_2648 > 2017-03-28 10:43:57,922 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli > - YARN properties set default parallelism to 12 > 2017-03-28 10:43:58,046 INFO org.apache.hadoop.yarn.client.RMProxy > - Connecting to ResourceManager at ourserver/ourserver_ip:8050 > 2017-03-28 10:43:58,237 INFO org.apache.flink.yarn.YarnClusterDescriptor > - Found application JobManager host name 'ourserver' and port '36901' from > supplied application id 'application_1488884688139_2648' > 2017-03-28 10:43:58,246 INFO org.apache.flink.client.CliFrontend > - Cluster configuration: Yarn cluster with application id > application_1488884688139_2648 > 2017-03-28 10:43:59,439 INFO org.apache.flink.client.CliFrontend > - Using address ourserver_ip:36901 to connect to JobManager. > 2017-03-28 10:43:59,439 INFO org.apache.flink.client.CliFrontend > - JobManager web interface address > http://ourserver:8088/proxy/application_1488884688139_2648/ > 2017-03-28 10:43:59,439 DEBUG org.apache.flink.client.CliFrontend > - Client slots is set to -1 > 2017-03-28 10:43:59,440 DEBUG org.apache.flink.client.CliFrontend > - Savepoint path is set to null > 2017-03-28 10:43:59,440 DEBUG org.apache.flink.client.CliFrontend > - User parallelism is set to 5 > 2017-03-28 10:43:59,440 INFO org.apache.flink.client.CliFrontend > - Starting execution of program > 2017-03-28 10:43:59,440 INFO org.apache.flink.yarn.YarnClusterClient > - Starting program in interactive mode > 2017-03-28 10:44:00,593 WARN org.apache.hadoop.hdfs.BlockReaderLocal > - The short-circuit local reads feature cannot be used because libhadoop > cannot be loaded. > 2017-03-28 10:44:01,672 INFO org.apache.flink.yarn.YarnClusterClient > - Waiting until all TaskManagers have connected > 2017-03-28 10:44:02,702 INFO org.apache.flink.yarn.YarnClusterClient > - Starting client actor system. > 2017-03-28 10:44:03,717 INFO org.apache.flink.yarn.YarnClusterClient > - TaskManager status (3/1) > 2017-03-28 10:44:03,720 INFO org.apache.flink.yarn.YarnClusterClient > - All TaskManagers are connected > 2017-03-28 10:44:04,736 INFO org.apache.flink.yarn.YarnClusterClient > - Submitting job with JobID: d33a2835c9c25881a0765c250bbceb7e. Waiting for > job completion. > Connected to JobManager at > Actor[akka.tcp://flink@ourserver_ip:36901/user/jobmanager#-429328340] > 03/28/2017 10:44:06 Job execution switched to status RUNNING. > > > On 27.03.2017 15:24, Ufuk Celebi wrote: > > What kind of state backend where you using for the checkpoints? > > If there is a bug that prevents us from deleting the savepoint files > automatically, we can do a manual workaround and delete the > checkpoints files manually. 
With Flink 1.3 this becomes very straight > forward as savepoint data all go to a self contained directory that > can be deleted manually. > > On Mon, Mar 27, 2017 at 12:46 PM, Stefan Richter > <s.rich...@data-artisans.com> wrote: > > Hi, > > could you provide us with the log from the job client, with logging on debug > level for package org.apache.flink.client? Also, did you check if this > problem also exists in the latest bugfix release for your version (1.1.5) ? > > Best, > Stefan > > > Am 27.03.2017 um 11:41 schrieb Konstantin Gregor > <konstantin.gre...@tngtech.com>: > > Hey everyone, > > we are experiencing an issue in the disposal of savepoints in > Flink-1.1.3. We have a streaming job that has custom state (user objects > are part of the state). We create a savepoint: > > $ flink savepoint <JOBID> > [...] > Savepoint completed. Path: > hdfs:/bigdata/flink/savepoints/savepoint-20f064fb9f50 > [...] > > Then we want to simply dispose of that savepoint where we also provide > the jar to the job from which the savepoint was made: > $ flink savepoint -d > hdfs:/bigdata/flink/savepoints/savepoint-20f064fb9f50 -j > /path/to/jar/application.jar > > This gives us a ClassNotFoundException of our custom objects [1]. > > Adding our jar to the flink/lib directory is not an option for us, > things will break because of this. > Does anyone have an idea on how to proceed here? > > Thanks and best regards, > > Konstantin > > [1] > java.io.IOException: Failed to dispose savepoint > hdfs:///bigdata/flink/savepoints/savepoint-20f064fb9f50. > at > org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.disposeSavepoint(FsSavepointStore.java:163) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply$mcV$sp(JobManager.scala:745) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:727) > at > org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1$$anonfun$applyOrElse$7.apply(JobManager.scala:727) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) > at > scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) > at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) > at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:401) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > Caused by: java.lang.ClassNotFoundException: > our.company.application.eventdata.EventDataRecord > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > at java.security.AccessController.doPrivileged(Native Method) > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > at java.lang.ClassLoader.loadClass(ClassLoader.java:425) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > at java.lang.ClassLoader.loadClass(ClassLoader.java:358) > at java.lang.Class.forName0(Native Method) > at java.lang.Class.forName(Class.java:270) > at > 
org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:65) > at > java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612) > at > java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517) > at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1483) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1333) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500) > at > org.apache.flink.api.java.typeutils.runtime.PojoSerializer.readObject(PojoSerializer.java:131) > at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500) > at > org.apache.flink.api.common.state.StateDescriptor.readObject(StateDescriptor.java:268) > at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) > at java.util.HashMap.readObject(HashMap.java:1184) > at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) > at > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344) > at > java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) > at > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) > at > 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) > at > java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) > at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) > at > org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:291) > at > org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58) > at > org.apache.flink.runtime.checkpoint.SubtaskState.discard(SubtaskState.java:85) > at > org.apache.flink.runtime.checkpoint.TaskState.discard(TaskState.java:147) > at > org.apache.flink.runtime.checkpoint.savepoint.SavepointV0.dispose(SavepointV0.java:66) > at > org.apache.flink.runtime.checkpoint.savepoint.FsSavepointStore.disposeSavepoint(FsSavepointStore.java:151) > > > -- > Konstantin Gregor * konstantin.gre...@tngtech.com > TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring > Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke > Sitz: Unterföhring * Amtsgericht München * HRB 135082 > > > > -- > Konstantin Gregor * konstantin.gre...@tngtech.com > TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring > Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke > Sitz: Unterföhring * Amtsgericht München * HRB 135082
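For reference, the JAR check and the manual cleanup discussed above look roughly like the following. This is only a sketch: the JAR and savepoint paths are the ones appearing in the logs above, the `hdfs dfs -rm` call assumes the local HDFS client points at the same cluster, and in Flink 1.1 the top-level savepoint file may reference further state files under the checkpoint directory that a plain delete does not clean up (which is what the path-printing tool Ufuk offers would help with).

# 1) Verify the state class is really inside the fat JAR
$ jar -tf /path/to/our/lib/our_program/lib/our_program-6.2.6-SNAPSHOT-all.jar | grep EventDataRecord

# 2) Last-resort manual workaround: remove the savepoint file from HDFS directly
#    (does NOT remove state files referenced from inside the savepoint)
$ hdfs dfs -rm hdfs:///our_user/flink/savepoints/savepoint-d16441420a87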