As a workaround, we commented out state.backend.rocksdb.localdir since it defaults to the taskmanager.tmp.dirs location.
Now, we are having only these state backend configs in our flink-conf.yaml: state.backend: rocksdb state.checkpoints.dir: file:///home/demo/checkpoints/ext_checkpoints state.savepoints.dir: file:///home/demo/checkpoints/checkpoints/savepoints Checkpointing and savepointing works with the above configs. However, I wasn't able to find the rocksdb directory which was supposed to be insid /tmp directory. I was able to find these directories inside /tmp in taskmanager : drwxr-xr-x. 2 flink flink 4096 Jul 11 05:23 blobStore-122d93f5-35c9-4d8a-9632-e0e65f766825 drwxr-xr-x. 14 flink flink 4096 Jul 11 07:23 blobStore-c7d3433b-8e6d-4195-a431-d9392b638b5f -rw-r--r--. 1 flink flink 4 Jul 11 05:23 flink--taskexecutor.pid drwxr-xr-x. 2 flink flink 4096 Jul 11 05:23 flink-dist-cache-08e706f9-a388-4a6d-8774-849207746783 drwxr-xr-x. 2 flink flink 4096 Jul 11 05:23 flink-io-287519fb-4cc8-4396-9072-584e3fae0dcc drwxr-xr-x. 2 flink flink 4096 Jul 11 05:23 hsperfdata_flink drwxr-xr-x. 2 root root 4096 May 16 12:54 hsperfdata_root -rw-r--r--. 1 flink flink 1179 Jul 11 05:23 jaas-4321856842187934442.conf drwxr-xr-x. 2 flink flink 4096 Jul 11 07:17 localState No sign of any rocksdb directory. Or is it not being used at all? On Tue, Jul 10, 2018 at 12:45 PM, Sampath Bhat <sam414255p...@gmail.com> wrote: > Chesnay - Why is the absolute file check required in the > RocksDBStateBackend.setDbStoragePaths(String ... paths). I think this is > causing the issue. Its not related to GlusterFS or file system. The same > problem can be reproduced with the following configuration on local > machine. The flink application should support checkpointing. We get the > IllegealArgumentexecption (Relative File paths not allowed) > > state.backend: rocksdb > state.checkpoints.dir: file:///home/demo/checkpoints/ext_checkpoints > state.savepoints.dir: file:///home/demo/checkpoints/checkpoints/savepoints > state.backend.fs.checkpointdir: file:///home/demo/checkpoints/ > checkpoints/fs_state > #state.backend.rocksdb.checkpointdir: file:///home/demo/checkpoints/ > checkpoints/rocksdb_state > state.backend.rocksdb.localdir: /home/demo/checkpoints/ > checkpoints/rocksdb_state > > Any insights would be helpful. > > On Wed, Jul 4, 2018 at 2:27 PM, Chesnay Schepler <ches...@apache.org> > wrote: > >> Reference: https://issues.apache.org/jira/browse/FLINK-9739 >> >> >> On 04.07.2018 10:46, Chesnay Schepler wrote: >> >> It's not really path-parsing logic, but path handling i suppose; see >> RocksDBStateBackend#setDbStoragePaths(). >> >> I went ahead and converted said method into a simple test method, maybe >> this is enough to debug the issue. >> >> I assume this regression was caused by FLINK-6557, which refactored the >> state backend to rely on java Files instead of Flink paths. >> I'll open a JIRA to document it. >> >> The deprecation notice is not a problem. >> >> public static void testPaths(String... paths) { >> if (paths.length == 0) { >> throw new IllegalArgumentException("empty paths"); } >> else { >> File[] pp = new File[paths.length]; for (int i = 0; i < >> paths.length; i++) { >> final String rawPath = paths[i]; final String path; >> if (rawPath == null) { >> throw new IllegalArgumentException("null path"); } >> else { >> // we need this for backwards compatibility, to allow URIs like >> 'file:///'... URI uri = null; try { >> uri = new Path(rawPath).toUri(); } >> catch (Exception e) { >> // cannot parse as a path } >> >> if (uri != null && uri.getScheme() != null) { >> if ("file".equalsIgnoreCase(uri.getScheme())) { >> path = uri.getPath(); } >> else { >> throw new IllegalArgumentException("Path " + rawPath + " >> has a non-local scheme"); } >> } >> else { >> path = rawPath; } >> } >> >> pp[i] = new File(path); if (!pp[i].isAbsolute()) { // my >> suspicion is that this categorically fails for GlusterFS paths >> throw new IllegalArgumentException("Relative paths are not >> supported"); } >> } >> } >> } >> >> >> >> On 03.07.2018 16:35, Jash, Shaswata (Nokia - IN/Bangalore) wrote: >> >> Hello Chesnay, >> >> >> >> Cluster (in kubernetes)-wide checkpointing directory using glusterfs >> volume mount (thus file access protocol file:///) was working fine till >> 1.4.2 for us. So we like to understand where the breakage happened in >> 1.5.0. >> >> Can you please mention me the relevant source code files related to >> rocksdb “custom file path” parsing logic? We would be interested to >> investigate this. >> >> >> >> I also observed below in the log – >> >> >> >> Config uses deprecated configuration key >> 'state.backend.rocksdb.checkpointdir' instead of proper key >> 'state.backend.rocksdb.localdir' >> >> Regards, >> >> Shaswata >> >> >> >> *From:* Chesnay Schepler [mailto:ches...@apache.org <ches...@apache.org>] >> >> *Sent:* Tuesday, July 03, 2018 5:52 PM >> *To:* Data Engineer <dataenginee...@gmail.com> <dataenginee...@gmail.com> >> *Cc:* user@flink.apache.org >> *Subject:* Re: Checkpointing in Flink 1.5.0 >> >> >> >> The code appears to be working fine. >> >> This may happen because you're using a GlusterFS volume. >> The RocksDBStateBackend uses java Files internally (NOT nio Paths), which >> AFAIK only work properly against the plain local file-system. >> >> The GlusterFS nio FIleSystem implementation also explicitly does not >> support conversions to File >> <https://github.com/gluster/glusterfs-java-filesystem/blob/master/glusterfs-java-filesystem/src/main/java/com/peircean/glusterfs/GlusterPath.java#L271> >> . >> >> On 03.07.2018 13:53, Chesnay Schepler wrote: >> >> Thanks. Looks like RocksDBStateBackend.setDbStoragePaths has some custom >> file path parsing logic, will probe it a bit to see what the issue is. >> >> On 03.07.2018 13:45, Data Engineer wrote: >> >> 2018-07-03 11:30:35,703 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - >> -------------------------------------------------------------------------------- >> >> 2018-07-03 11:30:35,705 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting >> StandaloneSessionClusterEntrypoint (Version: <unknown>, Rev:c61b108, >> Date:24.05.2018 @ 16:54:44 CEST) >> >> 2018-07-03 11:30:35,705 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - OS current >> user: flink >> >> 2018-07-03 11:30:35,705 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Current >> Hadoop/Kerberos user: <no hadoop dependency found> >> >> 2018-07-03 11:30:35,706 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM: >> OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.171-b10 >> >> 2018-07-03 11:30:35,706 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Maximum >> heap size: 981 MiBytes >> >> 2018-07-03 11:30:35,706 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JAVA_HOME: >> /etc/alternatives/jre_openjdk/ >> >> 2018-07-03 11:30:35,707 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - No Hadoop >> Dependency available >> >> 2018-07-03 11:30:35,707 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM Options: >> >> 2018-07-03 11:30:35,707 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xms1024m >> >> 2018-07-03 11:30:35,707 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xmx1024m >> >> 2018-07-03 11:30:35,708 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - >> -Dlog.file=/opt/flink-1.5.0/log/flink--standalonesession-0-myfl-flink-jobmanager-7b4d8c4dd4-bv6zf.log >> >> 2018-07-03 11:30:35,708 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - >> -Dlog4j.configuration=file:/opt/flink-1.5.0/conf/log4j.properties >> >> 2018-07-03 11:30:35,708 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - >> -Dlogback.configurationFile=file:/opt/flink-1.5.0/conf/logback.xml >> >> 2018-07-03 11:30:35,708 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Program >> Arguments: >> >> 2018-07-03 11:30:35,709 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - >> --configDir >> >> 2018-07-03 11:30:35,709 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - >> /opt/flink-1.5.0/conf >> >> 2018-07-03 11:30:35,709 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - >> --executionMode >> >> 2018-07-03 11:30:35,709 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - cluster >> >> 2018-07-03 11:30:35,710 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --host >> >> 2018-07-03 11:30:35,710 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - cluster >> >> 2018-07-03 11:30:35,710 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Classpath: >> /opt/flink-1.5.0/lib/flink-cep_2.11-1.5.0.jar:/opt/flink-1.5.0/lib/flink-connectors-1.5.0.jar:/opt/flink-1.5.0/lib/flink-gelly_2.11-1.5.0.jar:/opt/flink-1.5.0/lib/flink-ml_2.11-1.5.0.jar:/opt/flink-1.5.0/lib/flink-python_2.11-1.5.0.jar:/opt/flink-1.5.0/lib/flink-table_2.11-1.5.0.jar:/opt/flink-1.5.0/lib/log4j-1.2.17.jar:/opt/flink-1.5.0/lib/slf4j-log4j12-1.7.7.jar:/opt/flink-1.5.0/lib/flink-dist_2.11-1.5.0.jar::: >> >> 2018-07-03 11:30:35,710 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - >> -------------------------------------------------------------------------------- >> >> 2018-07-03 11:30:35,712 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Registered >> UNIX signal handlers for [TERM, HUP, INT] >> >> 2018-07-03 11:30:35,720 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: blob.server.port, 4124 >> >> 2018-07-03 11:30:35,720 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: jobmanager.rpc.address, myfl-flink-jobmanager >> >> 2018-07-03 11:30:35,720 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: jobmanager.rpc.port, 4123 >> >> 2018-07-03 11:30:35,721 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: jobmanager.heap.mb, 1024 >> >> 2018-07-03 11:30:35,721 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: taskmanager.heap.mb, 1024 >> >> 2018-07-03 11:30:35,721 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: taskmanager.rpc.port, 4122 >> >> 2018-07-03 11:30:35,721 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: taskmanager.data.port, 4121 >> >> 2018-07-03 11:30:35,721 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: taskmanager.query.port, 4125 >> >> 2018-07-03 11:30:35,722 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: rest.port, 8081 >> >> 2018-07-03 11:30:35,762 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: rest.address, myfl-flink-jobmanager-ui >> >> 2018-07-03 11:30:35,762 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: state.backend, rocksdb >> >> 2018-07-03 11:30:35,762 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: state.checkpoints.dir, >> file:///opt/flink/share/myfl-flink/checkpoints/ext_checkpoints >> >> 2018-07-03 11:30:35,763 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: state.backend.fs.checkpointdir, >> file:///opt/flink/share/myfl-flink/checkpoints/fs_state >> >> 2018-07-03 11:30:35,763 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: state.backend.rocksdb.checkpointdir, >> file:///opt/flink/share/myfl-flink/checkpoints/rocksdb_state >> >> 2018-07-03 11:30:35,763 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: taskmanager.numberOfTaskSlots, 4 >> >> 2018-07-03 11:30:35,763 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: restart-strategy, fixed-delay >> >> 2018-07-03 11:30:35,764 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: restart-strategy.fixed-delay.attempts, 100 >> >> 2018-07-03 11:30:35,764 INFO >> org.apache.flink.configuration.GlobalConfiguration - Loading >> configuration property: restart-strategy.fixed-delay.delay, 1 s >> >> 2018-07-03 11:30:35,885 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting >> StandaloneSessionClusterEntrypoint. >> >> 2018-07-03 11:30:35,885 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install >> default filesystem. >> >> 2018-07-03 11:30:35,892 INFO org.apache.flink.core.fs.FileSystem >> - Hadoop is not in the classpath/dependencies. The extended >> set of supported File Systems via Hadoop is not available. >> >> 2018-07-03 11:30:35,963 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install >> security context. >> >> 2018-07-03 11:30:35,970 INFO >> org.apache.flink.runtime.security.modules.HadoopModuleFactory - Cannot >> create Hadoop Security Module because Hadoop cannot be found in the >> Classpath. >> >> 2018-07-03 11:30:35,988 INFO >> org.apache.flink.runtime.security.SecurityUtils - Cannot >> install HadoopSecurityContext because Hadoop cannot be found in the >> Classpath. >> >> 2018-07-03 11:30:35,989 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Initializing >> cluster services. >> >> 2018-07-03 11:30:36,003 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Trying to >> start actor system at myfl-flink-jobmanager:4123 >> >> 2018-07-03 11:30:37,288 INFO akka.event.slf4j.Slf4jLogger >> - Slf4jLogger started >> >> 2018-07-03 11:30:37,396 INFO akka.remote.Remoting >> - Starting remoting >> >> 2018-07-03 11:30:37,583 INFO akka.remote.Remoting >> - Remoting started; listening on addresses >> :[akka.tcp://flink@myfl-flink-jobmanager:4123] >> >> 2018-07-03 11:30:37,591 INFO >> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Actor system >> started at akka.tcp://flink@myfl-flink-jobmanager:4123 >> >> 2018-07-03 11:30:37,611 INFO org.apache.flink.runtime.blob.BlobServer >> - Created BLOB server storage directory >> /tmp/blobStore-e445bc66-cee3-4a3d-b810-74df02627eca >> >> 2018-07-03 11:30:37,613 INFO org.apache.flink.runtime.blob.BlobServer >> - Started BLOB server at 0.0.0.0:4124 - max concurrent >> requests: 50 - max backlog: 1000 >> >> 2018-07-03 11:30:37,629 INFO >> org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics >> reporter configured, no metrics will be exposed/reported. >> >> 2018-07-03 11:30:37,664 INFO >> org.apache.flink.runtime.dispatcher.FileArchivedExecutionGraphStore - >> Initializing FileArchivedExecutionGraphStore: Storage directory >> /tmp/executionGraphStore-4ff546b1-4bfb-4911-9314-89c61d7e7149, expiration >> time 3600000, maximum cache size 52428800 bytes. >> >> 2018-07-03 11:30:37,694 INFO >> org.apache.flink.runtime.blob.TransientBlobCache - Created BLOB >> cache storage directory /tmp/blobStore-7e0efdb8-f70b-42ed-9387-c0e1b8090b36 >> >> 2018-07-03 11:30:37,702 WARN >> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Upload >> directory >> /tmp/flink-web-e68a12b9-b9cc-4508-be00-4bf9f113afcd/flink-web-upload does >> not exist, or has been deleted externally. Previously uploaded files are no >> longer available. >> >> 2018-07-03 11:30:37,703 INFO >> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Created >> directory >> /tmp/flink-web-e68a12b9-b9cc-4508-be00-4bf9f113afcd/flink-web-upload for >> file uploads. >> >> 2018-07-03 11:30:37,706 INFO >> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Starting >> rest endpoint. >> >> 2018-07-03 11:30:38,369 INFO >> org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined >> location of main cluster component log file: >> /opt/flink-1.5.0/log/flink--standalonesession-0-myfl-flink-jobmanager-7b4d8c4dd4-bv6zf.log >> >> 2018-07-03 11:30:38,369 INFO >> org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined >> location of main cluster component stdout file: >> /opt/flink-1.5.0/log/flink--standalonesession-0-myfl-flink-jobmanager-7b4d8c4dd4-bv6zf.out >> >> 2018-07-03 11:30:38,567 INFO >> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Rest >> endpoint listening at myfl-flink-jobmanager-ui:8081 >> >> 2018-07-03 11:30:38,568 INFO >> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - >> http://myfl-flink-jobmanager-ui:8081 was granted leadership with >> leaderSessionID=00000000-0000-0000-0000-000000000000 >> >> 2018-07-03 11:30:38,568 INFO >> org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Web frontend >> listening at http://myfl-flink-jobmanager-ui:8081. >> >> 2018-07-03 11:30:38,578 INFO >> org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC >> endpoint for >> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at >> akka://flink/user/resourcemanager . >> >> 2018-07-03 11:30:38,966 INFO >> org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC >> endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at >> akka://flink/user/dispatcher . >> >> 2018-07-03 11:30:39,068 INFO >> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - >> ResourceManager >> akka.tcp://flink@myfl-flink-jobmanager:4123/user/resourcemanager was granted >> leadership with fencing token 00000000000000000000000000000000 >> >> 2018-07-03 11:30:39,069 INFO >> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - Starting >> the SlotManager. >> >> 2018-07-03 11:30:39,164 INFO >> org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Dispatcher >> akka.tcp://flink@myfl-flink-jobmanager:4123/user/dispatcher was granted >> leadership with fencing token 00000000000000000000000000000000 >> >> 2018-07-03 11:30:39,165 INFO >> org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Recovering >> all persisted jobs. >> >> 2018-07-03 11:30:39,682 INFO >> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - >> Replacing old registration of TaskExecutor 068c693b9585900f68c53b00507ee889. >> >> 2018-07-03 11:30:39,683 INFO >> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - >> Unregister TaskManager 8a5cee3aa38081030dc8558ac477d3b3 from the SlotManager. >> >> 2018-07-03 11:30:39,683 INFO >> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - The >> target with resource ID 068c693b9585900f68c53b00507ee889 is already been >> monitored. >> >> 2018-07-03 11:30:39,770 INFO >> org.apache.flink.runtime.resourcemanager.slotmanager.SlotManager - >> Registering TaskManager 068c693b9585900f68c53b00507ee889 under >> 03d409e5166fad4f4082b6165eb0de2e at the SlotManager. >> >> 2018-07-03 11:34:20,257 INFO >> org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Submitting >> job b684656d9afd75cc384a7bcd071bf55e (CSV Files Read -> CSV to Avro encode >> -> Kafka publish). >> >> 2018-07-03 11:34:20,269 INFO >> org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC >> endpoint for org.apache.flink.runtime.jobmaster.JobMaster at >> akka://flink/user/jobmanager_0 . >> >> 2018-07-03 11:34:20,278 INFO org.apache.flink.runtime.jobmaster.JobMaster >> - Initializing job CSV Files Read -> CSV to Avro encode -> >> Kafka publish (b684656d9afd75cc384a7bcd071bf55e). >> >> 2018-07-03 11:34:20,285 INFO org.apache.flink.runtime.jobmaster.JobMaster >> - Using restart strategy >> FixedDelayRestartStrategy(maxNumberRestartAttempts=2147483647, >> delayBetweenRestartAttempts=0) for CSV Files Read -> CSV to Avro encode -> >> Kafka publish (b684656d9afd75cc384a7bcd071bf55e). >> >> 2018-07-03 11:34:20,289 INFO >> org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC >> endpoint for org.apache.flink.runtime.jobmaster.slotpool.SlotPool at >> akka://flink/user/67ceb2ae-1cb1-44be-a09e-601032e23fb5 . >> >> 2018-07-03 11:34:20,481 INFO >> org.apache.flink.runtime.executiongraph.ExecutionGraph - Job recovers >> via failover strategy: full graph restart >> >> 2018-07-03 11:34:20,562 INFO org.apache.flink.runtime.jobmaster.JobMaster >> - Running initialization on master for job CSV Files Read -> >> CSV to Avro encode -> Kafka publish (b684656d9afd75cc384a7bcd071bf55e). >> >> 2018-07-03 11:34:20,563 INFO org.apache.flink.runtime.jobmaster.JobMaster >> - Successfully ran initialization on master in 0 ms. >> >> 2018-07-03 11:34:20,580 INFO org.apache.flink.runtime.jobmaster.JobMaster >> - Loading state backend via factory >> org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory >> >> 2018-07-03 11:34:20,590 WARN org.apache.flink.configuration.Configuration >> - Config uses deprecated configuration key >> 'state.backend.rocksdb.checkpointdir' instead of proper key >> 'state.backend.rocksdb.localdir' >> >> 2018-07-03 11:34:20,592 ERROR >> org.apache.flink.runtime.rest.handler.job.JobSubmitHandler - >> Implementation error: Unhandled exception. >> >> org.apache.flink.util.FlinkException: Failed to submit job >> b684656d9afd75cc384a7bcd071bf55e. >> >> at >> org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:254) >> >> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >> >> at >> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> >> at >> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> >> at java.lang.reflect.Method.invoke(Method.java:498) >> >> at >> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247) >> >> at >> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162) >> >> at >> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70) >> >> at >> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142) >> >> at >> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40) >> >> at >> akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165) >> >> at akka.actor.Actor$class.aroundReceive(Actor.scala:502) >> >> at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95) >> >> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) >> >> at akka.actor.ActorCell.invoke(ActorCell.scala:495) >> >> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) >> >> at akka.dispatch.Mailbox.run(Mailbox.scala:224) >> >> at akka.dispatch.Mailbox.exec(Mailbox.scala:234) >> >> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) >> >> at >> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) >> >> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) >> >> at >> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) >> >> Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not >> set up JobManager >> >> at >> org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:169) >> >> at >> org.apache.flink.runtime.dispatcher.Dispatcher$DefaultJobManagerRunnerFactory.createJobManagerRunner(Dispatcher.java:885) >> >> at >> org.apache.flink.runtime.dispatcher.Dispatcher.createJobManagerRunner(Dispatcher.java:287) >> >> at >> org.apache.flink.runtime.dispatcher.Dispatcher.runJob(Dispatcher.java:277) >> >> at >> org.apache.flink.runtime.dispatcher.Dispatcher.persistAndRunJob(Dispatcher.java:262) >> >> at >> org.apache.flink.runtime.dispatcher.Dispatcher.submitJob(Dispatcher.java:249) >> >> ... 21 more >> >> Caused by: org.apache.flink.runtime.client.JobExecutionException: Could not >> instantiate configured state backend >> >> at >> org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:308) >> >> at >> org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:100) >> >> at >> org.apache.flink.runtime.jobmaster.JobMaster.createExecutionGraph(JobMaster.java:1150) >> >> at >> org.apache.flink.runtime.jobmaster.JobMaster.createAndRestoreExecutionGraph(JobMaster.java:1130) >> >> at org.apache.flink.runtime.jobmaster.JobMaster.<init>(JobMaster.java:298) >> >> at >> org.apache.flink.runtime.jobmaster.JobManagerRunner.<init>(JobManagerRunner.java:151) >> >> ... 26 more >> >> Caused by: org.apache.flink.configuration.IllegalConfigurationException: >> Invalid configuration for RocksDB state backend's local storage directories: >> Relative paths are not supported >> >> at >> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.<init>(RocksDBStateBackend.java:273) >> >> at >> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.configure(RocksDBStateBackend.java:296) >> >> at >> org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory.createFromConfig(RocksDBStateBackendFactory.java:47) >> >> at >> org.apache.flink.contrib.streaming.state.RocksDBStateBackendFactory.createFromConfig(RocksDBStateBackendFactory.java:32) >> >> at >> org.apache.flink.runtime.state.StateBackendLoader.loadStateBackendFromConfig(StateBackendLoader.java:157) >> >> at >> org.apache.flink.runtime.state.StateBackendLoader.fromApplicationOrConfigOrDefault(StateBackendLoader.java:222) >> >> at >> org.apache.flink.runtime.executiongraph.ExecutionGraphBuilder.buildGraph(ExecutionGraphBuilder.java:304) >> >> ... 31 more >> >> Caused by: java.lang.IllegalArgumentException: Relative paths are not >> supported >> >> at >> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.setDbStoragePaths(RocksDBStateBackend.java:518) >> >> at >> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.<init>(RocksDBStateBackend.java:269) >> >> ... 37 more >> >> >> >> >> >> On Tue, Jul 3, 2018 at 5:11 PM, Chesnay Schepler <ches...@apache.org> >> wrote: >> >> Doesn't sound like intended behavior, can you give us the stacktrace? >> >> >> >> On 03.07.2018 13:17, Data Engineer wrote: >> >> The Flink documentation says that we need to specify the filesystem type >> (file://, hdfs://) when configuring the rocksdb backend dir. >> https://ci.apache.org/projects/flink/flink-docs-release-1.5/ >> ops/state/state_backends.html#the-rocksdbstatebackend >> >> But when I do this, I get an error on job submission saying that relative >> paths are not permitted in the rocksdb stand backend. >> I am submitting the job via flink cli (bin/flink run). >> >> Also, even though I give a local file system path >> "file:///home/abc/share", it is a shared GlusterFS volume mount, so it >> will be accessible by the JobManager and all TaskManagers. >> >> I removed the filesystem type from the rocksdb backend dir configuration, >> and though the job got submitted, the rocksdb checkpoint directory was not >> created. >> I have enabled checkpointing in my Flink application. >> >> I am using Flink 1.5.0. >> >> Any help or pointers would be appreciated. >> >> >> >> >> >> >> >> >> >> >> >> >