Is your user-jar packaging and relocating Flink classes? If so, then
your job doesn't actually operate against the classes provided by the
cluster, which, well, just wouldn't work.
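One way to check this is to look inside the user-jar itself. The snippet is only a rough sketch: `scan_jar` is a made-up helper name, and the relocated path `org/apache/shaded/flink/` is an assumption taken from the logger names in the logs quoted below; a different shade configuration would use a different prefix.

```python
import zipfile

# Package path that Flink's own classes use inside a jar.
FLINK_PATH = "org/apache/flink/"
# Relocated path, assumed from the logger names in this thread's logs.
RELOCATED_PATH = "org/apache/shaded/flink/"


def scan_jar(jar):
    """Count class files under the Flink and relocated-Flink package paths.

    `jar` is a path or a binary file-like object, as accepted by
    zipfile.ZipFile (a jar is a zip archive).
    """
    counts = {"bundled": 0, "relocated": 0}
    with zipfile.ZipFile(jar) as zf:
        for name in zf.namelist():
            if not name.endswith(".class"):
                continue  # skip manifests, resources, directories
            if name.startswith(RELOCATED_PATH):
                counts["relocated"] += 1
            elif name.startswith(FLINK_PATH):
                counts["bundled"] += 1
    return counts
```

Calling `scan_jar` with the path of the submitted jar returns the two counts. A non-zero "relocated" count would suggest the jar carries its own relocated copy of Flink. A non-zero "bundled" count is less conclusive, since connectors are normally packaged into the user-jar under the same package path.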
On 18/06/2020 09:34, Sourabh Mehta wrote:
Hi,
The application is using 1.10.0, but the cluster is set up on 1.9.0.
Yes, I do have access. Please find below the startup logs from the cluster:
2020-06-17 11:28:18,989 INFO
org.apache.shaded.flink.table.module.ModuleManager - Got
FunctionDefinition equals from module core
2020-06-17 11:28:20,538 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, localhost
2020-06-17 11:28:20,538 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.port, 6123
2020-06-17 11:28:20,538 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.size, 1024m
2020-06-17 11:28:20,538 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.heap.size, 1024m
2020-06-17 11:28:20,538 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-17 11:28:20,538 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 1
2020-06-17 11:28:20,539 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.execution.failover-strategy, region
2020-06-17 11:28:20,539 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, cluster-flink-poc-m
2020-06-17 11:28:20,539 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.mb, 12288
2020-06-17 11:28:20,539 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.heap.mb, 12288
2020-06-17 11:28:20,540 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 4
2020-06-17 11:28:20,540 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 28
2020-06-17 11:28:20,540 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.network.numberOfBuffers, 2048
2020-06-17 11:28:20,540 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf
2020-06-17 11:28:20,550 INFO
org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils
- The configuration option Key: 'taskmanager.cpu.cores' , default:
null (fallback keys: []) required for local execution is not set,
setting it to its default value 1.7976931348623157E308
2020-06-17 11:28:20,552 INFO
org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils
- The configuration option Key: 'taskmanager.memory.task.heap.size' ,
default: null (fallback keys: []) required for local execution is not
set, setting it to its default value 9223372036854775807 bytes
2020-06-17 11:28:20,552 INFO
org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils
- The configuration option Key:
'taskmanager.memory.task.off-heap.size' , default: 0 bytes (fallback
keys: []) required for local execution is not set, setting it to its
default value 9223372036854775807 bytes
2020-06-17 11:28:20,552 INFO
org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils
- The configuration option Key: 'taskmanager.memory.network.min' ,
default: 64 mb (fallback keys: [{key=taskmanager.network.memory.min,
isDeprecated=true}]) required for local execution is not set, setting
it to its default value 64 mb
2020-06-17 11:28:20,553 INFO
org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils
- The configuration option Key: 'taskmanager.memory.network.max' ,
default: 1 gb (fallback keys: [{key=taskmanager.network.memory.max,
isDeprecated=true}]) required for local execution is not set, setting
it to its default value 64 mb
2020-06-17 11:28:20,553 INFO
org.apache.shaded.flink.runtime.taskexecutor.TaskExecutorResourceUtils
- The configuration option Key: 'taskmanager.memory.managed.size' ,
default: null (fallback keys: [{key=taskmanager.memory.size,
isDeprecated=true}]) required for local execution is not set, setting
it to its default value 128 mb
2020-06-17 11:28:20,558 INFO
org.apache.shaded.flink.runtime.minicluster.MiniCluster - Starting
Flink Mini Cluster
2020-06-17 11:28:20,561 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, localhost
2020-06-17 11:28:20,561 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.port, 6123
2020-06-17 11:28:20,561 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.size, 1024m
2020-06-17 11:28:20,561 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.heap.size, 1024m
2020-06-17 11:28:20,561 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-17 11:28:20,561 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 1
2020-06-17 11:28:20,561 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.execution.failover-strategy, region
2020-06-17 11:28:20,562 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, cluster-flink-poc-m
2020-06-17 11:28:20,562 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.mb, 12288
2020-06-17 11:28:20,562 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.heap.mb, 12288
2020-06-17 11:28:20,562 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 4
2020-06-17 11:28:20,562 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 28
2020-06-17 11:28:20,563 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.network.numberOfBuffers, 2048
2020-06-17 11:28:20,563 INFO
org.apache.shaded.flink.configuration.GlobalConfiguration - Loading
configuration property: fs.hdfs.hadoopconf, /etc/hadoop/conf
2020-06-17 11:28:20,563 INFO
org.apache.shaded.flink.runtime.minicluster.MiniCluster - Starting
Metrics Registry
2020-06-17 11:28:20,610 INFO
org.apache.shaded.flink.runtime.metrics.MetricRegistryImpl - No
metrics reporter configured, no metrics will be exposed/reported.
2020-06-17 11:28:20,610 INFO
org.apache.shaded.flink.runtime.minicluster.MiniCluster - Starting
RPC Service(s)
2020-06-17 11:28:20,976 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
2020-06-17 11:28:21,070 INFO
org.apache.shaded.flink.runtime.rpc.akka.AkkaRpcServiceUtils -
Trying to start actor system at :0
2020-06-17 11:28:21,115 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
2020-06-17 11:28:21,131 INFO akka.remote.Remoting
- Starting remoting
2020-06-17 11:28:21,279 INFO akka.remote.Remoting
- Remoting started; listening on addresses
:[akka.tcp://flink-metrics@<<IP:PORT>>]
2020-06-17 11:28:21,283 INFO
org.apache.shaded.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor
system started at akka.tcp://flink-metrics@<<IP:PORT>>
Note: I have removed a few IP addresses from the log.
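The two signals that stand out in the log above (logger names under a relocated Flink package, and a MiniCluster being started) can be checked mechanically. This is a minimal sketch, not a general log parser: `diagnose` is a made-up helper name, and both patterns are copied from this particular log, so the relocated prefix is an assumption anywhere else.

```python
import re

# Both patterns are taken from the log excerpt in this thread. The relocated
# prefix depends on the shade configuration and is an assumption elsewhere.
RELOCATED_CLASS = re.compile(r"org\.apache\.shaded\.flink\.[\w.]+")
# \s+ also matches line breaks, because the log lines here are wrapped.
MINI_CLUSTER_START = re.compile(
    r"MiniCluster\s+-\s+Starting\s+Flink\s+Mini\s+Cluster"
)


def diagnose(log_text):
    """Report whether the log text contains either of the two signals."""
    return {
        "relocated_flink_classes": bool(RELOCATED_CLASS.search(log_text)),
        "mini_cluster_started": bool(MINI_CLUSTER_START.search(log_text)),
    }
```

For the log in this message, both values would come back true, which is consistent with the job running in a local mini cluster built from classes inside the user-jar rather than on the session cluster.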
On Thu, Jun 18, 2020 at 12:08 AM Till Rohrmann <trohrm...@apache.org>
wrote:
Hi Sourabh,
do you have access to the cluster logs? They could be helpful for
debugging the problem. Which version of Flink are you using?
Cheers,
Till
On Wed, Jun 17, 2020 at 7:39 PM Sourabh Mehta
<sourabhmehta2...@gmail.com> wrote:
No, I am not.
On Wed, 17 Jun 2020 at 10:48 PM, Chesnay Schepler
<ches...@apache.org> wrote:
Are you by any chance creating a local environment via
(Stream)ExecutionEnvironment#createLocalEnvironment?
On 17/06/2020 17:05, Sourabh Mehta wrote:
Hi Team,
I'm exploring Flink for one of my use cases, and I'm facing
some issues while running a Flink job in cluster mode.
Below are the steps I followed to set up and run the job in
cluster mode:
1. Set up Flink on Google Cloud Dataproc using
https://github.com/GoogleCloudDataproc/initialization-actions/tree/master/flink
2. After setting up the cluster, I could see that the Flink
session had started, and I could see its UI.
3. Submitted the job from the Dataproc master node using the command below:
sudo HADOOP_CONF_DIR=/etc/hadoop/conf
/usr/lib/flink/bin/flink run -m yarn-cluster -yid
application_1592311654771_0001 -class
com.sm.flink.FlinkDriver
/usr/lib/flink/lib/flink-1.0.10-sm-SNAPSHOT.jar
hdfs://cluster-flink-poc-m:8020/user/flink/rocksdb/
After running the job, I see that it started successfully
but created a local mini cluster and ran in local mode. I
don't see any jobs submitted to the JobManager, and I also see
0 task managers in the UI.
Can someone please help me understand what is happening here? Do let me
know what input is required to investigate this.