Could you check whether the Flink job has been submitted successfully? You should find a log line like the following in the JobManager log:
Starting execution of job ...

Also, it would help a lot if you could share the full JobManager and client logs.

Best,
Yang

Rainie Li <raini...@pinterest.com> wrote on Thu, Jul 16, 2020 at 4:03 AM:

> This is the console log after launching the app:
>
> 2020-07-15 19:25:28,507 INFO org.apache.flink.yarn.AbstractYarnClusterDescriptor - YARN application has been deployed successfully.
> Starting execution of program
> -------Environment Variables-----
> DOCKER_CONFIG=/etc/.docker
> FLINK_BIN_DIR=/usr/local/flink-1.9.1/bin
> FLINK_CONF_DIR=/etc/flink-1.9.1/conf/
> FLINK_LIB_DIR=/usr/local/flink-1.9.1/lib
> FLINK_LOG_DIR=/home/karthik/pincohesion
> FLINK_OPT_DIR=/usr/local/flink-1.9.1/opt
> FLINK_PLUGINS_DIR=/usr/local/flink-1.9.1/plugins
> HADOOP_CLASSPATH=/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/*:/usr/local/hadoop/share/hadoop/common/*:/usr/local/hadoop/share/hadoop/hdfs:/usr/local/hadoop/share/hadoop/hdfs/lib/*:/usr/local/hadoop/share/hadoop/hdfs/*:/usr/local/hadoop/share/hadoop/yarn/lib/*:/usr/local/hadoop/share/hadoop/yarn/*:/usr/local/hadoop/share/hadoop/mapreduce/lib/*:/usr/local/hadoop/share/hadoop/mapreduce/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/usr/local/hadoop/share/hadoop/tools/lib/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/usr/local/hadoop/share/hadoop/tools/lib/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/usr/local/hadoop/share/hadoop/tools/lib/*:/usr/local/hadoop/contrib/capacity-scheduler/*.jar:/usr/local/hadoop/share/hadoop/tools/lib/*
> HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
> HADOOP_HOME=/usr/local/hadoop
> HISTFILE=/home/rainieli/.bash_history
> HISTFILESIZE=2000
> HISTIGNORE=
> HISTSIZE=1000
> HOME=/home/rainieli
> JAVA_HOME=/usr/lib/jvm/java-8-oracle
> LANG=C.UTF-8
> LC_TERMINAL=iTerm2
> LC_TERMINAL_VERSION=3.3.9
> LESSCLOSE=/usr/bin/lesspipe %s %s
> LESSOPEN=| /usr/bin/lesspipe %s
> LOGNAME=rainieli
> LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
> MAIL=/var/mail/rainieli
> OLDPWD=/home/rainieli
> PATH=/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/lib/jvm/java-8-oracle/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/hadoop/bin:/usr/local/hadoop/bin
> PWD=/home/karthik
> SHELL=/bin/bash
> SHLVL=1
> SSH_CLIENT=172.16.11.92 64705 22
> SSH_CONNECTION=172.16.11.92 64705 10.2.66.110 22
> SSH_TTY=/dev/pts/2
> S_COLORS=auto
> TERM=xterm-256color
> USER=rainieli
> -------Command Line Arguments-----
> [--conf-file, PIN_JOIN_pin_cohesion_realtime_signal.prod.properties]
> Current working directory: null
> ....... (some serverset info here)
>
> Thanks
> Best regards
> Rainie
>
> On Wed, Jul 15, 2020 at 12:45 PM Rainie Li <raini...@pinterest.com> wrote:
>
>> Thank you, Jesse.
>>
>> Here is more log info:
>>
>> 2020-07-15 18:19:36,456 INFO org.apache.flink.client.cli.CliFrontend - --------------------------------------------------------------------------------
>> 2020-07-15 18:19:36,460 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
>> 2020-07-15 18:19:36,460 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
>> 2020-07-15 18:19:36,460 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.size, 1024m
>> 2020-07-15 18:19:36,460 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.size, 1024m
>> 2020-07-15 18:19:36,460 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
>> 2020-07-15 18:19:36,460 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
>> 2020-07-15 18:19:36,461 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.execution.failover-strategy, region
>> 2020-07-15 18:19:36,463 WARN org.apache.flink.client.cli.CliFrontend - Could not load CLI class org.apache.flink.yarn.cli.FlinkYarnSessionCli.
>> java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/exceptions/YarnException
>>     at java.lang.Class.forName0(Native Method)
>>     at java.lang.Class.forName(Class.java:264)
>>     at org.apache.flink.client.cli.CliFrontend.loadCustomCommandLine(CliFrontend.java:1185)
>>     at org.apache.flink.client.cli.CliFrontend.loadCustomCommandLines(CliFrontend.java:1145)
>>     at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1070)
>> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.yarn.exceptions.YarnException
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>     ... 5 more
>> 2020-07-15 18:19:36,519 INFO org.apache.flink.core.fs.FileSystem - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available.
>> 2020-07-15 18:19:36,647 INFO org.apache.flink.runtime.security.modules.HadoopModuleFactory - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
>> 2020-07-15 18:19:36,658 INFO org.apache.flink.runtime.security.SecurityUtils - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath.
>>
>> Best regards
>> Rainie
>>
>> On Wed, Jul 15, 2020 at 11:49 AM Jesse Lord <jl...@vectra.ai> wrote:
>>
>>> Hi Rainie,
>>>
>>> I am relatively new to Flink, but I suspect that your error is somewhere else in the log. I have found most of my problems by searching for the word “error” or “exception”. Since all of these log lines are at the INFO level, they are probably not highlighting any real issues. If you send more of the log or find an error line, that might help others debug.
>>>
>>> Thanks,
>>>
>>> Jesse
>>>
>>> *From: *Rainie Li <raini...@pinterest.com>
>>> *Date: *Wednesday, July 15, 2020 at 10:54 AM
>>> *To: *"user@flink.apache.org" <user@flink.apache.org>
>>> *Subject: *flink app crashed
>>>
>>> Hi All,
>>>
>>> I am new to Flink. Any idea why the Flink app's JobManager is stuck? Here is the bottom part of the JobManager log. Any suggestions would be appreciated.
>>>
>>> 2020-07-15 16:49:52,749 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at akka://flink/user/dispatcher .
>>> 2020-07-15 16:49:52,759 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService /leader/resource_manager_lock.
>>> 2020-07-15 16:49:52,759 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Starting ZooKeeperLeaderRetrievalService /leader/dispatcher_lock.
>>> 2020-07-15 16:49:52,762 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Starting ZooKeeperLeaderElectionService ZooKeeperLeaderElectionService{leaderPath='/leader/dispatcher_lock'}.
>>> 2020-07-15 16:49:52,790 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Dispatcher /user/dispatcher was granted leadership with fencing token
>>> 2020-07-15 16:49:52,791 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Recovering all persisted jobs.
>>> 2020-07-15 16:49:52,931 INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm1
>>> 2020-07-15 16:49:53,014 INFO org.apache.flink.yarn.YarnResourceManager - Recovered 0 containers from previous attempts ([]).
>>> 2020-07-15 16:49:53,018 INFO org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl - Upper bound of the thread pool size is 500
>>> 2020-07-15 16:49:53,020 INFO org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy - yarn.client.max-cached-nodemanagers-proxies : 0
>>> 2020-07-15 16:49:53,021 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Starting ZooKeeperLeaderElectionService ZooKeeperLeaderElectionService{leaderPath='/leader/resource_manager_lock'}.
>>> 2020-07-15 16:49:53,042 INFO org.apache.flink.yarn.YarnResourceManager - ResourceManager akka.tcp://flink@cluster-dev-001/user/resourcemanager was granted leadership with fencing token
>>> 2020-07-15 16:49:53,046 INFO org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl - Starting the SlotManager.
>>> 2020-07-15 16:50:52,217 INFO org.apache.kafka.clients.Metadata - Cluster ID: FZrfSqHiTpaZwEzIRYkCLQ
>>>
>>> Thanks
>>>
>>> Best regards
>>>
>>> Rainie
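
Two checks were suggested in this thread: look for the "Starting execution of job" line in the JobManager log, and search the logs for "error" or "exception". A minimal Python sketch of both checks follows. The function name scan_log and the regular expression are assumptions made for illustration only; they are not part of Flink, and the pattern (WARN/ERROR/FATAL levels plus names ending in Exception or Error) is just one possible reading of the "search for error or exception" advice.

```python
import re
from typing import Iterable, List, Tuple

# Hypothetical pattern: log levels above INFO, or a name ending in
# "Exception" or "Error" (e.g. NoClassDefFoundError). Adjust as needed.
SUSPICIOUS = re.compile(r"\b(WARN|ERROR|FATAL)\b|\b\w+(Exception|Error)\b")

# Marker text taken from the reply at the top of this thread.
SUBMITTED_MARKER = "Starting execution of job"


def scan_log(lines: Iterable[str]) -> Tuple[bool, List[Tuple[int, str]]]:
    """Return (job_submitted, hits), where hits are (line_number, text) pairs."""
    submitted = False
    hits: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        if SUBMITTED_MARKER in line:
            submitted = True
        if SUSPICIOUS.search(line):
            hits.append((number, line.rstrip("\n")))
    return submitted, hits


# Small demo using lines copied from the client log quoted above.
sample = [
    "2020-07-15 18:19:36,461 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.execution.failover-strategy, region",
    "2020-07-15 18:19:36,463 WARN org.apache.flink.client.cli.CliFrontend - Could not load CLI class org.apache.flink.yarn.cli.FlinkYarnSessionCli.",
    "java.lang.NoClassDefFoundError: org/apache/hadoop/yarn/exceptions/YarnException",
]
submitted, hits = scan_log(sample)
print("job submitted:", submitted)
for number, text in hits:
    print(f"{number}: {text}")
```

To run it against a real file, pass an open file handle instead of the sample list, for example scan_log(open("jobmanager.log", encoding="utf-8", errors="replace")), where the file name is a placeholder for wherever your logs are written.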