Hi Arvid, thanks for the advice. I removed the quotes and it did create a YARN session on EMR, but I didn't find any JIT log file generated.
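For reference, here is a minimal sketch of how the quoting can change what the JVM receives. It only mimics POSIX shell tokenization with Python's `shlex`; the helper `jvm_argv` and the class name `org.example.Main` are hypothetical, and the thread does not confirm that the YARN container launch script behaves exactly this way.

```python
import shlex
from typing import List

# Flags taken from the env.java.opts value discussed in this thread.
JVM_FLAGS = (
    "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading "
    "-XX:+LogCompilation -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"
)


def jvm_argv(java_opts: str) -> List[str]:
    """Tokenize a simplified launch command the way a POSIX shell would.

    Hypothetical helper for illustration only; it is not part of Flink.
    org.example.Main is a placeholder class name.
    """
    command = f"java -Xmx424m {java_opts} org.example.Main"
    return shlex.split(command)


# Value still wrapped in double quotes, as in the logged start command.
quoted = jvm_argv(f'"{JVM_FLAGS}"')

# Quotes removed, as suggested in this thread.
unquoted = jvm_argv(JVM_FLAGS)

print("quoted:  ", len(quoted), "tokens, third token:", repr(quoted[2]))
print("unquoted:", len(unquoted), "tokens, flags:", unquoted[2:7])
```

With the quotes, all five flags arrive as one token; without them, each flag is its own argument. Note that `shlex` does not expand `${FLINK_LOG_PREFIX}`, so this sketch says nothing about where the `.jit` file ends up.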
The config with quotes works on a standalone cluster. I also tried to pass the property dynamically in the yarn session command:

flink-yarn-session -n 1 -d -nm testSession -yD env.java.opts="-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"

but I got the same result: the session is created, but I cannot find any JIT log file under the container logs.

Thanks
Jacky

Arvid Heise <ar...@ververica.com> wrote on Tue, May 12, 2020 at 12:57 PM:

> Hi Jacky,
>
> I suspect that the quotes are the actual issue. Could you try to remove
> them? See also [1].
>
> [1]
> http://blogs.perl.org/users/tinita/2018/03/strings-in-yaml---to-quote-or-not-to-quote.html
>
> On Tue, May 12, 2020 at 4:03 PM Jacky D <jacky.du0...@gmail.com> wrote:
>
>> Hi Xintong,
>>
>> Thanks for the reply. I attached the lines around the application master
>> start command below:
>>
>>
>> 2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory
>> - Crypto codec
>> org.apache.hadoop.crypto.OpensslAesCtrCryptoCodec is not available.
>> 2020-05-11 21:16:16,635 DEBUG org.apache.hadoop.util.PerformanceAdvisory
>> - Using crypto codec
>> org.apache.hadoop.crypto.JceAesCtrCryptoCodec.
>> 2020-05-11 21:16:16,636 DEBUG org.apache.hadoop.hdfs.DataStreamer
>> - DataStreamer block
>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet
>> packet seqno: 0 offsetInBlock: 0 lastPacketInBlock: false
>> lastByteOffsetInBlock: 1697
>> 2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer
>> - DFSClient seqno: 0 reply: SUCCESS
>> downstreamAckTimeNanos: 0 flag: 0
>> 2020-05-11 21:16:16,637 DEBUG org.apache.hadoop.hdfs.DataStreamer
>> - DataStreamer block
>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315 sending packet
>> packet seqno: 1 offsetInBlock: 1697 lastPacketInBlock: true
>> lastByteOffsetInBlock: 1697
>> 2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer
>> - DFSClient seqno: 1 reply: SUCCESS
>> downstreamAckTimeNanos: 0 flag: 0
>> 2020-05-11 21:16:16,638 DEBUG org.apache.hadoop.hdfs.DataStreamer
>> - Closing old block
>> BP-1519523618-98.94.65.144-1581106168138:blk_1073745139_4315
>> 2020-05-11 21:16:16,641 DEBUG org.apache.hadoop.ipc.Client
>> - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #70
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.complete
>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client
>> - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #70
>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>> - Call: complete took 2ms
>> 2020-05-11 21:16:16,643 DEBUG org.apache.hadoop.ipc.Client
>> - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #71
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.setTimes
>> 2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.Client
>> - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #71
>> 2020-05-11 21:16:16,645 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>> - Call: setTimes took 2ms
>> 2020-05-11 21:16:16,647 DEBUG org.apache.hadoop.ipc.Client
>> - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop sending #72
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.setPermission
>> 2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.Client
>> - IPC Client (1954985045) connection to
>> ip-98-94-65-144.ec2.internal/98.94.65.144:8020 from hadoop got value #72
>> 2020-05-11 21:16:16,648 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine
>> - Call: setPermission took 2ms
>> 2020-05-11 21:16:16,654 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor - Application
>> Master start command: $JAVA_HOME/bin/java -Xmx424m
>> "-XX:+UnlockDiagnosticVMOptions -XX:+TraceClassLoading -XX:+LogCompilation
>> -XX:LogFile=${FLINK_LOG_PREFIX}.jit -XX:+PrintAssembly"
>> -Dlog.file="<LOG_DIR>/jobmanager.log"
>> -Dlog4j.configuration=file:log4j.properties
>> org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint 1>
>> <LOG_DIR>/jobmanager.out 2> <LOG_DIR>/jobmanager.err
>> 2020-05-11 21:16:16,654 DEBUG org.apache.hadoop.ipc.Client
>> - stopping client from cache:
>> org.apache.hadoop.ipc.Client@28194a50
>> 2020-05-11 21:16:16,656 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>> method setApplicationTags.
>> 2020-05-11 21:16:16,656 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>> method setAttemptFailuresValidityInterval.
>> 2020-05-11 21:16:16,656 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>> method setKeepContainersAcrossApplicationAttempts.
>> 2020-05-11 21:16:16,656 DEBUG
>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$ApplicationSubmissionContextReflector
>> - org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext supports
>> method setNodeLabelExpression.
>>
>> Xintong Song <tonysong...@gmail.com> wrote on Mon, May 11, 2020 at 10:11 PM:
>>
>>> Hi Jacky,
>>>
>>> Could you search for "Application Master start command:" in the debug
>>> log and post the result and a few lines before & after that? This is not
>>> included in the clip of attached log file.
>>>
>>> Thank you~
>>>
>>> Xintong Song
>>>
>>>
>>>
>>> On Tue, May 12, 2020 at 5:33 AM Jacky D <jacky.du0...@gmail.com> wrote:
>>>
>>>> Hi Robert,
>>>>
>>>> Thanks so much for the quick reply. I changed the log level to debug and
>>>> attached the log file.
>>>>
>>>> Thanks
>>>> Jacky
>>>>
>>>> Robert Metzger <rmetz...@apache.org> wrote on Mon, May 11, 2020 at 4:14 PM:
>>>>
>>>>> Thanks a lot for posting the full output.
>>>>>
>>>>> It seems that Flink is passing an invalid list of arguments to the
>>>>> JVM.
>>>>> Can you
>>>>> - set the root log level in conf/log4j-yarn-session.properties to DEBUG
>>>>> - then launch the YARN session
>>>>> - share the log file of the yarn session on the mailing list?
>>>>>
>>>>> I'm particularly interested in the line printed here, as it shows the
>>>>> JVM invocation.
>>>>>
>>>>> https://github.com/apache/flink/blob/release-1.6/flink-yarn/src/main/java/org/apache/flink/yarn/AbstractYarnClusterDescriptor.java#L1630
>>>>>
>>>>>
>>>>> On Mon, May 11, 2020 at 9:56 PM Jacky D <jacky.du0...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Robert,
>>>>>>
>>>>>> Yes, I retrieved more log info from the YARN UI; the full logs are
>>>>>> shown below. This happens when I try to create a Flink YARN session on
>>>>>> EMR with the JITWatch configuration set up.
>>>>>>
>>>>>> 2020-05-11 19:06:09,552 ERROR
>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli - Error
>>>>>> while
>>>>>> running the Flink Yarn session.
>>>>>> java.lang.reflect.UndeclaredThrowableException
>>>>>> at
>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1862)
>>>>>> at
>>>>>> org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>>>>>> at
>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:813)
>>>>>> Caused by:
>>>>>> org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't
>>>>>> deploy Yarn session cluster
>>>>>> at
>>>>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:429)
>>>>>> at
>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:610)
>>>>>> at
>>>>>> org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$2(FlinkYarnSessionCli.java:813)
>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>> at javax.security.auth.Subject.doAs(Subject.java:422)
>>>>>> at
>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
>>>>>> ... 2 more
>>>>>> Caused by:
>>>>>> org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException:
>>>>>> The YARN application unexpectedly switched to state FAILED during
>>>>>> deployment.
>>>>>> Diagnostics from YARN: Application application_1584459865196_0165
>>>>>> failed 1 times (global limit =2; local limit is =1) due to AM Container
>>>>>> for
>>>>>> appattempt_1584459865196_0165_000001 exited with exitCode: 1
>>>>>> Failing this attempt.Diagnostics: Exception from container-launch.
>>>>>> Container id: container_1584459865196_0165_01_000001
>>>>>> Exit code: 1
>>>>>> Exception message: Usage: java [-options] class [args...]
>>>>>> (to execute a class)
>>>>>> or java [-options] -jar jarfile [args...]
>>>>>> (to execute a jar file)
>>>>>> where options include:
>>>>>> -d32 use a 32-bit data model if available
>>>>>> -d64 use a 64-bit data model if available
>>>>>> -server to select the "server" VM
>>>>>> The default VM is server,
>>>>>> because you are running on a server-class machine.
>>>>>>
>>>>>>
>>>>>> -cp <class search path of directories and zip/jar files>
>>>>>> -classpath <class search path of directories and zip/jar files>
>>>>>> A : separated list of directories, JAR archives,
>>>>>> and ZIP archives to search for class files.
>>>>>> -D<name>=<value>
>>>>>> set a system property
>>>>>> -verbose:[class|gc|jni]
>>>>>> enable verbose output
>>>>>> -version print product version and exit
>>>>>> -version:<value>
>>>>>> Warning: this feature is deprecated and will be
>>>>>> removed
>>>>>> in a future release.
>>>>>> require the specified version to run
>>>>>> -showversion print product version and continue
>>>>>> -jre-restrict-search | -no-jre-restrict-search
>>>>>> Warning: this feature is deprecated and will be
>>>>>> removed
>>>>>> in a future release.
>>>>>> include/exclude user private JREs in the version
>>>>>> search
>>>>>> -? -help print this help message
>>>>>> -X print help on non-standard options
>>>>>> -ea[:<packagename>...|:<classname>]
>>>>>> -enableassertions[:<packagename>...|:<classname>]
>>>>>> enable assertions with specified granularity
>>>>>> -da[:<packagename>...|:<classname>]
>>>>>> -disableassertions[:<packagename>...|:<classname>]
>>>>>> disable assertions with specified granularity
>>>>>> -esa | -enablesystemassertions
>>>>>> enable system assertions
>>>>>> -dsa | -disablesystemassertions
>>>>>> disable system assertions
>>>>>> -agentlib:<libname>[=<options>]
>>>>>> load native agent library <libname>, e.g.
>>>>>> -agentlib:hprof
>>>>>> see also, -agentlib:jdwp=help and
>>>>>> -agentlib:hprof=help
>>>>>> -agentpath:<pathname>[=<options>]
>>>>>> load native agent library by full pathname
>>>>>> -javaagent:<jarpath>[=<options>]
>>>>>> load Java programming language agent, see
>>>>>> java.lang.instrument
>>>>>> -splash:<imagepath>
>>>>>> show splash screen with specified image
>>>>>> See
>>>>>> http://www.oracle.com/technetwork/java/javase/documentation/index.html
>>>>>> for more details.
>>>>>>
>>>>>> Thanks
>>>>>> Jacky
>>>>>>
>>>>>> Robert Metzger <rmetz...@apache.org> wrote on Mon, May 11, 2020 at 3:42 PM:
>>>>>>
>>>>>>> Hey Jacky,
>>>>>>>
>>>>>>> The error says "The YARN application unexpectedly switched to state
>>>>>>> FAILED during deployment.".
>>>>>>> Have you tried retrieving the YARN application logs?
>>>>>>> Do the YARN UI / resource manager logs reveal anything about the
>>>>>>> reason for the deployment to fail?
>>>>>>>
>>>>>>> Best,
>>>>>>> Robert
>>>>>>>
>>>>>>>
>>>>>>> On Mon, May 11, 2020 at 9:34 PM Jacky D <jacky.du0...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ---------- Forwarded message ---------
>>>>>>>> From: Jacky D <jacky.du0...@gmail.com>
>>>>>>>> Date: Mon, May 11, 2020 at 3:12 PM
>>>>>>>> Subject: Re: Flink Memory analyze on AWS EMR
>>>>>>>> To: Khachatryan Roman <khachatryan.ro...@gmail.com>
>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Roman,
>>>>>>>>
>>>>>>>> Thanks for the quick response. I tried without the LogFile option but
>>>>>>>> it failed with the same error. I'm currently using Flink 1.6
>>>>>>>> (https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/application_profiling.html),
>>>>>>>> so I can only use JITWatch or JMC. I guess those tools are only
>>>>>>>> available on a standalone cluster? The documentation mentions:
>>>>>>>> "Each standalone
>>>>>>>> JobManager, TaskManager, HistoryServer, and ZooKeeper daemon redirects
>>>>>>>> stdout and stderr to a file with a .out filename suffix and writes
>>>>>>>> internal logging to a file with a .log suffix. Java options
>>>>>>>> configured by the user in env.java.opts"
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Jacky
>>>>>>>>
>>>>>>>
>
> --
>
> Arvid Heise | Senior Java Developer