Hi Robert, I'm able to see taskmanager and jobmanager logs after I changed the log4j.properties file (/usr/lib/flink/conf). It seems to be a problem with the EMR 6.1 distribution: the log4j.properties file in the Flink package that I downloaded is different from the one that comes with EMR 6.1. I replaced the log4j.properties with the one from the downloaded package and it's working now. Thanks for helping me debug the issue.
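
In case it helps anyone else hitting this on EMR 6.1: Flink 1.11 switched its default logging configuration to the Log4j 2 properties syntax, and the file I copied over from the Flink 1.11 download looks roughly like this (trimmed; the full default is in conf/log4j.properties of the distribution, and ${sys:log.file} is filled in by Flink's startup scripts):

rootLogger.level = INFO
rootLogger.appenderRef.file.ref = MainAppender

appender.main.name = MainAppender
appender.main.type = File
appender.main.append = false
appender.main.fileName = ${sys:log.file}
appender.main.layout.type = PatternLayout
appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

My guess is that the file shipped with EMR 6.1 was still in the old Log4j 1 format; with a configuration Log4j 2 can't use, nothing gets written to ${sys:log.file}, which would explain the "The file LOG does not exist on the TaskExecutor" errors further down in this thread.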
Best,
Diwakar

On Tue, Nov 3, 2020 at 11:36 AM Robert Metzger <rmetz...@apache.org> wrote:

> Hey Diwakar,
>
> The logs you are providing still don't contain the full Flink logs.
>
> You cannot stop Flink on YARN using "yarn app -stop
> application_1603649952937_0002". To stop Flink on YARN, use: "yarn
> application -kill <appId>".
>
>
> On Sat, Oct 31, 2020 at 6:26 PM Diwakar Jha <diwakar.n...@gmail.com> wrote:
>
>> Hi,
>>
>> I wanted to check if anyone can help me with the logs. I have sent
>> several emails but have not gotten any response.
>>
>> I'm running Flink 1.11 on EMR 6.1. I don't see any logs, though I do get
>> the error below on stdout. I'm trying to upgrade from Flink 1.8 to Flink 1.11.
>>
>> 18:29:19.834 [flink-akka.actor.default-dispatcher-28] ERROR
>> org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler
>> - Failed to transfer file from TaskExecutor container_1604033334508_0001_01_000004.
>> java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException:
>> The file LOG does not exist on the TaskExecutor.
>>
>> Thanks!
>>
>>
>> On Fri, Oct 30, 2020 at 9:04 AM Diwakar Jha <diwakar.n...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I see that in my classpath (below) I have both the log4j-1.2-api and
>>> log4j-api-2 jars. Is this why I'm not seeing any logs? If so, could
>>> someone suggest how to fix it?
>>>
>>> export
>>> CLASSPATH=":lib/flink-csv-1.11.0.jar:lib/flink-json-1.11.0.jar:lib/flink-shaded-zookeeper-3.4.14.jar:lib/flink-table-blink_2.12-1.11.0.jar:lib/flink-table_2.12-1.11.0.jar:*lib/log4j-1.2-api-2.12.1.jar:lib/log4j-api-2.12.1.jar*:lib/log4j-core-2.12.1.jar:lib/
>>>
>>> export
>>> _FLINK_CLASSPATH=":lib/flink-csv-1.11.0.jar:lib/flink-json-1.11.0.jar:lib/flink-shaded-zookeeper-3.4.14.jar:lib/flink-table-blink_2.12-1.11.0.jar:lib/flink-table_2.12-1.11.0.jar:*lib/log4j-1.2-api-2.12.1.jar:lib/log4j-api-2.12.1.jar*:lib/log4j-core-2.12.1.jar:lib/log4j-slf4j-impl-2.12.1.jar:flink-dist_2.12-1.11.0.jar:flink-conf.yaml:"
>>>
>>> Thanks.
>>>
>>> On Thu, Oct 29, 2020 at 6:21 PM Diwakar Jha <diwakar.n...@gmail.com> wrote:
>>>
>>>> Hello Everyone,
>>>>
>>>> I'm able to get my Flink UI up and running (it was related to the
>>>> session manager plugin on my local laptop), but I'm not seeing any
>>>> taskmanager/jobmanager logs in my Flink application. I have attached some
>>>> yarn application logs captured while it's running, but I'm not able to
>>>> figure out how to stop the application and get more logs. Could someone
>>>> please help me figure this out? I'm running Flink 1.11 on the EMR 6.1
>>>> cluster.
>>>>
>>>> On Tue, Oct 27, 2020 at 1:06 PM Diwakar Jha <diwakar.n...@gmail.com> wrote:
>>>>
>>>>> Hi Robert,
>>>>> Could you please correct me? I'm not able to stop the app. Also, I
>>>>> already stopped the Flink job.
>>>>>
>>>>> sh-4.2$ yarn app -stop application_1603649952937_0002
>>>>> 2020-10-27 20:04:25,543 INFO client.RMProxy: Connecting to
>>>>> ResourceManager at ip-10-0-55-50.ec2.internal/10.0.55.50:8032
>>>>> 2020-10-27 20:04:25,717 INFO client.AHSProxy: Connecting to
>>>>> Application History server at ip-10-0-55-50.ec2.internal/10.0.55.50:10200
>>>>> Exception in thread "main" java.lang.IllegalArgumentException: App
>>>>> admin client class name not specified for type Apache Flink
>>>>>         at org.apache.hadoop.yarn.client.api.AppAdminClient.createAppAdminClient(AppAdminClient.java:76)
>>>>>         at org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:597)
>>>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>>>>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>>>>>         at org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:126)
>>>>> sh-4.2$
>>>>>
>>>>> On Tue, Oct 27, 2020 at 9:34 AM Robert Metzger <rmetz...@apache.org> wrote:
>>>>>
>>>>>> Hi,
>>>>>> are you intentionally not posting this response to the mailing list?
>>>>>>
>>>>>> As you can see from the yarn logs, log aggregation only works for
>>>>>> finished applications ("End of LogType:prelaunch.out. This log file belongs
>>>>>> to a running container (container_1603649952937_0002_01_000002) and so may
>>>>>> not be complete.")
>>>>>>
>>>>>> Please stop the app, then provide the logs.
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 27, 2020 at 5:11 PM Diwakar Jha <diwakar.n...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Robert,
>>>>>>>
>>>>>>> Yes, I'm running Flink on EMR with YARN. Please find attached the
>>>>>>> output of yarn logs -applicationId. I also attached the
>>>>>>> hadoop-yarn-nodemanager logs. I also followed the links below, which
>>>>>>> describe the same problem:
>>>>>>> http://mail-archives.apache.org/mod_mbox/flink-user/202009.mbox/%3CCAGDv3o5WyJTrXs9Pg+Vy-b+LwgEE26iN54iqE0=f5t+m8vw...@mail.gmail.com%3E
>>>>>>> https://www.talkend.net/post/75078.html
>>>>>>> Based on these I changed the log4j.properties.
>>>>>>> Let me know what you think. Please also let me know if you need some
>>>>>>> specific logs. I appreciate your help.
>>>>>>>
>>>>>>> Best,
>>>>>>> Diwakar
>>>>>>>
>>>>>>> On Tue, Oct 27, 2020 at 12:26 AM Robert Metzger <rmetz...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hey Diwakar,
>>>>>>>>
>>>>>>>> how are you deploying Flink on EMR? Are you using YARN?
>>>>>>>> If so, you could also use log aggregation to see all the logs at
>>>>>>>> once (from both JobManager and TaskManagers): yarn logs -applicationId
>>>>>>>> <Application ID>
>>>>>>>>
>>>>>>>> Could you post (or upload somewhere) all the logs you have of one
>>>>>>>> run? It is much easier for us to debug something if we have the full
>>>>>>>> logs (the logs show, for example, the classpath that you are using,
>>>>>>>> how you are deploying Flink, etc.).
>>>>>>>>
>>>>>>>> From the information available, my guess is that you have modified
>>>>>>>> your deployment in some way (use of a custom logging version, custom
>>>>>>>> deployment method, version mixup with jars from both Flink 1.8 and 1.11,
>>>>>>>> ...).
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Robert
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Oct 27, 2020 at 12:41 AM Diwakar Jha <diwakar.n...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> This is what I see on the WebUI:
>>>>>>>>>
>>>>>>>>> 23:19:24.263 [flink-akka.actor.default-dispatcher-1865] ERROR
>>>>>>>>> org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler
>>>>>>>>> - Failed to transfer file from TaskExecutor container_1603649952937_0002_01_000004.
>>>>>>>>> java.util.concurrent.CompletionException: org.apache.flink.util.FlinkException:
>>>>>>>>> The file LOG does not exist on the TaskExecutor.
>>>>>>>>>         at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$requestFileUploadByFilePath$25(TaskExecutor.java:1742) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>>>>>>>>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_252]
>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_252]
>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_252]
>>>>>>>>>         at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_252]
>>>>>>>>> Caused by: org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.
>>>>>>>>>         ... 5 more
>>>>>>>>>
>>>>>>>>> 23:19:24.275 [flink-akka.actor.default-dispatcher-1865] ERROR
>>>>>>>>> org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler
>>>>>>>>> - Unhandled exception.
>>>>>>>>> org.apache.flink.util.FlinkException: The file LOG does not exist on the TaskExecutor.
>>>>>>>>>         at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$requestFileUploadByFilePath$25(TaskExecutor.java:1742) ~[flink-dist_2.12-1.11.0.jar:1.11.0]
>>>>>>>>>         at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604) ~[?:1.8.0_252]
>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_252]
>>>>>>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_252]
>>>>>>>>>         at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_252]
>>>>>>>>>
>>>>>>>>> I'd appreciate it if anyone has any pointers on this.
>>>>>>>>>
>>>>>>>>> On Mon, Oct 26, 2020 at 10:45 AM Chesnay Schepler <ches...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Flink 1.11 uses slf4j 1.7.15; the easiest way to check the log
>>>>>>>>>> files is usually via the WebUI.
>>>>>>>>>>
>>>>>>>>>> On 10/26/2020 5:30 PM, Diwakar Jha wrote:
>>>>>>>>>>
>>>>>>>>>> I think my problem is with the slf4j library. I'm using slf4j 1.7 with
>>>>>>>>>> Flink 1.11.
>>>>>>>>>> If that's correct, I'd appreciate it if someone could point me to
>>>>>>>>>> the exact slf4j library that I should use with Flink 1.11.
>>>>>>>>>>
>>>>>>>>>> Flink = 1.11.x;
>>>>>>>>>> Slf4j = 1.7;
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Oct 25, 2020 at 8:00 PM Diwakar Jha <diwakar.n...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks for checking my configurations. Could you also point me to
>>>>>>>>>>> where I can see the log files? Just to give more details: I'm trying to
>>>>>>>>>>> access these logs in AWS CloudWatch.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Diwakar
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Oct 25, 2020 at 2:16 PM Chesnay Schepler <ches...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> With Flink 1.11, reporters were refactored into plugins and are
>>>>>>>>>>>> now accessible by default (so you no longer have to bother with copying
>>>>>>>>>>>> jars around).
>>>>>>>>>>>>
>>>>>>>>>>>> Your configuration appears to be correct, so I suggest taking a
>>>>>>>>>>>> look at the log files.
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/25/2020 9:52 PM, Diwakar Jha wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hello Everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm new to Flink and I'm trying to upgrade from Flink 1.8 to
>>>>>>>>>>>> Flink 1.11 on an EMR cluster. After upgrading to Flink 1.11, one of the
>>>>>>>>>>>> differences I see is that I don't get any metrics. I found out that Flink
>>>>>>>>>>>> 1.11 does not have the *org.apache.flink.metrics.statsd.StatsDReporterFactory*
>>>>>>>>>>>> jar in /usr/lib/flink/opt, which was there for Flink 1.8. Does anyone have
>>>>>>>>>>>> any pointers on locating the
>>>>>>>>>>>> *org.apache.flink.metrics.statsd.StatsDReporterFactory* jar, or on
>>>>>>>>>>>> how to use metrics in Flink 1.11?
>>>>>>>>>>>>
>>>>>>>>>>>> Things I tried:
>>>>>>>>>>>> a) the setup below
>>>>>>>>>>>>
>>>>>>>>>>>> metrics.reporters: stsd
>>>>>>>>>>>> metrics.reporter.stsd.factory.class: org.apache.flink.metrics.statsd.StatsDReporterFactory
>>>>>>>>>>>> metrics.reporter.stsd.host: localhost
>>>>>>>>>>>> metrics.reporter.stsd.port: 8125
>>>>>>>>>>>>
>>>>>>>>>>>> b) I tried downloading the statsd jar from
>>>>>>>>>>>> https://mvnrepository.com/artifact/org.apache.flink/flink-metrics-statsd
>>>>>>>>>>>> and putting it inside the plugins/statsd directory.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Diwakar Jha.
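
PS, for completeness on the statsd/metrics question at the very bottom of this thread: as Chesnay said, in Flink 1.11 the metric reporters are loaded through the plugins mechanism. A quick, non-authoritative way to check what actually shipped on an EMR node is to search the Flink install directory (the /usr/lib/flink path is the one used earlier in this thread; exactly where the jar lives may differ between the Apache download and the EMR build):

sh-4.2$ find /usr/lib/flink -name 'flink-metrics-statsd*'
sh-4.2$ ls /usr/lib/flink/plugins/

If nothing turns up, the jar from the Maven link above can be dropped into its own subfolder under plugins/ (for example plugins/metrics-statsd/; the folder name is just a plugin id under the one-folder-per-plugin layout) and the JobManager/TaskManagers restarted.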