Hi Florian,

5 minutes sounds too slow. Are you starting multiple "per-job clusters" at the
same time? How many slots do you configure per TM? After you submit the job,
how many resources do you have left in your YARN cluster?

It might be that you are affected by FLINK-9455 [1]: Flink requests
unnecessary resources from YARN and blocks the execution of other jobs
temporarily. The workaround is to configure only one slot per TM.
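
A minimal sketch of that workaround in flink-conf.yaml (assuming you are not
already setting this key somewhere else) would be:

    taskmanager.numberOfTaskSlots: 1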

If the above does not help, can you attach the full ClusterEntrypoint
(JobManager) logs?
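
If log aggregation is enabled in your YARN cluster, the full logs can usually be
fetched with something like the following (the application id is a placeholder
for your Flink application's id):

    yarn logs -applicationId <applicationId>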

Best,
Gary

[1] https://issues.apache.org/jira/browse/FLINK-9455


On Tue, Aug 7, 2018 at 12:34 PM, Florian Simond <florian.sim...@hotmail.fr>
wrote:

> Thank you!
>
>
> So is it now normal that it takes around 5 minutes to start processing?
> The job is reading from kafka and writing back into another kafka topic.
> When I start the job, it takes roughly 5 minutes before I get something in
> the output topic.
>
>
> I see a lot of
>
> 2018-08-07 12:20:34,672 INFO  org.apache.flink.yarn.YarnResourceManager  - Received new container: XXX - Remaining pending container requests: 0
> 2018-08-07 12:20:34,672 INFO  org.apache.flink.yarn.YarnResourceManager  - Returning excess container XXX.
>
>
> I see a lot of those lines during the first five minutes.
>
>
> I'm not sure I need to have a static set of TMs, but since we have a limited
> set of nodes and several jobs, it could be harder to make sure they do not
> interfere with each other...
>
>
> ------------------------------
> *From:* Gary Yao <g...@data-artisans.com>
> *Sent:* Tuesday, August 7, 2018 12:27
>
> *To:* Florian Simond
> *Cc:* vino yang; user@flink.apache.org
> *Subject:* Re: Could not build the program from JAR file.
>
> You can find more information about the re-worked deployment model here:
>
>     https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
>
> TaskManagers are started and shut down according to the slot requirements of
> the jobs. It is possible to return to the old behavior by setting
>
>     mode: old
>
> in flink-conf.yaml. However, this mode is deprecated and will be removed
> soon.
> Can you explain why you need to have a static set of TMs?
>
> On Tue, Aug 7, 2018 at 12:07 PM, Florian Simond <florian.sim...@hotmail.fr>
> wrote:
>
> Indeed, that's the solution.
>
>
> It was done automatically before with 1.4.2, that's why I missed that
> part...
>
>
> Do you have any pointers about the dynamic number of TaskManagers? I'm
> curious to know how it works. Is it still possible to fix it?
>
>
> Thank you,
>
> Florian
>
>
> ------------------------------
> *From:* Gary Yao <g...@data-artisans.com>
> *Sent:* Tuesday, August 7, 2018 11:55
> *To:* Florian Simond
> *Cc:* vino yang; user@flink.apache.org
> *Subject:* Re: Could not build the program from JAR file.
>
> Hi Florian,
>
> Can you run
>
>     export HADOOP_CLASSPATH=`hadoop classpath`
>
> before submitting the job [1]?
>
> Moreover, you should not use the -yn parameter. Beginning with Flink 1.5, the
> number of TaskManagers is not fixed anymore.
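>
> Putting both hints together, a submission without -yn might look roughly like
> this (a sketch based on your original command, not a verified invocation):
>
>     export HADOOP_CLASSPATH=`hadoop classpath`
>     ./bin/flink run -m yarn-cluster -yjm 1024 -ytm 4096 ./examples/batch/WordCount.jar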
>
> Best,
> Gary
>
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/deployment/hadoop.html#configuring-flink-with-hadoop-classpaths
>
>
>
> On Tue, Aug 7, 2018 at 9:22 AM, Florian Simond <florian.sim...@hotmail.fr>
> wrote:
>
> In the log, I can see that:
>
>
> The first exception is a warning, not sure if it is important.
>
>
> The second one seems to be the culprit. It tries to find the file "-yn"???
>
>
> 2018-08-07 09:16:04,776 WARN  org.apache.flink.client.cli.CliFrontend                    - Could not load CLI class org.apache.flink.yarn.cli.FlinkYarnSessionCli.
> java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:264)
>         at org.apache.flink.client.cli.CliFrontend.loadCustomCommandLine(CliFrontend.java:1208)
>         at org.apache.flink.client.cli.CliFrontend.loadCustomCommandLines(CliFrontend.java:1164)
>         at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1090)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         ... 5 more
> 2018-08-07 09:16:04,789 INFO  org.apache.flink.core.fs.FileSystem                        - Hadoop is not in the classpath/dependencies. The extended set of supported File Systems via Hadoop is not available.
> 2018-08-07 09:16:04,967 INFO  org.apache.flink.runtime.security.modules.HadoopModuleFactory  - Cannot create Hadoop Security Module because Hadoop cannot be found in the Classpath.
> 2018-08-07 09:16:04,991 INFO  org.apache.flink.runtime.security.SecurityUtils             - Cannot install HadoopSecurityContext because Hadoop cannot be found in the Classpath.
> 2018-08-07 09:16:05,041 INFO  org.apache.flink.client.cli.CliFrontend                     - Running 'run' command.
> 2018-08-07 09:16:05,046 INFO  org.apache.flink.client.cli.CliFrontend                     - Building program from JAR file
> 2018-08-07 09:16:05,046 ERROR org.apache.flink.client.cli.CliFrontend                     - Invalid command line arguments.
> org.apache.flink.client.cli.CliArgsException: Could not build the program from JAR file.
>         at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:208)
>         at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1025)
>         at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
>         at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
>         at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
> Caused by: java.io.FileNotFoundException: JAR file does not exist: -yn
>         at org.apache.flink.client.cli.CliFrontend.buildProgram(CliFrontend.java:828)
>         at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:205)
>         ... 4 more
>
>
>
> ------------------------------
> *From:* vino yang <yanghua1...@gmail.com>
> *Sent:* Tuesday, August 7, 2018 09:01
> *To:* Gary Yao
> *Cc:* Florian Simond; user@flink.apache.org
> *Subject:* Re: Could not build the program from JAR file.
>
> Hi Florian,
>
> The error message is caused by a FileNotFoundException, see here [1]. Is
> there any more information about the exception? Can you make sure the jar
> exists?
>
> [1]: https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L209
>
>
>
> Thanks, vino.
>
> 2018-08-07 14:28 GMT+08:00 Gary Yao <g...@data-artisans.com>:
>
> Hi Florian,
>
> You write that Flink 1.4.2 works, but what version is not working for you?
>
> Best,
> Gary
>
>
>
> On Tue, Aug 7, 2018 at 8:25 AM, Florian Simond <florian.sim...@hotmail.fr>
> wrote:
>
> Hi all,
>
>
> I'm trying to run the WordCount example on my YARN cluster and it is not
> working. I get the error message specified in the title: Could not build the
> program from JAR file.
>
>
>
> > $ ./bin/flink run -m yarn-cluster -yn 4 -yjm 1024 -ytm 4096 ./examples/batch/WordCount.jar
> > Setting HADOOP_CONF_DIR=/etc/hadoop/conf because no HADOOP_CONF_DIR was set.
> > Could not build the program from JAR file.
>
> > Use the help option (-h or --help) to get help on the command.
>
>
> I also have the same problem with a custom JAR...
>
>
>
> With Flink 1.4.2, I have no problem at all. Both the WordCount example and
> my custom JAR are working...
>
>
>
> What am I doing wrong?
>
>
>
>
>
>
>
>
