Hi Till,

Thank you for following up.

We were trying to set up S3 file sinks and RocksDB with S3 checkpointing. We
upgraded to Flink 1.11 and attempted to run the job on EMR.
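
In case it is useful to anyone who finds this thread later, a minimal
flink-conf.yaml sketch for that kind of setup might look like the following
(bucket name and paths are placeholders, and it assumes the s3a filesystem is
available either through the EMR Hadoop classpath or a Flink S3 filesystem
plugin such as flink-s3-fs-hadoop):

    # Use RocksDB for state, with incremental checkpoints
    state.backend: rocksdb
    state.backend.incremental: true
    # Checkpoint and savepoint locations on S3 (placeholder bucket)
    state.checkpoints.dir: s3a://my-bucket/flink/checkpoints
    state.savepoints.dir: s3a://my-bucket/flink/savepoints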

On startup, the logs showed an error that the flink-conf.yaml could not be
found. I tried to troubleshoot the command line parameters, but the
documentation was very confusing to me.

My co-worker fixed the issue. It turns out that the Hadoop configuration files
in EMR were not set up to work with the s3a protocol out of the box. Once we
placed the correct values in the Hadoop configuration file, everything worked.
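
For anyone hitting the same problem, the properties involved are the standard
hadoop-aws s3a settings in core-site.xml. A rough sketch (the values are
placeholders, and the exact set you need depends on how credentials are
provided on your cluster):

    <!-- Tell Hadoop which filesystem implementation backs the s3a:// scheme -->
    <property>
      <name>fs.s3a.impl</name>
      <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
    </property>
    <!-- Assumption: the default credentials chain (env vars / instance profile) is sufficient -->
    <property>
      <name>fs.s3a.aws.credentials.provider</name>
      <value>com.amazonaws.auth.DefaultAWSCredentialsProviderChain</value>
    </property>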

Marco A. Villalobos



> On Nov 13, 2020, at 7:32 AM, Till Rohrmann <trohrm...@apache.org> wrote:
> 
> Hi Marco,
> 
> as Klou said, -m yarn-cluster should try to deploy a Yarn per job cluster on 
> your Yarn cluster. Could you maybe share a bit more details about what is 
> going wrong? E.g. the cli logs could be helpful to pinpoint the problem.
> 
> I've tested that both `bin/flink run -m yarn-cluster 
> examples/streaming/WindowJoin.jar` as well as `bin/flink run -t yarn-per-job 
> examples/streaming/WindowJoin.jar` start a Flink per-job cluster.
> 
> What was -yna supposed to do? -ynm should set the custom name of the Yarn 
> application.
> 
> @kkloudas should we maybe improve the existing 
> documentation to better reflect the usage of -t/--target? The CLI 
> documentation [1] does not include a single example where we use the target 
> option. Moreover, we could think about retiring -m yarn-cluster in favour of 
> -t yarn-per-job. In addition, should we document somewhere which 
> `execution.target` values are supported? What do you think?
> 
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html#job-submission-examples
> 
> Cheers,
> Till
> 
> On Tue, Nov 10, 2020 at 4:00 PM Kostas Kloudas <kklou...@gmail.com> wrote:
> Hi Marco,
> 
> I agree with you that the -m help message is misleading but I do not
> think it has changed between releases.
> You can specify the address of the jobmanager or, for example, you can
> put "-m yarn-cluster" and depending on your environment setup Flink
> will pick up a session cluster or will create a per-job cluster.
> This was always the case.
> 
> For the -t and -e the change is that -e was deprecated (although still
> active) in favour of -t. But it still has the same meaning.
> 
> Finally on how to run Flink on EMR, I am not an expert so I will pull
> in Till who may have some input.
> 
> Cheers,
> Kostas
> 
> On Mon, Nov 9, 2020 at 10:46 PM Marco Villalobos
> <mvillalo...@kineteque.com> wrote:
> >
> > The flink CLI documentation says that the -m option is to specify the job 
> > manager.
> >
> > but the examples are passing in an execution target.  I am quite confused 
> > by this.
> >
> > ./bin/flink run -m yarn-cluster \
> >                        ./examples/batch/WordCount.jar \
> >                        --input hdfs:///user/hamlet.txt --output hdfs:///user/wordcount_out
> >
> >
> > So what is it?
> >
> > I am trying to run Flink in EMR 6.1.0 but I have failed.
> >
> > It appears as though some of the command line parameters changed from 
> > version 1.10 to 1.11.
> >
> > For example, -yna is now -ynm.
> >
> > -e is now -t.
> >
> > But I am still confused by the -m option in both versions of the documentation.
> >
> > Can somebody please explain?
> >
