What about setting the parallelism [1] to the total number of slots in your cluster? By default, all parts of your program are put into the same slot sharing group [2], so by setting the parallelism to the total slot count, each slot would hold one parallel instance of your whole pipeline (minus/plus any operators with lower/higher parallelism), if I understand it correctly.
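The reasoning above follows from how Flink counts slots: a job needs as many slots as the sum, over all slot sharing groups, of the highest operator parallelism in each group (with the default single group, that is simply the job's maximum parallelism). A minimal plain-Java sketch of that calculation, with hypothetical group names and parallelism values for the source/map/sink pipeline discussed below:

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class SlotCalc {

    // Required slots = sum over slot sharing groups of the maximum
    // operator parallelism within each group.
    static int requiredSlots(Map<String, List<Integer>> groups) {
        int total = 0;
        for (List<Integer> parallelisms : groups.values()) {
            total += Collections.max(parallelisms);
        }
        return total;
    }

    public static void main(String[] args) {
        // Default case: source (p=1), map (p=2), sink (p=2) all share one group,
        // so the job occupies max(1, 2, 2) = 2 slots.
        Map<String, List<Integer>> oneGroup =
                Map.of("default", List.of(1, 2, 2));
        System.out.println(requiredSlots(oneGroup));
    }
}
```

So a job with parallelism equal to the cluster's total slot count will, by default, claim every slot, which is the effect suggested above.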
Nico

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/parallel.html
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/datastream_api.html#task-chaining-and-resource-groups

On Monday, 27 February 2017 18:17:53 CET Evgeny Kincharov wrote:
> Thanks for your answer.
> The problem is that both slots are seized on the one node, provided that
> node has enough free slots; the other nodes idle. I want to utilize the
> cluster resources a little bit more. Maybe the other deployment modes
> allow it.
>
> BR, Evgeny.
>
> From: Nico Kruber<mailto:n...@data-artisans.com>
> Sent: 27 February 2017, 20:07
> To: user@flink.apache.org<mailto:user@flink.apache.org>
> Cc: Evgeny Kincharov<mailto:evgeny_kincha...@epam.com>
> Subject: Re: Running streaming job on every node of cluster
>
> Hi Evgeny,
> I tried to reproduce your example with the following code, having another
> console listening with "nc -l 12345":
>
> env.setParallelism(2);
> env.addSource(new SocketTextStreamFunction("localhost", 12345, " ", 3))
>     .map(new MapFunction<String, String>() {
>         @Override
>         public String map(final String s) throws Exception { return s; }
>     })
>     .addSink(new DiscardingSink<String>());
>
> This way, I do get a source with parallelism 1 and map & sink with
> parallelism 2, and the whole program occupying 2 slots as expected. You
> can check in the web interface of your cluster how many slots are taken
> after executing one instance of your program.
>
> How do you set your parallelism?
>
>
> Nico
>
> On Monday, 27 February 2017 14:04:21 CET Evgeny Kincharov wrote:
> > Hi,
> >
> > I have the simplest streaming job, and I want to distribute my job on
> > every node of my Flink cluster.
> >
> > The job is simple:
> >
> > source (SocketTextStream) -> map -> sink (AsyncHtttpSink).
> >
> > When I increase the parallelism of my job when deploying or directly in
> > code, there is no effect because the source can't work in parallel.
> > Now I reduce "Task Slots" to 1 on every node and deploy my job as many
> > times as there are nodes in the cluster. It works when I have only one
> > job; if I want to deploy another in parallel, there are no free slots.
> > I hope a more convenient way to do that exists. Thanks.
> >
> > BR,
> > Evgeny
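For reference, the per-node workaround Evgeny describes (one slot per TaskManager) corresponds to setting the following in flink-conf.yaml on each TaskManager; the value shown is illustrative:

```yaml
# One task slot per TaskManager, so each deployed job instance
# lands on a different node (assuming one TaskManager per node).
taskmanager.numberOfTaskSlots: 1
```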