Re: Parallelism questions

Till Rohrmann Tue, 15 Jan 2019 04:59:37 -0800

Hi Alexandru,

at the moment `/jobs/:jobid/rescaling` will always change the parallelism
for all operators. The maximum is the maximum parallelism which you have
defined for an operator.


I agree that it should also be possible to rescale an individual operator.
There internal functionality is already implemented (see
JobMaster#rescaleOperators) but has not been exposed.

Cheers,
Till

On Tue, Jan 15, 2019 at 1:03 PM Alexandru Gutan <alex.guta...@gmail.com>
wrote:

> Thanks Till!
>
> To execute the above (using Kubernetes), one would enter the running
> JobManager service and execute it?
> The following REST API call does the same */jobs/:jobid/rescaling*?
>
> I assume it changes the base parallelism, but what it will do if I had
> already set the parallelism of my operators?
> e.g.
> .source(..)
> .setParallelism(3)
> .setUID(..)
> .map(..)
> .setParallelism(8)
> .setUID(..)
> .sink(..)
> .setParallelism(3)
> .setUID(..)
>
> I think it would be a good idea to have */jobs/:jobid/rescaling,* additionally
> requiring the *operatorUID* as a queryParameter*, *so that the
> parallelism of specific operators could be changed.
>
> Best,
> Alex.
>
> On Tue, 15 Jan 2019 at 10:27, Till Rohrmann <trohrm...@apache.org> wrote:
>
>> Hi Alexandru,
>>
>> you can use the `modify` command `bin/flink modify <JOB_ID> --parallelism
>> <PARALLELISM>` to modify the parallelism of a job. At the moment, it is
>> implemented as first taking a savepoint, stopping the job and then
>> redeploying the job with the changed parallelism and resuming from the
>> savepoint.
>>
>> Cheers,
>> Till
>>
>> On Mon, Jan 14, 2019 at 4:21 PM Dawid Wysakowicz <dwysakow...@apache.org>
>> wrote:
>>
>>> Hi Alexandru
>>>
>>> As for 2, generally speaking the number of required slots depends on
>>> number of slot sharing groups. By default all operators belong to the
>>> default slot sharing group, that means a job requires as many slots as
>>> maximal parallelism in the job. More on the distributed runtime you can
>>> read here[1]
>>>
>>> As for 1 I cc'ed Gary and Till who might better answer your question.
>>>
>>> [1]
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.7/concepts/runtime.html#task-slots-and-resources
>>>
>>> Best,
>>>
>>> Dawid
>>> On 14/01/2019 15:26, Alexandru Gutan wrote:
>>>
>>> Hi everyone!
>>>
>>> 1. Is there a way to increase the parallelism (e.g. through REST) of
>>> some operators in a job without re-deploying the job? I found this
>>> <https://stackoverflow.com/questions/50719147/apache-flink-guideliness-for-setting-parallelism>
>>> answer which mentions scaling at runtime on Yarn/Mesos. Is it possible?
>>> How? Support for Kubernetes?
>>> 2. What happens when the number of parallel operator instances exceeds
>>> the number of task slots? For example: a job with a source (parallelism 3),
>>> a map (parallelism 8), a sink (parallelism 3), total of *14* operator
>>> instances and a setup with *8* task slots. Will the operators get
>>> chained? What if I disable operator chaining?
>>>
>>> Thank you!
>>>
>>>

Re: Parallelism questions

Reply via email to