Yes, that's the way to go at the moment.

Cheers,
Till
On Thu, Jun 30, 2016 at 12:47 PM, Márton Balassi <balassi.mar...@gmail.com> wrote:

> Hi Josh,
>
> Yes, currently that is a reasonable workaround.
>
> Best,
>
> Marton
>
> On Thu, Jun 30, 2016 at 12:38 PM, Josh <jof...@gmail.com> wrote:
>
>> Hi Till,
>>
>> Thanks, that's very helpful! So I guess in that case, since it isn't
>> possible to increase the job parallelism later, it might be sensible to
>> use, say, 10x the parallelism that I need right now (even if only running
>> on a couple of task managers), so that it's possible to scale the job in
>> the future if I need to?
>>
>> Josh
>>
>> On Thu, Jun 30, 2016 at 11:17 AM, Till Rohrmann <trohrm...@apache.org> wrote:
>>
>>> Hi Josh,
>>>
>>> At the moment it is not possible to dynamically increase the parallelism
>>> of your job. The same holds true for restarting a job from a savepoint.
>>> But we're currently working on exactly this. So in order to change the
>>> parallelism of your job, you would have to restart the job from scratch.
>>>
>>> Adding task managers dynamically to your running Flink cluster is
>>> possible if you allocate new YARN containers and then start a TaskManager
>>> process manually with the current job manager address and port. You can
>>> find the address and port either in the web dashboard under the job
>>> manager configuration, or in the .yarn-properties file stored in the temp
>>> directory on your machine, which also contains the job manager address.
>>> The easier way, however, would be to stop your YARN session and then
>>> restart it with an increased number of containers, because then you
>>> wouldn't have to ship the lib directory, which might contain user code
>>> classes, manually.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Wed, Jun 29, 2016 at 10:13 PM, Josh <jof...@gmail.com> wrote:
>>>
>>>> I'm running a Flink cluster as a YARN application, started by:
>>>>
>>>> ./bin/yarn-session.sh -n 4 -jm 2048 -tm 4096 -d
>>>>
>>>> There are 2 worker nodes, so each is allocated 2 task managers. There
>>>> is a stateful Flink job running on the cluster with a parallelism of 2.
>>>>
>>>> If I now want to increase the number of worker nodes to 3, add 2 extra
>>>> task managers, and then increase the job parallelism, how should I do
>>>> this?
>>>>
>>>> I'm using EMR, so adding an extra worker node and making it available
>>>> to YARN is easy to do via the AWS console. But I haven't been able to
>>>> find any information in the Flink docs about how to resize a running
>>>> Flink cluster on YARN. Is it possible to resize it while the YARN
>>>> application is running, or do I need to stop the YARN application and
>>>> redeploy the cluster? Also, do I need to redeploy my Flink job from a
>>>> savepoint to increase its parallelism, or can I do this while the job
>>>> is running?
>>>>
>>>> I tried redeploying the cluster after adding a third worker node, via:
>>>>
>>>> yarn application -kill myflinkcluster
>>>> ./bin/yarn-session.sh -n 6 -jm 2048 -tm 4096 -d
>>>>
>>>> (note the increase in the number of task managers from 4 to 6)
>>>>
>>>> Surprisingly, this redeployed a Flink cluster with 4 task managers (not
>>>> 6!) and restored my job from the last checkpoint.
>>>>
>>>> Can anyone point me in the right direction?
>>>>
>>>> Thanks,
>>>>
>>>> Josh
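[Editor's note] The "stop the session and restart it with more containers" path that Till recommends can be sketched roughly as below. The yarn-session.sh flags follow the usage shown in the thread; the savepoint and resubmit steps are an assumption about how you would carry job state across the restart (all IDs and paths are placeholders, and per Till's answer, the parallelism could not yet be changed on restore at the time of this thread):

```shell
# 1. Take a savepoint of the running job so its state can be restored
#    after the restart (<jobID> is a placeholder).
./bin/flink savepoint <jobID>

# 2. Stop the running YARN session (<applicationID> is a placeholder).
yarn application -kill <applicationID>

# 3. Restart the session with more TaskManager containers (4 -> 6),
#    same flags as in the thread.
./bin/yarn-session.sh -n 6 -jm 2048 -tm 4096 -d

# 4. Resubmit the job from the savepoint. Note: at the time of this
#    thread the job's parallelism could NOT be increased on restore,
#    hence Josh's workaround of over-provisioning parallelism up front.
./bin/flink run -s <savepointPath> your-job.jar
```

These are deployment commands against a live YARN cluster, so treat them as an outline rather than a copy-paste recipe.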
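[Editor's note] Till also mentions looking up the job manager details in the .yarn-properties file in the local temp directory. As a rough illustration of inspecting such a file, the snippet below builds a sample properties file and greps a key out of it; the file contents and key names here are hypothetical, since the exact keys vary by Flink version:

```shell
# Illustration only: write a sample .yarn-properties-style file with
# hypothetical contents (real keys and values vary by Flink version).
cat > /tmp/sample-yarn-properties <<'EOF'
#Generated YARN properties file
applicationID=application_1467280000000_0001
parallelism=4
EOF

# Pull out the parallelism the session was (hypothetically) started with.
grep '^parallelism' /tmp/sample-yarn-properties
```

On a real installation you would grep the actual .yarn-properties file in your temp directory instead of a fabricated sample.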