To clarify … it's 64 HT cores per node, 16 nodes each with 128GB. Well, actually I have 48 nodes … but I'm trying to limit it so we have a comparison with Spark/MPI/MapReduce all at the same node count.
Thanks for the information.
--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

On 6/19/15 9:44 AM, "Ufuk Celebi" <[email protected]> wrote:

> PS: I've read your last email as 64 HT cores per machine. If it was in total over the 16 nodes, you have to adjust my response accordingly. ;)
>
> On 19 Jun 2015, at 16:42, Fabian Hueske <[email protected]> wrote:
>
>> Hi Bill,
>>
>> no worry, questions are the purpose of this mailing list.
>>
>> The number of network buffers is a parameter that needs to be scaled with your setup. The reason for that is Flink's pipelined data transfer, which requires a certain number of network buffers to be available at the same time during processing.
>>
>> There is an FAQ entry that explains how to set this parameter according to your setup:
>> --> http://flink.apache.org/faq.html#i-get-an-error-message-saying-that-not-enough-buffers-are-available-how-do-i-fix-this
>>
>> The documentation for parallel execution can be found here:
>> http://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#parallel-execution
>>
>> If you are working on the latest snapshot, you can also configure Flink to use batched data transfer instead of pipelined transfer. This is done via ExecutionConfig.setExecutionMode(), which you obtain by calling getConfig() on your ExecutionEnvironment.
>>
>> Best, Fabian
>>
>>
>> 2015-06-19 16:31 GMT+02:00 Maximilian Michels <[email protected]>:
>> Hi Bill,
>>
>> You're right. Simply increasing the task manager slots doesn't do anything. It is correct to set the parallelism to taskManagers*slots. Simply increase the number of network buffers in the flink-conf.yaml, e.g. to 4096. In the future, we will configure this setting dynamically.
>>
>> Let us know if your runtime decreases :)
>>
>> Cheers,
>> Max
>>
>> On Fri, Jun 19, 2015 at 4:24 PM, Bill Sparks <[email protected]> wrote:
>>
>> Sorry for the post again.
>> I guess I'm not understanding this…
>>
>> The question is how to scale up/increase the execution of a problem. What I'm trying to do is get the best out of the available processors for a given node count and compare this against Spark, using KMeans.
>>
>> For Spark, one method is to increase the executors and RDD partitions; for Flink, I can increase the number of task slots (taskmanager.numberOfTaskSlots). My empirical evidence suggests that just increasing the slots does not increase processing of the data. Is there something I'm missing? Much like re-partitioning your datasets in Spark, is there an equivalent option for Flink? What about the parallelism argument? The referring document seems to be broken…
>>
>> This seems to be a dead link:
>> https://github.com/apache/flink/blob/master/docs/setup/%7B%7Bsite.baseurl%7D%7D/apis/programming_guide.html#parallel-execution
>>
>> If I do increase the parallelism to be (taskManagers*slots), I hit the "Insufficient number of network buffers…"
>>
>> I have 16 nodes (64 HT cores) and have run TaskSlots of 1, 4, 8, 16, and still the execution time is always around 5-6 minutes, using the default parallelism.
>>
>> Regards,
>> Bill
>> --
>> Jonathan (Bill) Sparks
>> Software Architecture
>> Cray Inc.
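[Editor's note: the FAQ entry referenced in this thread gave a rule of thumb for sizing the network buffer pool, roughly slots-per-TaskManager² × #TaskManagers × 4. The sketch below checks that formula against the setup discussed here (16 TaskManagers); the formula is quoted from the Flink docs of that era and the helper name is my own.]

```python
# Rule of thumb from the Flink FAQ linked above (circa Flink 0.9):
#   buffers ~= slots_per_tm ** 2 * num_taskmanagers * 4
def required_network_buffers(slots_per_tm: int, num_taskmanagers: int) -> int:
    """Approximate buffer count needed so every slot can pipeline to every other slot."""
    return slots_per_tm ** 2 * num_taskmanagers * 4

# 16 nodes, one TaskManager each, at the slot counts Bill tried:
for slots in (1, 4, 8, 16, 64):
    print(slots, required_network_buffers(slots, 16))
```

Note that 4096 (Max's suggested value) covers 8 slots per TaskManager in this formula; running all 64 HT cores as slots would require far more buffers, which is consistent with the "Insufficient number of network buffers" error appearing once the parallelism is raised to taskManagers*slots.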

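[Editor's note: pulling the advice in this thread together, the relevant flink-conf.yaml entries would look something like the sketch below. Key names are as of the Flink version discussed here (taskmanager.numberOfTaskSlots is confirmed in the thread); the concrete values are illustrative for the 16-node setup, not a tested configuration.]

```yaml
# flink-conf.yaml — illustrative values for 16 nodes, 64 HT cores each
taskmanager.numberOfTaskSlots: 64          # slots per TaskManager
parallelism.default: 1024                  # taskManagers * slots = 16 * 64
taskmanager.network.numberOfBuffers: 4096  # raise further if "Insufficient number
                                           # of network buffers" persists
```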