Hi Yiannis! You need to increase the number of buffers for your setup. Here is a FAQ entry with a few pointers:
http://flink.apache.org/docs/0.8/faq.html#i-get-an-error-message-saying-that-not-enough-buffers-are-available-how-do-i-fix-this Greetings, Stephan Am 18.02.2015 21:21 schrieb "Yiannis Gkoufas" <johngou...@gmail.com>: > Hi there, > > I have a cluster of 10 nodes with 12 CPUs each. > This is my configuration: > > jobmanager.rpc.port: 6123 > > jobmanager.heap.mb: 4024 > > taskmanager.heap.mb: 8096 > > taskmanager.numberOfTaskSlots: 12 > > parallelization.degree.default: 120 > > I have been getting the following error: > > java.lang.Exception: Failed to deploy the task Reduce (SUM(1)) (65/120) - > execution #0 to slot SimpleSlot (1)(0) - efc370a0b2a9a63f2e7b960cfe4e4c27 - > ALLOCATED/ALIVE: java.io.IOException: Insufficient number of network > buffers: required 120, but only 2 of 2048 available. > at > org.apache.flink.runtime.io.network.buffer.NetworkBufferPool.createBufferPool(NetworkBufferPool.java:155) > at > org.apache.flink.runtime.io.network.NetworkEnvironment.registerTask(NetworkEnvironment.java:163) > at org.apache.flink.runtime.taskmanager.TaskManager.org > $apache$flink$runtime$taskmanager$TaskManager$$submitTask(TaskManager.scala:426) > at > org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$receiveWithLogMessages$1.applyOrElse(TaskManager.scala:261) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) > at > scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) > at > org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:37) > at > org.apache.flink.runtime.ActorLogMessages$$anon$1.apply(ActorLogMessages.scala:30) > at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:118) > at > org.apache.flink.runtime.ActorLogMessages$$anon$1.applyOrElse(ActorLogMessages.scala:30) > at akka.actor.Actor$class.aroundReceive(Actor.scala:465) > at > org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:89) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) > at akka.actor.ActorCell.invoke(ActorCell.scala:487) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) > at akka.dispatch.Mailbox.run(Mailbox.scala:221) > at akka.dispatch.Mailbox.exec(Mailbox.scala:231) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > > at > org.apache.flink.runtime.executiongraph.Execution$2.onComplete(Execution.java:344) > at akka.dispatch.OnComplete.internal(Future.scala:247) > at akka.dispatch.OnComplete.internal(Future.scala:244) > at akka.dispatch.japi$CallbackBridge.apply(Future.scala:174) > at akka.dispatch.japi$CallbackBridge.apply(Future.scala:171) > at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32) > at > scala.concurrent.impl.ExecutionContextImpl$$anon$3.exec(ExecutionContextImpl.scala:107) > at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > > > I failed to get any info online on how to solve it. > Any help would be welcome. > > Thank you! >