Update: the previous error was probably caused by my not restarting the cluster before re-running the job (maybe).
Then I tried to run the program on a one-node cluster on my laptop and, after solving a few minor issues, everything works fine. Now I'm trying to deploy the same jar on the real cluster. Initially everything seems to work correctly.

giordano@giordano-2-2-100-1:~$ ./flink-1.3.2/bin/flink run flink-java-project-0.1.jar
Cluster configuration: Standalone cluster with JobManager at localhost/127.0.0.1:6123
Using address localhost:6123 to connect to JobManager.
JobManager web interface address http://localhost:8081
Starting execution of program
Submitting job with JobID: 161d91dda7c7012c8f48fa8a104a1662. Waiting for job completion.
Connected to JobManager at Actor[akka.tcp://flink@localhost:6123/user/jobmanager#430534598] with leader session id 00000000-0000-0000-0000-000000000000.
09/14/2017 22:05:00 Job execution switched to status RUNNING.
09/14/2017 22:05:00 Source: Custom Source -> Timestamps/Watermarks(1/1) switched to SCHEDULED
09/14/2017 22:05:00 Map -> Sink: Unnamed(1/1) switched to SCHEDULED
09/14/2017 22:05:00 Learn -> Select -> Process -> (Sink: Cassandra Sink, Sink: Cassandra Sink)(1/1) switched to SCHEDULED
09/14/2017 22:05:00 Source: Custom Source -> Timestamps/Watermarks(1/1) switched to DEPLOYING
09/14/2017 22:05:00 Map -> Sink: Unnamed(1/1) switched to DEPLOYING
09/14/2017 22:05:00 Learn -> Select -> Process -> (Sink: Cassandra Sink, Sink: Cassandra Sink)(1/1) switched to DEPLOYING
09/14/2017 22:05:01 Source: Custom Source -> Timestamps/Watermarks(1/1) switched to RUNNING
09/14/2017 22:05:01 Map -> Sink: Unnamed(1/1) switched to RUNNING
09/14/2017 22:05:01 Learn -> Select -> Process -> (Sink: Cassandra Sink, Sink: Cassandra Sink)(1/1) switched to RUNNING

Unfortunately, after about a minute, the job fails:

09/14/2017 22:06:53 Map -> Sink: Unnamed(1/1) switched to FAILED
java.lang.Exception: TaskManager was lost/killed: 413eda6bf77223085c59e104680259bc @ giordano-2-2-100-1 (dataPort=36498)
    at org.apache.flink.runtime.instance.SimpleSlot.releaseSlot(SimpleSlot.java:217)
    at org.apache.flink.runtime.instance.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:533)
    at org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:192)
    at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:167)
    at org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:212)
    at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$handleTaskManagerTerminated(JobManager.scala:1228)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:1131)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:49)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
    at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
    at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
    at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:125)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:44)
    at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369)
    at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501)
    at akka.actor.ActorCell.invoke(ActorCell.scala:486)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
09/14/2017 22:06:53 Job execution switched to status FAILING.
java.lang.Exception: TaskManager was lost/killed: 413eda6bf77223085c59e104680259bc @ giordano-2-2-100-1 (dataPort=36498)
    at org.apache.flink.runtime.instance.SimpleSlot.releaseSlot(SimpleSlot.java:217)
    at org.apache.flink.runtime.instance.SlotSharingGroupAssignment.releaseSharedSlot(SlotSharingGroupAssignment.java:533)
    at org.apache.flink.runtime.instance.SharedSlot.releaseSlot(SharedSlot.java:192)
    at org.apache.flink.runtime.instance.Instance.markDead(Instance.java:167)
    at org.apache.flink.runtime.instance.InstanceManager.unregisterTaskManager(InstanceManager.java:212)
    at org.apache.flink.runtime.jobmanager.JobManager.org$apache$flink$runtime$jobmanager$JobManager$$handleTaskManagerTerminated(JobManager.scala:1228)
    at org.apache.flink.runtime.jobmanager.JobManager$$anonfun$handleMessage$1.applyOrElse(JobManager.scala:1131)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:49)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36)
    at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33)
    at org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28)
    at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
    at org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
    at org.apache.flink.runtime.jobmanager.JobManager.aroundReceive(JobManager.scala:125)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:44)
    at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369)
    at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501)
    at akka.actor.ActorCell.invoke(ActorCell.scala:486)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
09/14/2017 22:06:53 Source: Custom Source -> Timestamps/Watermarks(1/1) switched to CANCELING
09/14/2017 22:06:53 Learn -> Select -> Process -> (Sink: Cassandra Sink, Sink: Cassandra Sink)(1/1) switched to CANCELING
09/14/2017 22:06:53 Source: Custom Source -> Timestamps/Watermarks(1/1) switched to CANCELED
09/14/2017 22:06:53 Learn -> Select -> Process -> (Sink: Cassandra Sink, Sink: Cassandra Sink)(1/1) switched to CANCELED

The job is then restarted, but it fails again with the NoResourceAvailable error.

I start the cluster with the start-cluster.sh script, and it comes up fine, starting the task managers on the other nodes as well.

On every node I set the number of task slots equal to the number of cores (2), while the parallelism key is commented out.

On the master node (which acts as both jobmanager and taskmanager) I set:

jobmanager.heap.mb: 756
taskmanager.heap.mb: 756

(I have 2GB of RAM on it), while on the other two nodes I set:

taskmanager.heap.mb: 1512

(I have 2GB of RAM on them).

Hints?

--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/