It appears that the slots being used are all on the same host. My guess is that this is because I am using the default partitioning method (forward, which keeps records on the same host). However, I have now tried .shuffle() and .distribute() without any luck. I have

    DataStream<String> text = env.socketTextStream(inHostName, inPort);

as the single socket input stream. Adding

    text.distribute().map(...)

does not seem to distribute the .map() processing across the other hosts. Is this the correct way to use .distribute() on a stream input?

Thanks,
Emmanuel
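A minimal sketch of the pipeline described above, assuming a newer Flink DataStream API where the round-robin redistribution is exposed as rebalance() rather than distribute(); the hostname, port, parallelism, and map logic are placeholders, not taken from the thread:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SocketRebalanceSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(9); // assumption: 3 nodes x 3 slots, as in the thread

            // "localhost" and 9999 stand in for inHostName / inPort
            DataStream<String> text = env.socketTextStream("localhost", 9999);

            text.rebalance() // round-robin redistribution from the single source subtask
                .map(new MapFunction<String, String>() {
                    @Override
                    public String map(String value) {
                        return value.toUpperCase(); // placeholder processing, runs in 9 parallel subtasks
                    }
                })
                .print();

            env.execute("socket rebalance sketch");
        }
    }

The socket source itself always runs with parallelism 1, so only the operators after the repartitioning step can be spread across the 9 slots.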
From: ele...@msn.com
To: user@flink.apache.org
Subject: Flink logs only written to one host
Date: Thu, 12 Mar 2015 17:30:28 +0000

Hello,

I'm using a 3-node cluster (3 VMs, 3 CPUs each) with a parallelism of 9. When I print debug info from my main processing function with System.out.println(), I usually see taskmanager.out output generated on only one of the 3 nodes. Is this expected, or am I doing something wrong?

I stream from a socket with socketTextStream. I understand that this source runs as a single process, and I see that in the UI (it uses one slot only), but the computation task runs on 9 slots. That task includes the System.out.println() statement, yet its output only shows up in one host's .out log. The host is not always the same, so I have to tail the logs on all hosts, and I'm surprised by this behavior.

Am I missing something? Are print statements to stdout aggregated on one host somehow? If so, how is this controlled, and why would that host change? I would also like to know whether the 9 slots might all be running on a single host, which would defeat the purpose.

Thanks for the insight,
Emmanuel
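A minimal diagnostic sketch, assuming a RichMapFunction on a recent Flink DataStream API: each parallel instance prints its subtask index and the local hostname, so the taskmanager.out file on each node shows exactly which subtasks it is running. The class name and the hostname lookup are illustrative, not from the thread.

    import java.net.InetAddress;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;

    // Prints which subtask and host each record is processed on; stdout goes to
    // the taskmanager.out of whichever TaskManager runs this subtask.
    public class WhereAmIMapper extends RichMapFunction<String, String> {
        private int subtask;
        private String host;

        @Override
        public void open(Configuration parameters) throws Exception {
            subtask = getRuntimeContext().getIndexOfThisSubtask();
            host = InetAddress.getLocalHost().getHostName();
        }

        @Override
        public String map(String value) {
            System.out.println("subtask " + subtask + " on " + host + ": " + value);
            return value;
        }
    }

If all nine subtask indices show up in a single node's .out file, the slots really are concentrated on one host; if they are spread across the three files, stdout is simply not aggregated anywhere.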