RE: Flink logs only written to one host

2015-03-12 Thread Stephan Ewen
Hi Emmanuel! I think there is currently no way to tell the scheduler to pick the "least loaded host". That would be feature we would need to add to Flink. If you want, open a feature request, and give us some more information on how you would expect this to behave. Greetings, Stephan Am 12.03.2

RE: Flink logs only written to one host

2015-03-12 Thread Emmanuel
Thanks for the clarification Guyla, I had 9 slots per node, however one node only has 3 CPUs.So, my parallelism here was 9, and all tasks were allocated to the 9 slots on the one host I understand the strategy of trying to minimize network IOs by sending to the same host, but in this case where t

Re: Flink logs only written to one host

2015-03-12 Thread Gyula Fóra
Hey, Let me clarify some things regarding distribute(). You should only specify a partitioning scheme like distribute or shuffle or groupby in cases when it actually matters for you how the data is partitioned across operator instances. By default forwarding is applied which works in the followin

RE: Flink logs only written to one host

2015-03-12 Thread Emmanuel
Hi, I did change my config to have parallelism of 9 and 3 slots on each machine and now it does distribute properly. The other day I was told i could have many more slots than CPUs available and the system would distribute the load properly between the hosts with available CPU time, but it doesn

RE: Flink logs only written to one host

2015-03-12 Thread Fabian Hueske
Can you check the JM log file how many slots are available? Slots are configured per TM. If you configure 9 slots and 3 TMs you end up with 27 slots, 9 on each TM. On Mar 12, 2015 7:55 PM, "Emmanuel" wrote: > It appears actually that the slots used are all on the same host. > My guess is because

RE: Flink logs only written to one host

2015-03-12 Thread Emmanuel
It appears actually that the slots used are all on the same host.My guess is because I am using the default partitioning method (forward, which defaults to the same host) However I now tried .shuffle() and .distribute() without any luck: I have a DataStream text = env.socketTextStream(inHostName

Flink logs only written to one host

2015-03-12 Thread Emmanuel
Hello, I'm using a 3 nodes (3VMs) cluster, 3CPUs each, parallelism of 9, I usually only see taskmanager.out logs generated only on one of the 3 nodes when I use the System.out.println() method, to print debug info in my main processing function. Is this expected? Or am I just doing something wro

Fwd: Flink questions

2015-03-12 Thread Márton Balassi
Dear Emmanuel, I'm Marton, one of the Flink Streaming developers - Robert forwarded your issue to me. Thanks for trying out our project. 1) Debugging: TaskManager logs are currently not forwarded to the UI, but you can find them on the taskmanager machines in the log folder of your Flink distribu

Re: IDE indicates the data type error of my Filter operator in Scala

2015-03-12 Thread HungChang
Got it. Problem solved by changing the "map" val pldIndex = GraphUtils.readVertices(PLDIndexFile).map { vertex => AnnotatedVertex(vertex.annotation, vertex.id) } -- View this message in context: http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/IDE-indicates-the-dat