Re: Modifying start-cluster scripts to efficiently spawn multiple TMs

2016-07-10 Thread Saliya Ekanayake
pdsh is available on the head node only, but when I tried to run *start-cluster* from the head node (note: the JobManager node is not the head node) it didn't work, which is why I modified the scripts. Yes, exactly, this is what I was trying to do. My research area has been on these NUMA-related issues and binding

Re: Parameters to Control Intra-node Parallelism

2016-07-10 Thread Saliya Ekanayake
Greg, where did you see the OOM log shown in this mail thread? In my case neither the TaskManagers nor the JobManager reports an error like this. On Sun, Jul 10, 2016 at 8:45 PM, Greg Hogan wrote: > These symptoms sound similar to what I was experiencing in the following > thread. Flink can have

Re: Parameters to Control Intra-node Parallelism

2016-07-10 Thread Greg Hogan
These symptoms sound similar to what I was experiencing in the following thread. Flink can have some unexpected memory usage, which can result in an OOM kill by the kernel, and this becomes more pronounced as the cluster size grows. https://www.mail-archive.com/dev@flink.apache.org/msg06346.html
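When the kernel's OOM killer is involved, a common mitigation is to give the TaskManager JVM noticeably less than the machine's physical memory, leaving headroom for off-heap allocations and the OS. A hedged sketch for a Flink 1.0-era flink-conf.yaml (the value is purely illustrative and depends on the machine):

```
# flink-conf.yaml (Flink 1.0.x-era key; value illustrative)
# Keep the JVM heap well below physical RAM so off-heap usage
# and the kernel itself have headroom.
taskmanager.heap.mb: 24576
```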

Re: Modifying start-cluster scripts to efficiently spawn multiple TMs

2016-07-10 Thread Greg Hogan
Hi Saliya, would you happen to have pdsh (parallel distributed shell) installed? If so, the TaskManager startup in start-cluster.sh will run in parallel. As to running 24 TaskManagers together, are these running across multiple NUMA nodes? I had filed FLINK-3163 ( https://issues.apache.org/jira/br

Re: Modifying start-cluster scripts to efficiently spawn multiple TMs

2016-07-10 Thread Saliya Ekanayake
Thank you. Yes, the previous format is still supported; this change kicks in only if a number is specified after the hostname. On Sun, Jul 10, 2016 at 5:42 PM, Gyula Fóra wrote: > Hi, > > I think this would be a nice addition especially for Flink clusters > running on big machines wh
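The format change being discussed appears to be: the conf/slaves file keeps one hostname per line as before, optionally followed by the number of TaskManagers to start on that host. A hypothetical example (the exact syntax is from this proposed patch, not stock Flink):

```
# conf/slaves -- hostname [number-of-taskmanagers]
node-01       # unchanged: starts one TaskManager
node-02 4     # starts four TaskManagers on node-02
```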

Re: Modifying start-cluster scripts to efficiently spawn multiple TMs

2016-07-10 Thread Gyula Fóra
Hi, I think this would be a nice addition, especially for Flink clusters running on big machines where you might want to run multiple TaskManagers just to split the memory between multiple Java processes. In any case, the previous config format should also be supported as the default. I am curiou

Re: Record has Long.MIN_VALUE timestamp (= no timestamp marker). Is the time characteristic set to 'ProcessingTime', or did you forget to call 'DataStream.assignTimestampsAndWatermarks(...)'?

2016-07-10 Thread Stephan Ewen
Hi David! Have you had a look at the docs for Event Time and Watermark Generation? There are some examples for some typical cases: Event Time / Watermark Overview: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/event_time.html Typical Watermark Generators: https://ci.apac
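The watermark generators in the linked docs include a common "bounded out-of-orderness" pattern: the watermark trails the largest timestamp seen so far by a fixed bound. A minimal, Flink-independent sketch of that logic (class and method names here are illustrative, not Flink's actual API):

```java
// Sketch of the bounded-out-of-orderness watermark rule from the linked
// docs: the watermark trails the largest timestamp seen so far by a fixed
// bound. Names are illustrative, not Flink's API.
public class BoundedOutOfOrdernessSketch {
    private final long maxOutOfOrderness;           // allowed lateness in ms
    private long maxSeenTimestamp = Long.MIN_VALUE; // largest timestamp so far

    public BoundedOutOfOrdernessSketch(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    // Called for every element; tracks the largest timestamp observed.
    public void onElement(long eventTimestamp) {
        maxSeenTimestamp = Math.max(maxSeenTimestamp, eventTimestamp);
    }

    // Called periodically; the watermark promises that no further element
    // should arrive with a timestamp at or below this value.
    public long currentWatermark() {
        return maxSeenTimestamp == Long.MIN_VALUE
                ? Long.MIN_VALUE
                : maxSeenTimestamp - maxOutOfOrderness;
    }

    public static void main(String[] args) {
        BoundedOutOfOrdernessSketch wm = new BoundedOutOfOrdernessSketch(1000);
        wm.onElement(5000);
        wm.onElement(4200);  // out-of-order element; does not move the max
        System.out.println(wm.currentWatermark());  // 5000 - 1000 = 4000
    }
}
```

Out-of-order elements never move the watermark backwards, which is why it is safe as a progress marker.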

The two inputs have different execution contexts.

2016-07-10 Thread Alieh Saeedi
I cannot join or coGroup two Tuple2 datasets of the same type. The error is java.lang.IllegalArgumentException: The two inputs have different execution contexts. :-(
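In the DataSet API this error typically means the two inputs were created from two different ExecutionEnvironment instances; creating both datasets from the same environment avoids it. A toy model of why the check fires (this is not Flink's code; all names are illustrative):

```java
import java.util.List;

// Toy model of the "different execution contexts" check: each dataset
// remembers the environment it was created from, and a binary operation
// rejects inputs whose environments differ. Not Flink's code.
public class ContextCheckSketch {
    static class Env { }

    static class DataSetSketch<T> {
        final Env context;
        final List<T> data;

        DataSetSketch(Env context, List<T> data) {
            this.context = context;
            this.data = data;
        }

        // A join-like operation requires both inputs from the same environment.
        <U> void join(DataSetSketch<U> other) {
            if (this.context != other.context) {
                throw new IllegalArgumentException(
                        "The two inputs have different execution contexts.");
            }
        }
    }

    public static void main(String[] args) {
        Env env1 = new Env();
        Env env2 = new Env();
        DataSetSketch<Integer> a = new DataSetSketch<>(env1, List.of(1, 2));
        DataSetSketch<Integer> b = new DataSetSketch<>(env2, List.of(3, 4));
        try {
            a.join(b);  // different environments: throws
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        // Creating both datasets from the same environment avoids the error.
        DataSetSketch<Integer> c = new DataSetSketch<>(env1, List.of(3, 4));
        a.join(c);  // ok
    }
}
```

The practical fix is usually to call ExecutionEnvironment.getExecutionEnvironment() once and build both datasets from that single instance.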

Re: Record has Long.MIN_VALUE timestamp (= no timestamp marker). Is the time characteristic set to 'ProcessingTime', or did you forget to call 'DataStream.assignTimestampsAndWatermarks(...)'?

2016-07-10 Thread David Olsen
Changing to env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime) and removing assignTimestampsAndWatermarks(new MyTimestampExtractor) gets the code executing now. One more question: I read the Javadoc[1], and it seems a watermark is a mark telling operators that no more elements will arriv
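That reading of the Javadoc is right: a watermark at time t promises that no more elements with timestamp at or below t should arrive, so any element that does arrive behind the watermark is treated as late. A tiny illustration of that rule (not Flink code; the method name is made up):

```java
// Illustration of the watermark contract: a watermark at time t promises
// that no more elements with timestamp <= t should arrive, so an element
// behind the watermark counts as late. Not Flink code.
public class LatenessSketch {
    static boolean isLate(long elementTimestamp, long currentWatermark) {
        // The watermark already promised no such element would come.
        return elementTimestamp <= currentWatermark;
    }

    public static void main(String[] args) {
        long watermark = 10_000;
        System.out.println(isLate(9_500, watermark));   // true: behind the watermark
        System.out.println(isLate(12_000, watermark));  // false: still expected
    }
}
```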