Re: Distribute DataSet to subset of nodes

2015-09-17 Thread Fabian Hueske
Hi Stefan, I think I have a solution for your problem :-) 1) Distribute both parts of the small data to each machine (you have done that) 2) Your mapper should have a parallelism of 10, the tasks with ID 0 to 4 (get ID via RichFunction.getRuntimeContext().getIndexOfThisSubtask()) read the first h

Re: Kinesis Connector

2015-09-17 Thread Stephan Ewen
Hi Giancarlo! I am not aware of any existing Kinesis connector. Would be definitely something to put onto the roadmap for the near future. This is a stream source we should support similarly to Kafka. I am not super familiar with Kinesis, but it looks a bit like offering a similar abstraction as

Re: Kinesis Connector

2015-09-17 Thread Márton Balassi
Hi Giancarlo, I have no knowledge of someone working on such a project. However it would be a valuable contribution, if you were to start the effort please keep us notified, I would also suggest to file a JIRA ticket for it. Best, Marton On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano wrote

Kinesis Connector

2015-09-17 Thread Giancarlo Pagano
Hi, Is there any project already working on a Kinesis connector for Flink or any plan to add a Kinesis connector to the main Flink distribution in the future? Thanks, Giancarlo

Joining Windowed Data Streams

2015-09-17 Thread Philipp Goetze
Hey community, is there a possibility to join two windowed data streams, instead of joining two data streams on a window? For example if one wants to implement Q1 of SRBench you load the data, create one window definition and then one would combine filters an