Re: How to create a stream of data batches

2015-09-07 Thread Juan Rodríguez Hortalá
Hi, I'm new to Flink, but I'd suggest using window operators with a Count policy for that: https://ci.apache.org/projects/flink/flink-docs-release-0.9/apis/streaming_guide.html#window-operators Hope that helps. Greetings, Juan 2015-09-04 14:14 GMT+02:00 Stephan Ewen : > Interest

Re: How to create a stream of data batches

2015-09-04 Thread Stephan Ewen
Interesting question; you are the second person to ask it. Batching in user code is one way to do it, as Matthias said. Our roadmap includes a way to transform a stream into a set of batches, but it will be a while before this is available. See https://cwiki.apache.org/confluence/display/FLINK/Streams+and+Operations+on+Str

Re: How to create a stream of data batches

2015-09-04 Thread Matthias J. Sax
Hi Andres, you could do this by using your own data type, for example:
> public class MyBatch {
>     private ArrayList data = new ArrayList();
> }
In the DataSource, you need to specify your own InputFormat that reads multiple tuples into a batch and emits the whole batch at once. However, be aware,
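
For readers following the thread, here is a minimal sketch of the "batching in user code" idea in plain Java, without any Flink dependency. It is not the InputFormat Matthias refers to. The generic parameter on MyBatch, the accessor methods, and the readBatches helper are assumptions made for illustration. The helper only shows the grouping logic (collect records until the batch is full, then emit the whole batch at once) that a custom InputFormat would need to apply while reading.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Batch container modeled on the MyBatch class from the message above.
// The type parameter and accessors are illustrative additions.
public class MyBatch<T> {
    private final ArrayList<T> data = new ArrayList<>();

    public void add(T record) { data.add(record); }
    public int size() { return data.size(); }
    public List<T> getData() { return data; }

    // Hypothetical helper: drains the source and groups its records into
    // batches of at most batchSize elements. The last batch may be smaller.
    public static <T> List<MyBatch<T>> readBatches(Iterator<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<MyBatch<T>> batches = new ArrayList<>();
        MyBatch<T> current = new MyBatch<>();
        while (source.hasNext()) {
            current.add(source.next());
            if (current.size() == batchSize) {
                batches.add(current);        // emit the full batch as one unit
                current = new MyBatch<>();   // start collecting the next one
            }
        }
        if (current.size() > 0) {
            batches.add(current);            // emit the trailing partial batch
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        for (MyBatch<Integer> batch : readBatches(records.iterator(), 3)) {
            System.out.println(batch.getData());
        }
        // prints [1, 2, 3], then [4, 5, 6], then [7]
    }
}
```

With an unbounded source the batches would be emitted one by one rather than collected into a list. The list is used here only to keep the sketch small and testable.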