Re: Streaming time window

2015-12-10 Thread Fabian Hueske
You are right, WindowFunctions collect all data in a window and are evaluated at once. Although FoldFunctions could be directly applied on each element that enters a window, this is not done at the moment. Only ReduceFunctions are eagerly applied. If you port your code to a ReduceFunction, you can

Re: Streaming time window

2015-12-10 Thread Martin Neumann
I will give this a try. Though I'm not sure I can switch over to WindowFunction. I work with potentially huge Windows, the Fold gives me a minimal and constant memory footprint. Switching to WindowFunction will require to keep the Window in Memory before it can be processed (at least to my underst

Re: Streaming time window

2015-12-10 Thread Fabian Hueske
Sure. You don't need a trigger, but a WindowFunction instead of the FoldFunction. Only the WindowFunction has access to the Window object. Something like this: poissHostStreams .timeWindow(Time.of(WINDOW_SIZE, TimeUnit.MILLISECONDS)) .apply(new WindowFunction() { @overr

Re: Streaming time window

2015-12-10 Thread Martin Neumann
Hi Fabian, thanks for your answer. Can I do the same in java using normal time windows (without additional trigger)? My current codes looks like this: poissHostStreams .timeWindow(Time.of(WINDOW_SIZE, TimeUnit.MILLISECONDS)) .fold(new Tuple2<>("", new HashMap<>()), new MultiValue

Re: Streaming time window

2015-12-10 Thread Fabian Hueske
Hi Martin, you can get the start and end time of a window from the TimeWindow object. The following Scala code snippet shows how to access the window end time (start time is equivalent): .timeWindow(Time.minutes(5)) .trigger(new EarlyCountTrigger(earlyCountThreshold)) .apply { ( key: Int, win

Streaming time window

2015-12-10 Thread Martin Neumann
Hej, Is it possible to extract the start and end window time stamps from within a window operator? I have an event time based window that does a simple fold function. I want to put the output into elasticsearch and want to preserve the start and end timestamp of the data so I can directly compare