Re: Caching collected objects in .apply()

2017-01-09 Thread Aljoscha Krettek
Hi, I think your approach with two window() operations is fine. There is no way to retrieve the result from a previous window because it is not strictly defined what the previous window is. Also, keeping data inside your user functions (in fields) is problematic because these function instances are

Re: Caching collected objects in .apply()

2017-01-05 Thread Fabian Hueske
Hi Matt, I think your approach should be fine. Although the second keyBy is logically a shuffle, the data will not be sent over the wire to a different machine if the parallelism of the first and second window operators is identical. It only costs one serialization / deserialization step. I would be

Re: Caching collected objects in .apply()

2017-01-05 Thread Matt
I'm still looking for an answer to this question. Hope you can give me some insight! On Thu, Dec 22, 2016 at 6:17 PM, Matt wrote: > Just to be clear, the stream is of String elements. The first part of the > pipeline (up to the first .apply) receives those strings, and returns > objects of anoth

Re: Caching collected objects in .apply()

2016-12-22 Thread Matt
Just to be clear, the stream is of String elements. The first part of the pipeline (up to the first .apply) receives those strings, and returns objects of another class ("A" let's say). On Thu, Dec 22, 2016 at 6:04 PM, Matt wrote: > Hello, > > I have a window processing 10 objects at a time, and

Caching collected objects in .apply()

2016-12-22 Thread Matt
Hello, I have a window processing 10 objects at a time, and creating 1 as a result. The problem is in order to create that object I need the object from the previous window. I'm doing this: stream .keyBy(...some key...) .countWindow(10, 1) .apply(...creates an element A...) .keyBy(...sam
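
The two-window approach discussed in this thread can be sketched in plain Java, without the Flink API. This is only an illustration under stated assumptions: the original message is cut off before the second window appears, so a second sliding count window of size 2 and slide 1 is assumed here. The class name TwoStageWindows, the helper slidingCountWindow, and the use of a sum as the "A" object are all hypothetical, and keying is left out for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Function;

public class TwoStageWindows {

    // Simplified stand-in for a sliding count window with slide 1:
    // after each element arrives, apply `fn` to the most recent `size`
    // elements (fewer while the buffer is still filling up).
    static <T, R> List<R> slidingCountWindow(List<T> input, int size,
                                             Function<List<T>, R> fn) {
        Deque<T> buffer = new ArrayDeque<>();
        List<R> out = new ArrayList<>();
        for (T element : input) {
            buffer.addLast(element);
            if (buffer.size() > size) {
                buffer.removeFirst(); // evict the oldest element
            }
            out.add(fn.apply(new ArrayList<>(buffer)));
        }
        return out;
    }

    public static void main(String[] args) {
        // Input stream of String elements, as described in the thread.
        List<String> stream = new ArrayList<>();
        for (int i = 1; i <= 12; i++) {
            stream.add(Integer.toString(i));
        }

        // Stage 1: up to 10 strings in, one "A" out (assumed here: their sum).
        List<Integer> aValues = slidingCountWindow(stream, 10,
                w -> w.stream().mapToInt(Integer::parseInt).sum());

        // Stage 2 (assumed size 2): the current A together with the previous A.
        List<String> paired = slidingCountWindow(aValues, 2,
                w -> w.size() < 2
                        ? "first=" + w.get(0)
                        : "prev=" + w.get(0) + ",curr=" + w.get(1));

        paired.forEach(System.out::println);
    }
}
```

The point of the second stage is that the "previous" result arrives as ordinary window contents, so nothing has to be cached in fields of the user function.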