Hi,

Thank you for the responses. I am not sure if I will be able to use a Fold/Reduce function, but I will keep that in mind.
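For reference, this is the pattern I understood from the Fold/Reduce suggestion. It is only a minimal sketch with placeholder key/value types, source, and window size (not my actual job); the point is that the window keeps just the running aggregate per key:

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EagerWindowAggregation {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder input: (key, value) pairs. A real job would read from an
        // unbounded source (Kafka etc.); with this finite demo source the
        // processing-time window may not fire before the job finishes.
        DataStream<Tuple2<String, Long>> input = env.fromElements(
                new Tuple2<>("a", 1L), new Tuple2<>("b", 2L), new Tuple2<>("a", 3L));

        input
            // Windows are keyed: every key gets its own window instances.
            .keyBy(0)
            // Placeholder window size.
            .timeWindow(Time.minutes(10))
            // With a ReduceFunction the window stores only the running aggregate,
            // not every element, so state does not grow with the window size.
            .reduce(new ReduceFunction<Tuple2<String, Long>>() {
                @Override
                public Tuple2<String, Long> reduce(Tuple2<String, Long> a,
                                                   Tuple2<String, Long> b) {
                    return new Tuple2<>(a.f0, a.f1 + b.f1);
                }
            })
            .print();

        env.execute("eager window aggregation sketch");
    }
}

With this, the state kept per key per window is a single Tuple2, however many elements arrive.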
I have one more question: what is the implication of having a key that
splits the data into windows of very small size (i.e., a large number of
small windows)?

Thanks and Regards,
Vishnu Viswanath

On Tue, Jun 28, 2016 at 5:04 AM, Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi,
> one thing to add: if you use a ReduceFunction or a FoldFunction for your
> window, the state will not grow with bigger window sizes or larger
> numbers of elements, because the result is eagerly computed. In that
> case, state size depends only on the number of individual keys.
>
> Cheers,
> Aljoscha
>
> On Tue, 28 Jun 2016 at 10:36 Kostas Kloudas <k.klou...@data-artisans.com> wrote:
>
>> Hi Vishnu,
>>
>> RocksDB allows for storing the window contents on disk when the state of
>> a window becomes too big.
>> BUT when you have to trigger and apply the computation of your window
>> function on that big window, then all of its state is loaded into memory.
>>
>> So although RocksDB lets you not worry about storage space during the
>> window formation phase, when it is time to fire your computation you
>> have to consider how much RAM you have and whether the window fits in it.
>>
>> Regards,
>> Kostas
>>
>>
>> On Jun 28, 2016, at 1:25 AM, Vishnu Viswanath <vishnu.viswanat...@gmail.com> wrote:
>>
>> Hi Kostas,
>>
>> Thank you.
>> Yes, 2) was exactly what I wanted to know.
>>
>> - So if I am using RocksDB as the state backend, does that mean that I
>> don't have to worry much about the memory available per node, since
>> RocksDB will use RAM and disk to store the window state?
>>
>> Regards,
>> Vishnu
>>
>>
>> On Mon, Jun 27, 2016 at 6:19 PM, Kostas Kloudas <k.klou...@data-artisans.com> wrote:
>>
>>> Hi Vishnu,
>>>
>>> I hope the following will help answer your question:
>>>
>>> 1) Elements are first split by key (apart from global windows) and then
>>> put into windows. In other words, windows are keyed.
>>> 2) A window belonging to a certain key is handled by a single node. In
>>> other words, no matter how big the window is, its state (the elements
>>> it contains) will never be split between two or more nodes.
>>> 3) Where the state is stored depends on your state backend. Currently
>>> Flink supports an in-memory one, a filesystem one, and a RocksDB one
>>> which is in the middle (first in memory, then spilling to disk when
>>> needed). Of course you can implement your own.
>>>
>>> From the above, you can see that if you use the memory-backed state
>>> backend, your window size is limited by the memory available at each of
>>> your nodes. If you use the fs state backend, your state is stored on
>>> disk. Finally, RocksDB will initially use RAM and then spill to disk
>>> when no more memory is available.
>>>
>>> Here I have to add that the window documentation is currently being
>>> re-written to explain new features introduced in Flink 1.1, which
>>> include more flexible handling of late events and more explicit state
>>> garbage collection.
>>>
>>> So please stay tuned!
>>>
>>> I hope this helps answer your question,
>>> Kostas
>>>
>>> > On Jun 27, 2016, at 8:22 PM, Vishnu Viswanath <vishnu.viswanat...@gmail.com> wrote:
>>> >
>>> > Hi All,
>>> >
>>> > - Is there any restriction on the size of a window in Flink with
>>> > respect to the memory of the nodes?
>>> > - What happens if a window grows larger than the memory of a node?
>>> > Will it be split across multiple nodes?
>>> >
>>> > If I am going to have a huge window, should I have fewer nodes with
>>> > more memory?
>>> > Is there any documentation on how memory is managed/handled in the
>>> > case of windows and also in the case of joins?
>>> >
>>> > Regards,
>>> > Vishnu
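To close the loop on the state backend discussion above, here is a minimal sketch of switching a job to the RocksDB backend. It assumes the flink-statebackend-rocksdb dependency is on the classpath; the checkpoint URI, checkpoint interval, and the trivial stand-in pipeline are placeholders, not from this thread:

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDBBackendSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder checkpoint location; point this at a real HDFS or file:// path.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));

        // Illustrative checkpoint interval (60 seconds).
        env.enableCheckpointing(60000);

        // Trivial stand-in pipeline so the sketch runs; a real job would build its
        // keyed/windowed topology here. Window contents are then kept in RocksDB
        // (memory plus local disk) instead of purely on the JVM heap, but note the
        // caveat above: when a window fires, its state for a key is still loaded
        // into memory on a single node.
        env.fromElements(1L, 2L, 3L).print();

        env.execute("rocksdb state backend sketch");
    }
}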