Thank you! On Wednesday, 29 June 2016, Aljoscha Krettek <aljos...@apache.org> wrote:
> Hi, > the result of splitting by key is that processing can easily be > distributed among the workers because the windows for individual keys can > be processed independently. This should improve cluster utilization. > > Cheers, > Aljoscha > > On Tue, 28 Jun 2016 at 17:26 Vishnu Viswanath < > vishnu.viswanat...@gmail.com > <javascript:_e(%7B%7D,'cvml','vishnu.viswanat...@gmail.com');>> wrote: > >> Hi, >> >> Thank you for the responses. >> I am not sure if I will be able to use Fold/Reduce function, but I will >> keep that in mind. >> >> I have one more question, so what is the implication of having a key that >> splits the data into window of very small size(=> large number of small >> windows) ? >> >> Thanks and Regards, >> Vishnu Viswanath, >> >> On Tue, Jun 28, 2016 at 5:04 AM, Aljoscha Krettek <aljos...@apache.org >> <javascript:_e(%7B%7D,'cvml','aljos...@apache.org');>> wrote: >> >>> Hi, >>> one thing to add: if you use a ReduceFunction or a FoldFunction for your >>> window the state will not grow with bigger window sizes or larger numbers >>> of elements because the result is eagerly computed. In that case, state >>> size is only dependent on the number of individual keys. >>> >>> Cheers, >>> Aljoscha >>> >>> On Tue, 28 Jun 2016 at 10:36 Kostas Kloudas <k.klou...@data-artisans.com >>> <javascript:_e(%7B%7D,'cvml','k.klou...@data-artisans.com');>> wrote: >>> >>>> Hi Vishnu, >>>> >>>> RocksDB allows for storing the window contents on disk when the state >>>> of a window becomes too big. >>>> BUT when you have to trigger and apply the computation of your window >>>> function on that big window, >>>> then all of its state is loaded in memory. >>>> >>>> So although during the window formation phase, RocksDB allows you to >>>> not worry about storage space, >>>> when it is time to fire your computation, then you have to consider how >>>> much RAM you have and if the >>>> window fits in it. >>>> >>>> Regards, >>>> Kostas >>>> >>>> >>>> On Jun 28, 2016, at 1:25 AM, Vishnu Viswanath < >>>> vishnu.viswanat...@gmail.com >>>> <javascript:_e(%7B%7D,'cvml','vishnu.viswanat...@gmail.com');>> wrote: >>>> >>>> Hi Kostas, >>>> >>>> Thank you. >>>> Yes 2) was exactly what I wanted to know. >>>> >>>> - So if I am using RocksDB as state backend, does that mean that I >>>> don't have to worry much about the memory available per node since RocksDB >>>> will use RAM and Disk to store the window state? >>>> >>>> Regards, >>>> Vishnu >>>> >>>> >>>> On Mon, Jun 27, 2016 at 6:19 PM, Kostas Kloudas < >>>> k.klou...@data-artisans.com >>>> <javascript:_e(%7B%7D,'cvml','k.klou...@data-artisans.com');>> wrote: >>>> >>>>> Hi Vishnu, >>>>> >>>>> I hope the following will help answer your question: >>>>> >>>>> 1) Elements are first split by key (apart from global windows) and >>>>> then are put into windows. In other words, windows are keyed. >>>>> 2) A window belonging to a certain key is handled by a single node. In >>>>> other words, no matter how big the window is, its >>>>> state (the elements it contains) will never be split between >>>>> two or more nodes. >>>>> 3) Where the state is stored, depends on your state backend. Currently >>>>> Flink supports an in-memory one, a filesystem one, and >>>>> a rocksDB one which is in the middle (first in-memory and then >>>>> disk when needed). Of course you can implement your own. >>>>> >>>>> From the above, you can see that if you use the memory-backed state >>>>> backend, then your window size is limited by the memory >>>>> available at each of your nodes. If you use the fs state backend, then >>>>> your state is stored on disk. Finally, rocksDB will initially >>>>> use RAM and then spill on disk when no more memory is available. >>>>> >>>>> Here I have to add that the window documentation is currently being >>>>> re-written to explain new features introduced in Flink 1.1, >>>>> which include more flexible handling of late events and more explicit >>>>> state garbage collection. >>>>> >>>>> So please stay tuned! >>>>> >>>>> I hope this helps at answering your question, >>>>> Kostas >>>>> >>>>> > On Jun 27, 2016, at 8:22 PM, Vishnu Viswanath < >>>>> vishnu.viswanat...@gmail.com >>>>> <javascript:_e(%7B%7D,'cvml','vishnu.viswanat...@gmail.com');>> wrote: >>>>> > >>>>> > Hi All, >>>>> > >>>>> > - Is there any restriction on the size of a window in Flink with >>>>> respect to the memory of the nodes? >>>>> > - What happens if a window size grows more than size of a node, will >>>>> it be split into multiple nodes? >>>>> > >>>>> > if I am going to have a huge window, should I have fewer nodes with >>>>> more memory. >>>>> > Is there any documentation on how memory is managed/handled in the >>>>> case of windows and also in the case of joins. >>>>> > >>>>> > Regards, >>>>> > Vishnu >>>>> >>>>> >>>> >>>> >> >> -- Thanks and Regards, Vishnu Viswanath, *www.vishnuviswanath.com <http://www.vishnuviswanath.com>*