Hi,

Thank you for the responses.
I am not sure if I will be able to use a Fold/Reduce function, but I will
keep that in mind.

I have one more question: what are the implications of having a key that
splits the data into windows of very small size (i.e., a large number of
small windows)?

Thanks and Regards,
Vishnu Viswanath,

On Tue, Jun 28, 2016 at 5:04 AM, Aljoscha Krettek <aljos...@apache.org>
wrote:

> Hi,
> one thing to add: if you use a ReduceFunction or a FoldFunction for your
> window, the state will not grow with bigger window sizes or larger numbers
> of elements, because the result is eagerly computed. In that case, the
> state size depends only on the number of individual keys.
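>
> Just to make that concrete, an incrementally aggregated window could look
> roughly like the sketch below (the input stream, key field, window size,
> and reduce logic are placeholders, and the usual DataStream imports are
> assumed):
>
>   DataStream<Tuple2<String, Long>> counts = input
>       .keyBy(0)                                 // key the stream
>       .timeWindow(Time.minutes(10))             // 10-minute windows per key
>       .reduce(new ReduceFunction<Tuple2<String, Long>>() {
>           @Override
>           public Tuple2<String, Long> reduce(Tuple2<String, Long> a,
>                                              Tuple2<String, Long> b) {
>               // only this running aggregate is kept as window state,
>               // not every individual element
>               return new Tuple2<>(a.f0, a.f1 + b.f1);
>           }
>       });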
>
> Cheers,
> Aljoscha
>
> On Tue, 28 Jun 2016 at 10:36 Kostas Kloudas <k.klou...@data-artisans.com>
> wrote:
>
>> Hi Vishnu,
>>
>> RocksDB allows the window contents to be stored on disk when the state of
>> a window becomes too big.
>> BUT when the window is triggered and your window function is applied to
>> that big window, all of its state is loaded into memory.
>>
>> So although RocksDB lets you not worry about storage space during the
>> window formation phase, when it is time to fire your computation you have
>> to consider how much RAM you have and whether the window fits in it.
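>>
>> To make the contrast concrete: a window evaluated with a plain
>> WindowFunction buffers every element, and all of them are handed to the
>> function when the window fires. A rough sketch (stream, types, and window
>> size are placeholders):
>>
>>   input
>>       .keyBy(0)
>>       .timeWindow(Time.hours(1))
>>       .apply(new WindowFunction<Tuple2<String, Long>, Long, Tuple, TimeWindow>() {
>>           @Override
>>           public void apply(Tuple key, TimeWindow window,
>>                             Iterable<Tuple2<String, Long>> elements,
>>                             Collector<Long> out) {
>>               // 'elements' is the full window content, which is what has to
>>               // fit in memory when the window fires
>>               long sum = 0;
>>               for (Tuple2<String, Long> e : elements) {
>>                   sum += e.f1;
>>               }
>>               out.collect(sum);
>>           }
>>       });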
>>
>> Regards,
>> Kostas
>>
>>
>> On Jun 28, 2016, at 1:25 AM, Vishnu Viswanath <
>> vishnu.viswanat...@gmail.com> wrote:
>>
>> Hi Kostas,
>>
>> Thank you.
>> Yes 2) was exactly what I wanted to know.
>>
>> - So if I am using RocksDB as the state backend, does that mean that I
>> don't have to worry much about the memory available per node, since
>> RocksDB will use RAM and disk to store the window state?
>>
>> Regards,
>> Vishnu
>>
>>
>> On Mon, Jun 27, 2016 at 6:19 PM, Kostas Kloudas <
>> k.klou...@data-artisans.com> wrote:
>>
>>> Hi Vishnu,
>>>
>>> I hope the following will help answer your question:
>>>
>>> 1) Elements are first split by key (apart from global windows) and then
>>> are put into windows. In other words, windows are keyed.
>>> 2) A window belonging to a certain key is handled by a single node. In
>>> other words, no matter how big the window is, its state (the elements it
>>> contains) will never be split between two or more nodes.
>>> 3) Where the state is stored depends on your state backend. Currently
>>> Flink supports an in-memory one, a filesystem one, and a RocksDB one
>>> which is in the middle (first in memory and then on disk when needed).
>>> Of course, you can implement your own.
>>>
>>> From the above, you can see that if you use the memory-backed state
>>> backend, then your window size is limited by the memory available at each
>>> of your nodes. If you use the fs state backend, then your state is stored
>>> on disk. Finally, RocksDB will initially use RAM and then spill to disk
>>> when no more memory is available.
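>>>
>>> If it helps, the backend is chosen on the execution environment; roughly
>>> along these lines (the checkpoint path is just a placeholder, and the
>>> RocksDB backend comes from the flink-statebackend-rocksdb dependency):
>>>
>>>   StreamExecutionEnvironment env =
>>>       StreamExecutionEnvironment.getExecutionEnvironment();
>>>
>>>   // in-memory backend: state is limited by the heap of each node
>>>   // env.setStateBackend(new MemoryStateBackend());
>>>
>>>   // filesystem backend: state snapshots go to the given path
>>>   // env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));
>>>
>>>   // RocksDB backend: RAM first, spills to local disk when needed
>>>   env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));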
>>>
>>> Here I have to add that the window documentation is currently being
>>> re-written to explain new features introduced in Flink 1.1,
>>> which include more flexible handling of late events and more explicit
>>> state garbage collection.
>>>
>>> So please stay tuned!
>>>
>>> I hope this helps answer your question,
>>> Kostas
>>>
>>> > On Jun 27, 2016, at 8:22 PM, Vishnu Viswanath <
>>> vishnu.viswanat...@gmail.com> wrote:
>>> >
>>> > Hi All,
>>> >
>>> > - Is there any restriction on the size of a window in Flink with
>>> > respect to the memory of the nodes?
>>> > - What happens if a window grows larger than the memory of a node?
>>> > Will it be split across multiple nodes?
>>> >
>>> > If I am going to have a huge window, should I have fewer nodes with
>>> > more memory?
>>> > Is there any documentation on how memory is managed/handled in the case
>>> > of windows and also in the case of joins?
>>> >
>>> > Regards,
>>> > Vishnu
>>>
>>>
>>
>>
