25, 2017 at 6:23 AM, Bowen Li
>>>> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I have a question about how Flink generates windows.
>>>>>
>>>>> We are using a 1-day sliding window with a 1-hour slide to count some
>>>>> features of items based on event time. We have about 20 million items. We
>>>>> observed that Flink only emits results at a fixed time each hour (e.g. 1am,
>>>>> 2am, 3am, or 1:15am, 2:15am, 3:15am with a 15-minute offset). That means
>>>>> 20 million windows/records are generated at the same time every hour, which
>>>>> burns down our sink. But nothing is generated in the rest of that hour. The
>>>>> pattern is like this:
>>>>>
>>>>> # generated windows
>>>>> |
>>>>> |   /\     /\
>>>>> |  /  \   /  \
>>>>> |_/____\_/____\_
>>>>>        time
>>>>>
>>>>> Is there any way to even out the number of generated windows/records in an
>>>>> hour? Can we have an evenly distributed load like this?
>>>>>
>>>>> # generated windows
>>>>> |
>>>>> |
>>>>> |
>>>>> |_______________
>>>>>        time
>>>>>
>>>>> Thanks,
>>>>> Bowen
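
For reference, here is a rough sketch of why the results bunch up at one instant per hour. It assumes sliding windows are aligned to multiples of the slide plus the offset, which is how I understand Flink's sliding window assigner to compute window starts. The function names and the two sample timestamps are made up for illustration and are not Flink API.

```python
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS
MINUTE_MS = 60 * 1000


def window_start_with_offset(timestamp_ms, offset_ms, slide_ms):
    """Latest aligned window start at or before timestamp_ms (hypothetical helper)."""
    return timestamp_ms - (timestamp_ms - offset_ms + slide_ms) % slide_ms


def assign_sliding_windows(timestamp_ms, size_ms=DAY_MS, slide_ms=HOUR_MS, offset_ms=0):
    """Return (start, end) for every sliding window containing the timestamp."""
    windows = []
    start = window_start_with_offset(timestamp_ms, offset_ms, slide_ms)
    # Walk backwards one slide at a time while the window still covers the event.
    while start > timestamp_ms - size_ms:
        windows.append((start, start + size_ms))
        start -= slide_ms
    return windows


# Two events from different items, at :07 and :52 of the same hour.
event_a = 10 * DAY_MS + 3 * HOUR_MS + 7 * MINUTE_MS
event_b = 10 * DAY_MS + 3 * HOUR_MS + 52 * MINUTE_MS

windows_a = assign_sliding_windows(event_a)
windows_b = assign_sliding_windows(event_b)

# Minutes past the hour at which the windows end (and can therefore fire).
end_minutes_a = {(end % HOUR_MS) // MINUTE_MS for _, end in windows_a}
end_minutes_b_offset = {
    (end % HOUR_MS) // MINUTE_MS
    for _, end in assign_sliding_windows(event_b, offset_ms=15 * MINUTE_MS)
}

print(len(windows_a), end_minutes_a, end_minutes_b_offset)
```

Under that assumption, each event falls into 24 windows, and both sample events share exactly the same 24 windows. Every one of those windows ends on the hour, so with event-time triggering all items fire together once the watermark passes that boundary. A 15-minute offset only moves the boundary to :15; it does not spread the load.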