Great, thanks for the update! Please let us know whether it worked or
not.

Piotrek

On Sun, 27 Sep 2020 at 11:20, hao kong <h...@lemonbox.me> wrote:

> Thanks for the tip!
> I am currently trying to implement a ZooKeeper-based coordinator and use
> it to record the current watermark and control the streams, following your
> first suggestion.
>
> On Wed, 16 Sep 2020 at 11:56 PM, Piotr Nowojski <pnowoj...@apache.org> wrote:
>
>> Hey,
>>
>> If you are worried about the increased amount of data buffered by the
>> WindowOperator when watermarks/event time do not progress uniformly across
>> multiple sources, then there is little you can do currently. FLIP-27 [1]
>> will allow us to address this problem in a more generic way. What you can
>> currently do is one of two things:
>>
>> 1. Implement a custom throttling function/operator sitting after the
>> sources that throttles them. If you chain it with the source function,
>> it's a relatively OK solution. Note that while you are blocking
>> execution, you will also be blocking, for example, checkpoints from
>> happening. So it's better to sleep 10 ms for every record than to sleep
>> 10 seconds once every 1000 records.
>> 2. Throttle the sources themselves (you would need to modify the existing
>> sources or write your own custom ones).
>>
>> In both cases, however, you need to track the event time manually and
>> decide yourself which source should be throttled and by how much.
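
To make the manual bookkeeping in option 1 concrete, here is a minimal
sketch in plain Java, not a Flink API. The class EventTimeThrottle, its
methods, and the drift/pause parameters are all hypothetical names made
up for illustration. It assumes every source reports into one shared
object in the same JVM; with sources spread over several TaskManagers,
the per-source times would have to live in an external coordinator
instead.

```java
import java.util.Arrays;

/**
 * Hypothetical helper (not part of Flink): tracks the latest event time seen
 * per source and tells a source how long to pause when it runs too far ahead
 * of the slowest source.
 */
final class EventTimeThrottle {
    private final long[] lastEventTime; // latest event time per source, in ms
    private final long maxDriftMillis;  // allowed lead over the slowest source
    private final long pauseMillis;     // length of a single short pause

    EventTimeThrottle(int numSources, long maxDriftMillis, long pauseMillis) {
        this.lastEventTime = new long[numSources];
        Arrays.fill(this.lastEventTime, Long.MIN_VALUE); // nothing seen yet
        this.maxDriftMillis = maxDriftMillis;
        this.pauseMillis = pauseMillis;
    }

    /** Records the event time of a record emitted by the given source. */
    synchronized void report(int sourceId, long eventTimeMillis) {
        lastEventTime[sourceId] = Math.max(lastEventTime[sourceId], eventTimeMillis);
    }

    /**
     * Returns how long the given source should sleep before its next record:
     * one short pause if it leads the slowest source by more than the allowed
     * drift, otherwise 0. Assumption: no throttling until every source has
     * reported at least once.
     */
    synchronized long pauseFor(int sourceId) {
        long slowest = Long.MAX_VALUE;
        for (long t : lastEventTime) {
            slowest = Math.min(slowest, t);
        }
        if (slowest == Long.MIN_VALUE) {
            return 0L; // some source has not produced anything yet
        }
        return (lastEventTime[sourceId] - slowest > maxDriftMillis) ? pauseMillis : 0L;
    }
}
```

A function chained to source i would call report(i, eventTime) for each
record and then sleep for pauseFor(i) milliseconds, re-checking after
each short sleep rather than sleeping once for a long time, so that
checkpoints are not held up for long. Note that in this sketch an idle
source would stall the others indefinitely; handling that is left out.
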
>>
>> Best regards, Piotrek
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface#FLIP27:RefactorSourceInterface-EventTimeAlignment
>>
>> On Wed, 16 Sep 2020 at 04:17, hao kong <h...@lemonbox.me> wrote:
>>
>>> Hello guys,
>>>
>>> I have a job with multiple Kafka sources. They all contain a certain
>>> amount of historical data. If I use event-time windows, the sources
>>> with less data run ahead of the sources with more data in terms of
>>> watermarks.
>>>
>>>
>>> I can think of one solution: implement a scheduler in the source phase.
>>> But that is quite complicated to implement. Are there other, better
>>> solutions?
>>>
>>>
>>> Any suggestions?
>>> Thanks!
>>>
>>>
>>>
