Hey!
There is a wish to decrease the amount of in-flight data, which can
improve aligned checkpoint time (less in-flight data to process before
a checkpoint can complete) and improve the behaviour and performance of
unaligned checkpoints (less in-flight data that needs to be persisted
in every unaligned checkpoint). The main idea is not to keep as much
in-flight data as the available memory allows, but to keep only the
amount of data that can predictably be processed within a configured
amount of time (e.g. we keep data that can be processed in 1 sec). This
can be achieved by calculating the effective throughput and then
adjusting the buffer size based on this throughput. More details about
the proposal can be found here [1].
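
To make the idea a bit more concrete, here is a rough sketch of the
calculation described above. It is only an illustration under
assumptions, not the proposed implementation: the class and method
names, the min/max bounds, and the even split across the buffers in use
are all hypothetical and are not taken from the FLIP.

```java
// Hypothetical sketch: derive a buffer size from measured throughput
// and a configured target time for draining in-flight data.
final class BufferSizeSketch {

    // Effective throughput in bytes per second over a measurement window.
    static long throughputBytesPerSec(long bytesProcessed, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0; // no time elapsed, nothing can be estimated
        }
        return bytesProcessed * 1000 / elapsedMillis;
    }

    // Buffer size such that all buffers in use together hold roughly the
    // amount of data that can be processed within targetMillis.
    // minSize and maxSize are assumed bounds, used here only to keep the
    // result in a sane range.
    static int newBufferSize(long throughputBytesPerSec, long targetMillis,
                             int buffersInUse, int minSize, int maxSize) {
        long totalInFlight = throughputBytesPerSec * targetMillis / 1000;
        long perBuffer = totalInFlight / Math.max(1, buffersInUse);
        return (int) Math.max(minSize, Math.min(maxSize, perBuffer));
    }

    public static void main(String[] args) {
        // 2 MB processed in 500 ms, target of 1 sec, 200 buffers in use.
        long throughput = throughputBytesPerSec(2_000_000, 500);
        int size = newBufferSize(throughput, 1000, 200, 256, 32 * 1024);
        System.out.println("throughput=" + throughput + " B/s, bufferSize=" + size);
    }
}
```

With these example numbers the throughput is 4,000,000 bytes/sec, so
about 4 MB of in-flight data corresponds to 1 sec of processing, which
gives 20,000 bytes per buffer across 200 buffers. A real implementation
would presumably also need to smooth the measurements over time so that
the buffer size does not jump on every short spike.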
What are your thoughts about it?
[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment
--
Best regards,
Anton Kalashnikov