Hi Rinat,
I think there is a configuration option, {{state.checkpoints.num-retained}}, that
controls the maximum number of completed checkpoints to retain; the default
value is 1. Older completed checkpoints are discarded as new ones complete, so
the risk you mentioned should not arise. See
https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/config.html#checkpointing
for more checkpoint configuration options.
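For illustration, here is a minimal sketch of how that option might look in flink-conf.yaml. I am assuming you set it cluster-wide in that file, and the value shown is simply the default, so adjust it as needed:

```yaml
# flink-conf.yaml (sketch; the value below is the documented default)
# Maximum number of completed checkpoints to retain.
state.checkpoints.num-retained: 1
```

With a small value like this, only the most recent completed checkpoints should remain in the checkpoint directory at any time.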
Best, Sihua
On 06/8/2018 22:55, Rinat <[email protected]> wrote:
Hi mates, I've got a question about the different state backends.
If I've understood correctly, on every checkpoint Flink flushes its current
state into the backend. With FsStateBackend we'll have a separate file for
each checkpoint, so over the job's lifecycle we risk accumulating
a huge number of state files in HDFS, which is not great for the Hadoop
NameNode.
Does Flink have any clean-up strategies for its state in the different
backend implementations? If you could provide any links where I could read
more about this process, that would be awesome ))
Thanks a lot for your help.
Sincerely yours,
Rinat Sharipov
Software Engineer at 1DMP CORE Team
email: [email protected]
mobile: +7 (925) 416-37-26
CleverDATA
make your data clever