Hi, devs,

At present, at the beginning of checkpoint, all OperatorCoordinator will
first perform snapshot in JM, and operator will then perform snapshot
according to the order of receiving checkpoint barrier. In this case, does
the data stored in OperatorCoordinator really correspond to the checkpoint
of Operator?

For example, in the scenario of source -> map -> sink, we add an
OperatorCoordinator to the operator of the map to calculate the max of all
data. If we record the max value at the beginning of the checkpoint,
however, when the operator of the map actually starts to make checkpoint,
the max value may have been updated, so is the max value snapshoted by
OperatorCoordinator really what this checkpoint should correspond to?
Perhaps this example is not appropriate, but it can be really confusing,
especially when troubleshooting by querying the state of checkpoints.

Thanks,
Ming Li

Reply via email to