Hey all,

It is currently impossible to enable state checkpointing for iterative jobs, because an exception is thrown when creating the JobGraph. This behaviour is motivated by the lack of precise guarantees that we can give with the current fault-tolerance implementation for cyclic graphs.

This PR <https://github.com/apache/flink/pull/812> adds an optional flag to force checkpoints even when the job contains iterations. Checkpoints are still taken periodically as before, but records in transit inside the loop will be lost. However, even this weaker guarantee is enough for most applications (machine learning, for instance) and certainly much better than having nothing at all.

I suggest we add this to the 0.9 release, as many applications currently suffer from this limitation (SAMOA, ML pipelines, graph streaming, etc.).

Cheers,
Gyula