Github user kl0u commented on the pull request:

    https://github.com/apache/flink/pull/934#issuecomment-126076384

Hi @mxm,

Thanks a lot for the comments! I integrated most of them. Please have a look and let me know what you think.

For merging the different types of snapshots and handling them uniformly, I do not currently have a solution. If you have one, I am of course open to discussing it, because I agree that this would be nice.

Regarding the comment on getAccumulatorResultsStringified():
1) its output is presented to the user by the web interface, just for monitoring purposes;
2) it is called at the jobManager.

The problem is that the jobManager has only the blobKeys that point to the stored accumulators. The serialized data reside in the blobCache and have to be fetched before they can be inspected. Currently the jobManager just forwards the blobKeys to the client, which fetches the blobs, deserializes them, and does the final merging.

This is done for jobManager scalability reasons: since we are talking about accumulators of arbitrary size, loading them from disk and deserializing them at the jobManager would be time and resource consuming. The same holds if we only wanted to get the type of these large accumulators (which the method needs): we would still have to load and deserialize them at the jobManager.

The currently implemented solution is simply the result of this design decision. If you have any other strategy or solution that is worth implementing, let me know.
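To make the key-forwarding design described above concrete, here is a minimal, self-contained sketch. It is not the code from the pull request and does not use Flink's API: `BlobStore`, `JobManagerStub`, `fetchAndMerge`, and the string blob keys are hypothetical stand-ins, and the accumulator is assumed to be a simple long counter merged by addition. The point it illustrates is only that the jobManager side handles keys, while fetching, deserializing, and merging happen on the client side.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "forward the keys, merge at the client".
// None of these types are Flink classes; they only mirror the roles in the discussion.
final class AccumulatorFetchSketch {

    /** Stand-in for the blob storage: maps a blob key to serialized accumulator bytes. */
    static final class BlobStore {
        private final Map<String, byte[]> blobs = new HashMap<>();
        private int nextId = 0;

        String put(byte[] data) {
            String key = "blob-" + (nextId++);
            blobs.put(key, data);
            return key;
        }

        byte[] get(String key) {
            byte[] data = blobs.get(key);
            if (data == null) {
                throw new IllegalArgumentException("unknown blob key: " + key);
            }
            return data;
        }
    }

    /** Stand-in for the jobManager: it stores and hands out keys, never the bytes. */
    static final class JobManagerStub {
        private final Map<String, List<String>> keysByAccumulator = new HashMap<>();

        void register(String accumulatorName, String blobKey) {
            keysByAccumulator
                .computeIfAbsent(accumulatorName, n -> new ArrayList<>())
                .add(blobKey);
        }

        List<String> blobKeysFor(String accumulatorName) {
            return new ArrayList<>(
                keysByAccumulator.getOrDefault(accumulatorName, new ArrayList<>()));
        }
    }

    // Assumed serialization format for this sketch: one big-endian long per blob.
    static byte[] serialize(long value) {
        return ByteBuffer.allocate(Long.BYTES).putLong(value).array();
    }

    static long deserialize(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getLong();
    }

    /** Client side: fetch each blob by key, deserialize it, and merge by summing. */
    static long fetchAndMerge(JobManagerStub jobManager, BlobStore store, String name) {
        long merged = 0L;
        for (String key : jobManager.blobKeysFor(name)) {
            merged += deserialize(store.get(key));
        }
        return merged;
    }

    public static void main(String[] args) {
        BlobStore store = new BlobStore();
        JobManagerStub jobManager = new JobManagerStub();

        // Two tasks report partial results for the same accumulator.
        jobManager.register("records", store.put(serialize(40L)));
        jobManager.register("records", store.put(serialize(2L)));

        System.out.println(fetchAndMerge(jobManager, store, "records"));
    }
}
```

In this sketch the jobManager stand-in could not answer a "what is the value or type of this accumulator?" question without calling `store.get` and `deserialize` itself, which is exactly the cost the current design avoids.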