[ https://issues.apache.org/jira/browse/FLINK-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941379#comment-14941379 ]
Greg Hogan commented on FLINK-2716:
-----------------------------------

[~StephanEwen], I would like to use {{TypeComparator.hash}} within a {{RichFlatMapFunction}} (similar to {{DataSet.count}}) for this implementation. You had noted an earlier discussion about making serializers available to {{RichFunction}} implementations, and access to type comparators could be implemented likewise.

Ideally the user would see only as many serializers and type comparators as the function has inputs: {{getInputSerializer()}} for single-input functions, {{getFirstInputSerializer()}} and {{getSecondInputSerializer()}} for dual-input functions.

Currently {{RichFlatMapFunction}} inherits from {{AbstractRichFunction}}, which implements access to the {{RuntimeContext}}. We could add a layer and have each single-input function inherit from an {{AbstractSingleInputRichFunction}} (similar to how {{FlatMapOperator}} inherits from {{SingleInputUdfOperator}}) that would provide access to serializers and type comparators (and likewise with {{AbstractTwoInputRichFunction}} for dual-input functions).

> Checksum method for DataSet and Graph
> -------------------------------------
>
>                 Key: FLINK-2716
>                 URL: https://issues.apache.org/jira/browse/FLINK-2716
>             Project: Flink
>          Issue Type: Improvement
>          Components: Gelly, Java API, Scala API
>   Affects Versions: master
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
>            Priority: Minor
>
> {{DataSet.count()}}, {{Graph.numberOfVertices()}}, and
> {{Graph.numberOfEdges()}} provide measures of the number of distributed data
> elements. New {{DataSet.checksum()}} and {{Graph.checksum()}} methods will
> summarize the content of data elements and support algorithm validation,
> integration testing, and benchmarking.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
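
The layering proposed in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Flink code: every {{Sketch*}} type is a hypothetical stand-in so the example compiles without Flink on the classpath, {{setInputTypeInfo}} and {{getInputComparator()}} are assumed names that do not appear in the comment, and summing the widened hashes is just one possible way a checksum might combine per-element hashes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-ins for Flink's TypeSerializer and TypeComparator,
// reduced to one method each so the sketch compiles on its own.
interface SketchSerializer<T> {
    byte[] serialize(T value);
}

interface SketchComparator<T> {
    int hash(T value);
}

// Stand-in for AbstractRichFunction; RuntimeContext access is omitted here.
abstract class SketchRichFunction {
}

// Sketch of the proposed extra layer for single-input functions: it exposes
// exactly one serializer and one comparator, matching the single input.
abstract class SketchSingleInputRichFunction<IN> extends SketchRichFunction {
    private SketchSerializer<IN> inputSerializer;
    private SketchComparator<IN> inputComparator;

    // Assumed hook: the runtime would call this before the first record arrives.
    void setInputTypeInfo(SketchSerializer<IN> serializer, SketchComparator<IN> comparator) {
        this.inputSerializer = serializer;
        this.inputComparator = comparator;
    }

    protected SketchSerializer<IN> getInputSerializer() {
        return inputSerializer;
    }

    protected SketchComparator<IN> getInputComparator() {
        return inputComparator;
    }
}

// A user function that counts elements and sums their hashes.
class ChecksumSketchFunction extends SketchSingleInputRichFunction<String> {
    private long count;
    private long checksum;

    void accept(String value) {
        count++;
        // Widen the 32-bit hash to an unsigned value before summing, so the
        // result does not depend on the order in which elements arrive.
        checksum += getInputComparator().hash(value) & 0xffffffffL;
    }

    long getCount() {
        return count;
    }

    long getChecksum() {
        return checksum;
    }
}

class ChecksumSketchDemo {
    public static void main(String[] args) {
        ChecksumSketchFunction function = new ChecksumSketchFunction();
        function.setInputTypeInfo(s -> s.getBytes(StandardCharsets.UTF_8), String::hashCode);
        for (String value : new String[] {"a", "b", "c"}) {
            function.accept(value);
        }
        System.out.println("count=" + function.getCount() + " checksum=" + function.getChecksum());
    }
}
```

A dual-input counterpart would follow the same pattern with two serializer/comparator pairs, along the lines of the {{getFirstInputSerializer()}} and {{getSecondInputSerializer()}} accessors named in the comment.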