Hello Flinksters,

What is the most idiomatic way in Flink to get the count of records grouped by a key (the key can have multiple fields)?
I have looked at this ticket <https://issues.apache.org/jira/browse/FLINK-1269>, but because it is still open, I can't tell what the final decision was.

Let's say that we have the following records (case class or tuple, whatever):

f1, f2, f3, f4
------------------
1,  1,  2,  "A"
1,  1,  2,  "B"
2,  1,  3,  "A"
3,  1,  4,  "C"

I group this DataSet on a composite key of (f2, f3), and then I need the count per group:

([1,2], 2)
([1,3], 1)
([1,4], 1)

I could follow the accepted wisdom of /mapping/ every record to its key with an extra '1' and then /reducing/ with a /sum/ operation, but I think that is lower-level than what one should be expected to write. Spark has a /countByKey/ operator for this purpose.

Could someone please nudge me in the right direction?

-- Nirmalya
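
P.S. For reference, this is roughly what I mean by the map-with-'1'-then-sum approach. It is only a sketch against the Scala DataSet API: the case class name Rec and the object name are made up for illustration, and the result comes out as flat (f2, f3, count) tuples rather than the nested form shown above.

```scala
import org.apache.flink.api.scala._

// Hypothetical record type matching the f1..f4 columns above.
case class Rec(f1: Int, f2: Int, f3: Int, f4: String)

object CountByCompositeKey {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // The four sample records from the table.
    val records: DataSet[Rec] = env.fromElements(
      Rec(1, 1, 2, "A"),
      Rec(1, 1, 2, "B"),
      Rec(2, 1, 3, "A"),
      Rec(3, 1, 4, "C")
    )

    val counts: DataSet[(Int, Int, Long)] = records
      .map(r => (r.f2, r.f3, 1L)) // keep only the key fields, plus a 1 to be summed
      .groupBy(0, 1)              // composite key: tuple positions 0 and 1 (f2, f3)
      .sum(2)                     // add up the 1s within each group

    // print() triggers execution of the batch job and writes the tuples to stdout.
    counts.print()
  }
}
```

It works, but the extra map step is exactly the boilerplate I was hoping a built-in count-per-key operator would remove.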