Hello Kevin,

I'm not very familiar with the streaming API, but I think you can achieve what you want in two steps. First, map over your elements to turn each string into a one-item list, so that every element is a key-value pair of the form (K: String, V: (List[String], Int)). Then, on the keyed and windowed stream, apply the window reduce function, which produces a DataStream out of a WindowedStream; inside the reduce you concatenate the lists and sum the counts.

That said, this is not a great way to use reduce, since the list grows with each reduction.
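Here is a minimal sketch of the map and reduce steps described above, written in plain Scala so that it runs without Flink on the classpath. The names `Acc`, `toAcc`, and `merge` are made up for illustration, and the `groupBy` on a `List` only stands in for what a keyed window would do per key; it is not the Flink API itself.

```scala
// Plain-Scala sketch of the map + reduce logic; no Flink dependency required.
object WindowReduceSketch {
  // Element shape after the map step: (key, messages so far, count so far).
  // `Acc` is a hypothetical alias introduced for this example.
  type Acc = (String, List[String], Int)

  // Map step: wrap the message in a one-item list so elements can be merged.
  def toAcc(e: (String, String, Int)): Acc = (e._1, List(e._2), e._3)

  // Reduce step: concatenate the lists and sum the counts.
  // Both arguments belong to the same key, so the key of `a` is kept.
  def merge(a: Acc, b: Acc): Acc = (a._1, a._2 ++ b._2, a._3 + b._3)

  def main(args: Array[String]): Unit = {
    // Stand-in for the contents of one window.
    val window = List(("a-b", "data1", 1), ("a-c", "data2", 1), ("a-b", "data3", 1))

    // Simulate "key by field 0, then reduce per key".
    val result = window.map(toAcc).groupBy(_._1).values.map(_.reduce(merge)).toList.sortBy(_._1)

    result.foreach(println)
    // (a-b,List(data1, data3),2)
    // (a-c,List(data2),1)
  }
}
```

In a Flink job, I would expect `toAcc` to go into the `map` and `merge` into the `reduce` on the windowed stream, with the window type and size chosen to fit the use case. Note that `a._2 ++ b._2` copies the left-hand list on every call, which is the growing-list cost mentioned above.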
Regards,
Theodore

On Thu, Aug 4, 2016 at 1:36 AM, Kevin Jacobs <kevin.jac...@cern.ch> wrote:

> Hi,
>
> I have the following use case:
>
> 1. Group by a specific field.
> 2. Get a list of all messages belonging to the group.
> 3. Count the number of records in the group.
>
> With the use of DataSets, it is fairly easy to do this (see
> http://stackoverflow.com/questions/38745446/apache-flink-sum-and-keep-grouped/38747685#38747685):
>
>     fromElements(("a-b", "data1", 1), ("a-c", "data2", 1), ("a-b", "data3", 1))
>       .groupBy(0)
>       .reduceGroup {
>         (it: Iterator[(String, String, Int)],
>          out: Collector[(String, List[String], Int)]) => {
>           val group = it.toList
>           if (group.length > 0)
>             out.collect((group(0)._1, group.map(_._2), group.map(_._3).sum))
>         }
>       }
>
> So, now I am moving to DataStreams (since the input is really a
> DataStream). From my perspective, a Window should provide the same
> functionality as a DataSet. This would simplify the process a lot:
>
> 1. Window the elements.
> 2. Apply the same operations as before.
>
> Is there a way in Flink to do so? Otherwise, I would like to think of a
> solution to this problem.
>
> Regards,
> Kevin