You could first transform each stream to a common format (in the worst case, an ugly Either-like type capturing all possible types), union those streams, and then apply a keyBy followed by a window function.

This is how coGroup is implemented internally.
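
In case it helps, here is a minimal sketch of that idea in plain Java, without any Flink dependency, so it only illustrates the shape of the approach and is not an actual job. The types Click, Purchase, Refund, Tagged and Group are hypothetical names made up for the example, and the in-memory lists stand in for the contents of a single window. In a real pipeline the three steps would roughly correspond to a map on each stream, DataStream#union, and then keyBy plus a window whose ProcessWindowFunction splits the elements back out by tag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MultiCoGroupSketch {

    // Hypothetical input types standing in for three differently typed streams.
    record Click(String userId, String url) {}
    record Purchase(String userId, double amount) {}
    record Refund(String userId, double amount) {}

    // The "ugly Either-like" type: exactly one of the three payload fields is non-null.
    record Tagged(String key, Click click, Purchase purchase, Refund refund) {
        static Tagged of(Click c) { return new Tagged(c.userId(), c, null, null); }
        static Tagged of(Purchase p) { return new Tagged(p.userId(), null, p, null); }
        static Tagged of(Refund r) { return new Tagged(r.userId(), null, null, r); }
    }

    // Result for one key: one list per original input, as a co-group would hand over.
    record Group(List<Click> clicks, List<Purchase> purchases, List<Refund> refunds) {
        Group() { this(new ArrayList<>(), new ArrayList<>(), new ArrayList<>()); }
    }

    public static Map<String, Group> coGroup(
            List<Click> clicks, List<Purchase> purchases, List<Refund> refunds) {
        // Steps 1 and 2: map every input to the common type and union them.
        List<Tagged> union = new ArrayList<>();
        clicks.forEach(c -> union.add(Tagged.of(c)));
        purchases.forEach(p -> union.add(Tagged.of(p)));
        refunds.forEach(r -> union.add(Tagged.of(r)));

        // Step 3: group by key, then split each element back out by its tag.
        Map<String, Group> groups = new TreeMap<>();
        for (Tagged t : union) {
            Group g = groups.computeIfAbsent(t.key(), k -> new Group());
            if (t.click() != null) {
                g.clicks().add(t.click());
            } else if (t.purchase() != null) {
                g.purchases().add(t.purchase());
            } else {
                g.refunds().add(t.refund());
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, Group> groups = coGroup(
                List.of(new Click("u1", "/home"), new Click("u2", "/cart")),
                List.of(new Purchase("u1", 9.99)),
                List.of(new Refund("u2", 5.00)));
        groups.forEach((key, g) -> System.out.println(key + " -> " + g));
    }
}
```

The point of the single union is that all inputs go through one keyed shuffle, instead of one shuffle per pairwise coGroup when chaining them.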

On 14/02/2022 16:08, Will Lauer wrote:
OK, here's what I hope is a stupid question: what's the most efficient way to co-group more than 2 DataStreams together? I'm looking at porting a pipeline from Pig to Flink, and in a couple of places I use Pig's COGROUP functionality to group 3 or 4, and sometimes even more, datasets on the same key simultaneously. Looking at the DataStream API, I see how to group 2 DataStreams, but I don't see anything obvious for processing more than two at once. Obviously I could cogroup two, then cogroup the result with the next one, etc., adding each stream serially to the result, but that seems inefficient. Is there a better way?

Will

Will Lauer
Senior Principal Architect, Audience & Advertising Reporting
Data Platforms & Systems Engineering
M 508 561 6427
Champaign Office, 1908 S. First St, Champaign, IL 61822
