+1 to this proposal. It's well thought out and easy to read. My only concern is the range sketches: I think we can take a different approach (like what Spark does) to avoid a huge amount of state or limitations on sort key cardinality. Otherwise, it all looks good to me.
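To illustrate the sampling idea, here is a minimal sketch. It assumes that the Spark-like approach means keeping a fixed-size reservoir sample of sort keys and deriving range bounds from that sample, rather than tracking statistics per key. The function names and the sample size are hypothetical and come from neither the design doc nor Spark's code.

```python
import bisect
import random


def reservoir_sample(keys, k, rng):
    """Keep a uniform random sample of at most k keys from a stream.

    State is bounded by k, no matter how many distinct keys appear.
    """
    sample = []
    for i, key in enumerate(keys):
        if i < k:
            sample.append(key)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = key
    return sample


def range_bounds(sample, num_partitions):
    """Pick num_partitions - 1 split points from the sorted sample."""
    ordered = sorted(sample)
    if not ordered:
        return []
    return [
        ordered[p * len(ordered) // num_partitions]
        for p in range(1, num_partitions)
    ]


def partition_for(key, bounds):
    """Map a key to a partition index by binary search over the bounds."""
    return bisect.bisect_right(bounds, key)


# Hypothetical usage: 10,000 keys summarized by a 100-element sample.
rng = random.Random(0)
sample = reservoir_sample(range(10_000), 100, rng)
bounds = range_bounds(sample, 4)
```

The trade-off is that bounds derived from a sample are approximate, so partitions are only roughly balanced, but the memory needed does not grow with the number of distinct sort keys.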
Thanks for the awesome proposal, Steven!

Ryan

On Thu, Oct 14, 2021 at 8:56 AM Steven Wu <stevenz...@gmail.com> wrote:

> Hi,
>
> I wrote a design doc for bin packing and range distribution shuffling
> support in Flink Iceberg sink for streaming ingestion. Would appreciate
> your feedback.
>
> https://docs.google.com/document/d/13N8cMqPi-ZPSKbkXGOBMPOzbv2Fua59j8bIjjtxLWqo/
>
> Thanks,
> Steven

--
Ryan Blue
Tabular