Hi, Lu. Thank you for your proposal. I believe it is very helpful for joins to improve performance. After reading the Flip, I have the following questions and look forward to your reply.
1. I would like to confirm what happens when an Iceberg table is not a partitioned table but still supports SupportsPartitioning. Will SupportsPartitioning#outputPartitioning return empty? 2. In KeyGroupedPartitioning, could you explain why partition values need to be ordered? 3. Can you outline the relationship between the parallelism of join and the data sizes and number of partitions of the left and right tables? 4. Nit: Regarding the Flink flip, it is not necessary to specifically list out the changes that Iceberg needs to adapt to. If needed, these can be added as an appendix at the end. 5. In the "Partition Mismatch Problem" section, how do we inform table B that it needs to create an empty partition? 6. In the "Join Keys and Partition Keys Relationship" section, I noticed that when the join keys are a superset of the partition keys, c1 is still used as the partition key. Do you mean we will treat “A.c2 = B.c2” as a normal condition and apply this filter after the join? 7. Considering the "Correlated Partition Number", could we have something similar to `resolveCommonPartitioning` that allows the source to assist in merging two partitionings when their partition numbers are inconsistent or their partition values differ? 8. In the "Compatibility, Deprecation, and Migration Plan" section, I noticed you mentioned that “The feature will be opt-in through configuration, ensuring no impact on existing workloads.” However, I did not find the relevant configuration in the public interface—could you please elaborate on this? 9. Is it possible to eliminate the exchange between the source and join using FlinkExpandConversionRule? This seems more general and could benefit other operations like rank, group agg etc. as well. -- Best! Xuyang At 2025-06-26 06:36:38, "Lu Niu" <qqib...@gmail.com> wrote: >Hi devs, > >I’d like to start a discussion on FLIP-535: Support Storage Partition Join >in Flink: > >https://docs.google.com/document/d/1bX37wpwivO3fS4WrR9tZ0pDF92Mpux6pZE54Tw8czzU/edit?tab=t.0#heading=h.goqvqbz0cyel > > >The goal is to support storage partition join in Flink batch mode for >iceberg tables to avoid unnecessary shuffles. > >It would be great if someone could help to copy the contents to a FLIP >page. I don’t have the permission. > >Looking forward to your feedback. > >Best > >Lu