Hey Iceberg People,

Just pinging this thread here. Any Flink expertise on the above questions is appreciated! :)

Gabor

On Mon, Mar 25, 2024 at 3:43 PM Gabor Kaszab <gaborkas...@apache.org> wrote:

> Hey Iceberg Community,
>
> I've recently had the chance to examine Iceberg's equality delete support
> from a multi-engine perspective (Flink, Hive, Impala, Spark).
> I started exploring how *Flink* can be used for writing, and I observed
> that there is a restriction: users are forced to add the *partition
> columns to the primary key* when creating an upsert-mode table. This
> came in handy for me because it made the eq-delete read implementation
> easier on the Impala side, but it also made me curious about the original
> motivation. So the questions I have in mind are:
> - What was the motivation behind introducing this restriction?
> - Technically, would it be possible not to force partition cols into the
>   PK? Are there well-known pros and cons?
> - In theory, if someone removed this restriction, would the readers (for
>   instance Spark, since that is the engine most tightly coupled with
>   Iceberg) still be able to read eq-deletes that don't contain the
>   partition cols?
> - Is a change to loosen this restriction on the roadmap for anyone in
>   the community?
>
> Thanks,
> Gabor