Hi iceberg-dev,

I have a table partitioned by id and event time (custom partitioning at the moment, not Iceberg hidden partitioning). Individual DELETEs finish reasonably fast, for example:
*sql("DELETE FROM table WHERE id_shard = 111 AND id = 111456")*
*sql("DELETE FROM table WHERE id_shard = 222 AND id = 222456")*

To reduce the number of table commits, I want to create a temp view and run a single DELETE covering both rows. That query becomes extremely slow:

*df = spark.createDataFrame([(111, 111456), (222, 222456)], ["id_shard_t", "id_t"])*
*df.createOrReplaceTempView("rows")*
*sql("DELETE FROM table WHERE (id_shard, id) IN (SELECT id_shard_t, id_t FROM rows)")*

It looks like I lose the benefit of partitioning with the multi-column join. Any hints on reducing table commits while maintaining query efficiency?

Thank you!
-- Huadong
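For context, one variant I am considering (my own sketch, not something I have verified against Iceberg's planner) is folding the rows into a single DELETE whose predicate is an OR of literal per-row conjunctions, so there is still only one commit but each disjunct keeps the literal id_shard value that partition pruning can match. The helper name below is hypothetical, and it assumes numeric keys:

```python
def build_batched_delete(table, rows):
    """Fold a list of (id_shard, id) pairs into one DELETE statement.

    The predicate is an OR of per-row conjunctions with literal values,
    rather than an IN over a subquery, so each disjunct still carries a
    concrete id_shard that a planner could compare against partition values.
    NOTE: build_batched_delete is a hypothetical helper, not an Iceberg API.
    """
    disjuncts = [
        f"(id_shard = {id_shard} AND id = {id_val})" for id_shard, id_val in rows
    ]
    return f"DELETE FROM {table} WHERE " + " OR ".join(disjuncts)


# The two single-row deletes above, folded into one statement:
stmt = build_batched_delete("table", [(111, 111456), (222, 222456)])
# stmt is:
# DELETE FROM table WHERE (id_shard = 111 AND id = 111456)
#                      OR (id_shard = 222 AND id = 222456)
```

The resulting string would be run as *sql(stmt)*; whether this actually prunes as well as the individual DELETEs is exactly the kind of thing I would like to confirm.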