Thank you Anton! Follow up questions inline.
On Tue, May 25, 2021 at 10:31 AM Anton Okolnychyi
wrote:
> The performance degradation you see is because Spark cannot push down
> subquery expressions. That’s why Iceberg ends up scanning the complete
> table. You will still rewrite only files that h
The performance degradation you see is because Spark cannot push down subquery
expressions. That’s why Iceberg ends up scanning the complete table. You will
still rewrite only files that have matches but I assume joining the complete
table will be expensive.
In order to benefit from partition a
Hi iceberg-dev,
I have a table that is partitioned by id (custom partitioning at the
moment, not iceberg hidden partitioning) and event time. Individual DELETE
finishes reasonably fast, for example:
*sql("DELETE FROM table where id_shard=111 and id=111456")*
*sql("DELETE FROM table where id_shard