The owner of the table wanted to keep the column stats for all of the
columns, claiming that other users might be (or are) using the statistics of the
columns. Even if I am not sure that their case was defensible, I think the
reader of the table is often not in the position to optimize the table for
their
+1 for removing to avoid misunderstanding :). It's cleaner/clearer now
with the iceberg-python repo. Thanks Fokko & Ed!
Regards
JB
On Sun, Oct 8, 2023 at 9:07 PM Fokko Driesprong wrote:
>
> Hey everyone,
>
> It has been a week since PyIceberg migrated to its own repository. Should we
> move forwar
For that use case, it sounds like you'd be much better off not storing all
the stats rather than skipping them at read time. I understand the user
wants to keep them, but it may still not be a great choice. I'm just
worried that this is going to be a lot of effort for you that doesn't
really genera
The main thing I’m still interested in is alternative approaches. I think that some of the work that Anton is doing has shown some different bottlenecks in applying delete files that I’m not sure are addressed by this proposal. For example, this proposal suggests doing a 1 to 1 (or 1 rowgroup t