Good to know about the multi-layer proposal. Could you share a link to it,
if there is one? I will also draft a short proposal on the
manifest-level stats topic in a Google Doc so that folks can review and
comment.
Thank you Yufei for your time and input.
On Thu, Sep 26, 2024 at 4:18 PM Yufei
I agree, this approach makes sense and could align well with the
multi-layer manifest file proposal. Each layer's manifest file could
potentially hold aggregated metrics, which would streamline the process.
However, so far, there have only been offline discussions, and no formal
proposal has been d
Thanks Yufei for taking a look.
Yes, I think adding the min/max values to partition-level statistics would
also work. In fact, it has been proposed in [1]. However, my concern was that
calculating partition-level min/max values would be an expensive operation
because of the row-level deletes support (
Hi Xingyuan,
I've been reviewing the partition statistics file, and it seems that adding
partition-level min/max values would be a natural fit within Partition
Statistics File[1], which is one file per snapshot. We could introduce a
few new fields to accommodate these values.
While this addition c
Hi team,
Just bumping this up. What do you think of this? Does the alternative
solution make sense or is it too much of a spec change?
The goal is to improve the efficiency and effectiveness of engines'
cost-based optimizers (CBOs). Today, it is a fairly expensive operation for
an engine's CBO to get table stats:
https://github.com/trin