Hey Iceberg Community,

I've been working on a proposal <https://docs.google.com/document/d/1H9uYt53Q1_CcOXOfLcr0hXRxvqflg_k_xeVorMLrWbM> to extend the statistics currently standardized in Iceberg. I looked into which statistics some query engines use and tried to fill the gaps (credit also goes to Denys K for laying the groundwork). The motivation is to make Iceberg the source of truth for statistics across all engines.

Meanwhile, other proposals have been moving forward (Restructuring col-stats <https://docs.google.com/document/d/1uvbrwwAJW2TgsnoaIcwAFpjbhHkBUL5wY_24nKgtt9I/edit?tab=t.0#heading=h.hs6r9d26w1y2>, Restructuring metadata <https://docs.google.com/document/d/1k4x8utgh41Sn1tr98eynDKCWq035SV_f75rtNHcerVw/edit?tab=t.0#heading=h.unn922df0zzw>) that might overlap with mine. Let’s see how much of my proposal still holds up in light of these developments.
Any feedback is appreciated!

Gabor