ueries where
> only a small number of columns are projected from a wide table.
>
I agree that it's an interesting idea, but it does add a lot of complexity,
and I'm not convinced that it's better from a performance standpoint
(metadata size increase, more I/Os). If we can g
On Fri, May 30, 2025 at 3:33 PM Péter Váry wrote:
> One key advantage of introducing Physical Files is the flexibility to vary
> RowGroup sizes across columns. For instance, wide string columns could
> benefit from smaller RowGroups to reduce memory pressure, while numeric
> columns could use lar
I hope it's OK if I chime in. I'm one of the people responsible for the
format for position deletes that is used in Delta Lake and I've been
reading along with the discussion. Given that the main sticking point is
whether this compatibility is worth the associated "not pure" spec, I
figured that ma
I have some historical context that may or may not be relevant. I still
remember how we did the transition for Spark. This was ca. 2019, and there
were still many people mixing Spark 2.x and 3.0. Also, many other systems
were still using Java 7, which only supported Julian. As a result, Spark
3.0+ c