Re: [I] Introduce ProjectionMask To Allow Nested Projection Pushdown [datafusion]

2025-03-15 Thread via GitHub
alamb commented on issue #2581: URL: https://github.com/apache/datafusion/issues/2581#issuecomment-2704070949 Related ticket: https://github.com/apache/datafusion/issues/14993

2025-01-07 Thread via GitHub
alamb commented on issue #2581: URL: https://github.com/apache/datafusion/issues/2581#issuecomment-2575099075 > Apologies, I should have checked the example value. 10_000 shows what I mean: Ah, yes, in this case the [UnwrapCastInComparison](https://docs.rs/datafusion/latest/datafusio

2025-01-06 Thread via GitHub
gatesn commented on issue #2581: URL: https://github.com/apache/datafusion/issues/2581#issuecomment-2574525606 Apologies, I should have checked the example value. 10_000 shows what I mean: ``` explain select x = cast(1 AS int) from '/tmp/foo.parquet'; +---+--

2025-01-04 Thread via GitHub
alamb commented on issue #2581: URL: https://github.com/apache/datafusion/issues/2581#issuecomment-2571424336 > There are also real optimizations available here. For example, suppose I write an Arrow int8 column to Parquet. The Arrow schema is serialized into Parquet metadata so at read tim

2025-01-03 Thread via GitHub
gatesn commented on issue #2581: URL: https://github.com/apache/datafusion/issues/2581#issuecomment-2569432044 I'm not sure I agree that these are two separate ideas, rather, a generalization of the existing notion of projection. Projection today is all about selecting some subset of

2025-01-02 Thread via GitHub
tustvold commented on issue #2581: URL: https://github.com/apache/datafusion/issues/2581#issuecomment-2568129442 I think that would be conflating two separate ideas, which I think would get confusing very quickly. Projection is a way to quickly and efficiently discard columns, expression ev