andygrove opened a new issue, #1040:
URL: https://github.com/apache/datafusion-comet/issues/1040

   ### What is the problem the feature request solves?
   
   Comet has its own native code for decoding Parquet data into Arrow arrays. 
This issue is for discussing whether to delegate these operations to the [parquet 
crate](https://crates.io/crates/parquet) instead.
   
   The benefits of this approach include:
   
   - Support for complex types. The parquet crate already supports reading maps 
and structs. We could implement the same support in the Comet native code, but 
it would probably be a lot of work.
   - Support for StringView, along with the related performance 
optimizations (see [1] and [2] for details)
   - Benefit from ongoing optimization work and an active community
   - Reduced maintenance effort in Comet
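
To make the StringView point concrete, here is a minimal sketch of the idea behind German-style strings as described in [1] and [2]: strings of up to 12 bytes are stored inline in the view, while longer strings keep a 4-byte prefix plus a reference into a shared data buffer. This is a simplified illustration using only the standard library, not the arrow or parquet crate API. The `View` enum and the `make_view`, `prefix`, and `resolve` helpers are hypothetical names, and the real layout packs each view into 16 bytes rather than using a Rust enum.

```rust
/// Longest string (in bytes) that is stored directly inside the view.
const MAX_INLINE: usize = 12;

/// Simplified, hypothetical model of a string view (not the real packed layout).
#[derive(Debug, Clone, Copy, PartialEq)]
enum View {
    /// Short string: the bytes live in the view itself.
    Inline { len: u32, data: [u8; MAX_INLINE] },
    /// Long string: a 4-byte prefix plus a location in a shared buffer.
    Ref { len: u32, prefix: [u8; 4], buffer_index: u32, offset: u32 },
}

/// Builds a view for `s`, appending to `buffer` only when `s` is too long to inline.
fn make_view(s: &str, buffer_index: u32, buffer: &mut Vec<u8>) -> View {
    let bytes = s.as_bytes();
    if bytes.len() <= MAX_INLINE {
        let mut data = [0u8; MAX_INLINE];
        data[..bytes.len()].copy_from_slice(bytes);
        View::Inline { len: bytes.len() as u32, data }
    } else {
        let offset = buffer.len() as u32;
        buffer.extend_from_slice(bytes);
        let mut prefix = [0u8; 4];
        prefix.copy_from_slice(&bytes[..4]);
        View::Ref { len: bytes.len() as u32, prefix, buffer_index, offset }
    }
}

/// Returns the first four bytes (zero-padded for short strings) without
/// touching the data buffers, which allows cheap inequality checks.
fn prefix(view: &View) -> [u8; 4] {
    match view {
        View::Inline { data, .. } => [data[0], data[1], data[2], data[3]],
        View::Ref { prefix, .. } => *prefix,
    }
}

/// Resolves a view back to the string it describes.
fn resolve<'a>(view: &'a View, buffers: &'a [Vec<u8>]) -> &'a str {
    let bytes = match view {
        View::Inline { len, data } => &data[..*len as usize],
        View::Ref { len, buffer_index, offset, .. } => {
            let start = *offset as usize;
            &buffers[*buffer_index as usize][start..start + *len as usize]
        }
    };
    std::str::from_utf8(bytes).expect("views are built from valid UTF-8")
}

fn main() {
    let mut buffers: Vec<Vec<u8>> = vec![Vec::new()];
    let short = make_view("comet", 0, &mut buffers[0]);
    let long = make_view("apache-datafusion-comet", 0, &mut buffers[0]);

    // The short string never touches the shared buffer.
    assert!(matches!(short, View::Inline { .. }));
    assert!(matches!(long, View::Ref { .. }));
    assert_eq!(buffers[0].len(), "apache-datafusion-comet".len());

    // Differing prefixes prove the strings differ without reading the buffer.
    assert_ne!(prefix(&short), prefix(&long));

    println!("{} / {}", resolve(&short, &buffers), resolve(&long, &buffers));
}
```

The potential benefit for a Parquet reader is that views can point at already-decoded bytes instead of copying every string into a new contiguous buffer; see [1] and [2] for the actual design and measurements.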
   
   Possible downsides of this approach:
   
   - We may lose the performance benefit of reusing mutable buffers (although 
this reuse also comes with a maintenance cost)
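
The trade-off can be sketched roughly as follows, using only the standard library. This is an illustration of the general pattern, not Comet's or the parquet crate's actual code: `decode_into` and `decode_fresh` are hypothetical stand-ins for a decoder that either refills a caller-owned buffer or allocates a new one for every batch.

```rust
/// Hypothetical decoder that refills a caller-owned buffer in place.
/// `clear` drops the old values but keeps the existing allocation.
fn decode_into(dst: &mut Vec<i32>, batch_index: usize, batch_size: usize) {
    dst.clear();
    let start = (batch_index * batch_size) as i32;
    dst.extend(start..start + batch_size as i32);
}

/// Hypothetical decoder that allocates a fresh buffer for every batch.
fn decode_fresh(batch_index: usize, batch_size: usize) -> Vec<i32> {
    let mut out = Vec::with_capacity(batch_size);
    decode_into(&mut out, batch_index, batch_size);
    out
}

fn main() {
    let batch_size = 4;

    // Reuse: one allocation serves every batch, but each batch must be
    // fully consumed before the next call overwrites it.
    let mut reusable: Vec<i32> = Vec::with_capacity(batch_size);
    let capacity_before = reusable.capacity();
    for i in 0..3 {
        decode_into(&mut reusable, i, batch_size);
        println!("reused buffer, batch {}: {:?}", i, reusable);
    }
    assert_eq!(reusable.capacity(), capacity_before);

    // Fresh allocation: every batch is independently owned and can outlive
    // the loop, at the cost of one allocation per batch.
    let batches: Vec<Vec<i32>> = (0..3).map(|i| decode_fresh(i, batch_size)).collect();
    assert_eq!(batches.len(), 3);
    println!("fresh buffers: {:?}", batches);
}
```

The reused buffer avoids repeated allocation, but callers must respect the overwrite-on-next-batch contract, which is one plausible source of the maintenance cost mentioned above.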
   
   [1] https://datafusion.apache.org/blog/2024/09/13/string-view-german-style-strings-part-1/
   [2] https://datafusion.apache.org/blog/2024/09/13/string-view-german-style-strings-part-2/
   
   
   ### Describe the potential solution
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

