Re: [PR] fix: Avoid to call import and export Arrow array for native execution [datafusion-comet]

via GitHub Thu, 14 Nov 2024 14:51:54 -0800


kazuyukitanimura commented on PR #1055:
URL: https://github.com/apache/datafusion-comet/pull/1055#issuecomment-2477566237


   @viirya Let me try to convince you one more time.
   
   >  For queries running many minutes or hours, it has no difference.
   
   If the query time is long because of the data size, this PR still helps 
(5% at least?). If the query time is long because the query itself is complex, 
this PR has less value.
   
   > the change is not small
   
   The latest change is pretty small after following your change to the Arrow 
spec memory model. There are only 3 main changes:
   1. avoid importing: common/src/main/java/org/apache/comet/parquet/ColumnReader.java
   2. avoid exporting: common/src/main/scala/org/apache/comet/vector/NativeUtil.scala
   3. type handling fix: native/core/src/execution/utils.rs
   
   The rest of the changes only pass along the information that the new mode 
is in use.
   
   > doesn't look like in good design to me.
   
   Do you have any recommendations here? What if I add a feature flag to 
enable/disable this new code flow? The latest change is fully backward 
compatible, so we could easily enable/disable it with a single flag that 
controls the `hasNativeOperations` boolean.
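   
   To make the flag idea concrete, here is a rough sketch of how a single 
switch could gate the new flow. It is only an illustration, not the PR's actual 
code: the class name, the config key `example.nativeFlow.enabled`, and the 
`chooseFlow` helper are all hypothetical, and only the `hasNativeOperations` 
name comes from the PR.

```java
import java.util.Map;

// Hypothetical sketch: apart from the `hasNativeOperations` name, nothing here
// is taken from the Comet code base.
public class NativeFlowFlagSketch {
    // Hypothetical configuration key, used only for this illustration.
    public static final String FLAG_KEY = "example.nativeFlow.enabled";

    // The new flow is taken only when the flag is on AND the plan actually
    // contains native operators. With the flag absent or "false", the result
    // is always false, so behavior falls back to the existing code path.
    public static boolean hasNativeOperations(
            Map<String, String> conf, boolean planHasNativeOperators) {
        boolean flagEnabled =
            Boolean.parseBoolean(conf.getOrDefault(FLAG_KEY, "false"));
        return flagEnabled && planHasNativeOperators;
    }

    // Stand-in for the branch that skips or performs Arrow import/export.
    public static String chooseFlow(boolean hasNativeOperations) {
        return hasNativeOperations ? "skip-import-export" : "import-export";
    }

    public static void main(String[] args) {
        Map<String, String> flagOn = Map.of(FLAG_KEY, "true");
        Map<String, String> flagOff = Map.of();

        System.out.println(chooseFlow(hasNativeOperations(flagOn, true)));
        System.out.println(chooseFlow(hasNativeOperations(flagOff, true)));
    }
}
```

   Defaulting the flag to off in this sketch means the existing import/export 
path stays in effect unless someone opts in, which is what makes the change 
easy to disable.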


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

