viirya opened a new issue, #1059:
URL: https://github.com/apache/datafusion-comet/issues/1059

   ### Describe the bug
   
   This was found during debugging CI failures of #1050.
   
   One example of a failing test is `date_add with int scalars` in 
`CometExpressionSuite`. The query is `SELECT _20 + CAST(2 as $intType) from 
tbl`, whose plan is simply CometScan + CometProject + Spark ColumnarToRowExec.
   
   CometProject (i.e., DataFusion ProjectionExec) doesn't store arrays 
internally, so the only way the safety check can fail is if the arrays are not 
released before we fill the next values into the CometBuffers.
   
   In Spark `ColumnarToRowExec`, once all rows have been pulled out of the 
current ColumnarBatch, the batch reference is simply set to null to release the 
JVM object, but `close` is never called on the batch object to release vector 
resources (e.g., for Comet, Arrow arrays). The fix is more complicated than 
just adding a `close` call there, because some Spark components (e.g., the 
Parquet reader) use WritableColumnVector. Once `close` is called on a 
WritableColumnVector, the vector is no longer writable.
   
   To fix this completely, we need some changes in Spark. I did a quick 
experiment locally in Spark and verified that if `close` is properly called 
on the vectors that are not WritableColumnVector there, the failed tests pass 
without failing the safety check.
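   
   A minimal sketch of the idea behind that experiment, not the actual Spark 
change. The classes `SketchVector`, `WritableVector`, and `NativeBackedVector` 
and the helper `releaseExhaustedBatch` are hypothetical stand-ins invented for 
illustration; they are not Spark or Comet APIs. It only shows the rule "close 
every vector of an exhausted batch except the writable ones":

```java
import java.util.List;

public class CloseNonWritableSketch {
    // Hypothetical stand-in for a column vector; NOT Spark's ColumnVector.
    static abstract class SketchVector implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true; // a real vector would release its buffers here
        }
    }

    // Hypothetical stand-in for a writable vector, which must stay
    // writable after the batch is exhausted and therefore is not closed.
    static class WritableVector extends SketchVector {}

    // Hypothetical stand-in for a vector backed by native (e.g., Arrow) arrays.
    static class NativeBackedVector extends SketchVector {}

    // Closes every non-writable vector of an exhausted batch and
    // returns how many vectors were closed.
    static int releaseExhaustedBatch(List<SketchVector> columns) {
        int closedCount = 0;
        for (SketchVector v : columns) {
            if (!(v instanceof WritableVector)) {
                v.close();
                closedCount++;
            }
        }
        return closedCount;
    }

    public static void main(String[] args) {
        SketchVector writable = new WritableVector();
        SketchVector nativeBacked = new NativeBackedVector();
        int closed = releaseExhaustedBatch(List.of(writable, nativeBacked));
        System.out.println("closed=" + closed
            + " writableClosed=" + writable.closed
            + " nativeClosed=" + nativeBacked.closed);
    }
}
```

   In this sketch the writable vector is left untouched, while the 
native-backed vector is released as soon as the batch is exhausted.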
   
   
   
   ### Steps to reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

