andygrove opened a new issue, #1536:
URL: https://github.com/apache/datafusion-comet/issues/1536
   ### Describe the bug
   
   `native_datafusion` scan is only enabled when `spark.comet.exec.enabled` is set.
   
   In `CometScanRule` we add `CometScanExec` as a placeholder for a `native_datafusion` scan:
   
   ```scala
   case scanExec @ FileSourceScanExec(
         HadoopFsRelation(_, partitionSchema, _, _, fileFormat, _),
         _: Seq[_],
         requiredSchema,
         _,
         _,
         _,
         _,
         _,
         _)
       if CometScanExec.isFileFormatSupported(fileFormat)
         && CometNativeScanExec.isSchemaSupported(requiredSchema)
         && CometNativeScanExec.isSchemaSupported(partitionSchema)
         // TODO we only enable full native scan if COMET_EXEC_ENABLED is enabled
         // but this is not really what we want .. we currently insert `CometScanExec`
         // here and then it gets replaced with `CometNativeScanExec` in `CometExecRule`
         // but that only happens if `COMET_EXEC_ENABLED` is enabled
         && COMET_EXEC_ENABLED.get()
         && COMET_NATIVE_SCAN_IMPL.get() == CometConf.SCAN_NATIVE_DATAFUSION =>
     logInfo("Comet extension enabled for v1 full native Scan")
     CometScanExec(scanExec, session)
   ```
   
   and then in `CometExecRule` we replace it:
   
   ```scala
   plan.transformUp {
     // Fully native scan for V1
     case scan: CometScanExec
         if COMET_NATIVE_SCAN_IMPL.get().equals(CometConf.SCAN_NATIVE_DATAFUSION) =>
       val nativeOp = QueryPlanSerde.operator2Proto(scan).get
       CometNativeScanExec(nativeOp, scan.wrapped, scan.session)
   ```
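
   To make the coupling between the two rules easier to see, here is a minimal sketch that models them with stand-in types. Everything in it (`Plan`, `SparkScan`, `PlaceholderScan`, `NativeScan`, `Conf`, `TwoStageSketch`) is hypothetical and only mirrors the shape of the code above; it is not the real Comet or Spark API. It assumes, as the TODO comment says, that the replacement step only runs when exec is enabled.

   ```scala
   // Stand-in plan nodes (hypothetical, not the real Spark/Comet classes).
   sealed trait Plan
   case object SparkScan extends Plan        // stands in for FileSourceScanExec
   case object PlaceholderScan extends Plan  // stands in for CometScanExec
   case object NativeScan extends Plan       // stands in for CometNativeScanExec

   // Hypothetical holder for the two settings discussed in this issue.
   final case class Conf(execEnabled: Boolean, scanImpl: String)

   object TwoStageSketch {
     val NativeDataFusion = "native_datafusion"

     // Models CometScanRule: the placeholder is only inserted when exec is enabled,
     // because otherwise nothing would later replace it.
     def scanRule(plan: Plan, conf: Conf): Plan = plan match {
       case SparkScan if conf.execEnabled && conf.scanImpl == NativeDataFusion =>
         PlaceholderScan
       case other => other
     }

     // Models CometExecRule: swaps the placeholder for the fully native scan.
     def execRule(plan: Plan, conf: Conf): Plan = plan match {
       case PlaceholderScan if conf.scanImpl == NativeDataFusion => NativeScan
       case other => other
     }

     // Assumption taken from the TODO comment: the exec rule is skipped
     // entirely when exec is disabled.
     def optimize(conf: Conf): Plan = {
       val afterScan = scanRule(SparkScan, conf)
       if (conf.execEnabled) execRule(afterScan, conf) else afterScan
     }
   }
   ```

   In this model, `TwoStageSketch.optimize(Conf(execEnabled = false, scanImpl = "native_datafusion"))` leaves the plan as `SparkScan`, which is the behavior described above: the scan setting alone has no effect unless exec is also enabled.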
   
   
   
   ### Steps to reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

