tshauck commented on issue #516:
URL: https://github.com/apache/datafusion-comet/issues/516#issuecomment-2150771572
Thanks, that's my understanding as well. Here's the extended explain for reference:
```
spark-sql (default)> EXPLAIN EXTENDED SELECT trim('123 ');
24/06/05 12:03:58 WARN CometSparkSessionExtensions$CometExecRule: Comet cannot execute some parts of this plan natively because Execute ExplainCommand is not supported
Unsupported op node name: Scan OneRowRelation, Full class path: org.apache.spark.sql.execution.RDDScanExec
24/06/05 12:03:58 WARN CometSparkSessionExtensions$CometExecRule: Comet cannot execute some parts of this plan natively because:
- Execute ExplainCommand is not supported
- CommandResult is not supported
== Parsed Logical Plan ==
'Project [unresolvedalias('trim(123 ), None)]
+- OneRowRelation

== Analyzed Logical Plan ==
trim(123 ): string
Project [trim(123 , None) AS trim(123 )#11]
+- OneRowRelation

== Optimized Logical Plan ==
Project [123 AS trim(123 )#11]
+- OneRowRelation

== Physical Plan ==
*(1) Project [123 AS trim(123 )#11]
+- *(1) Scan OneRowRelation[]

Time taken: 0.071 seconds, Fetched 1 row(s)
```
This is from a debug statement I put in that shows which node can't be transformed:
```
Unsupported op node name: Scan OneRowRelation, Full class path: org.apache.spark.sql.execution.RDDScanExec
```
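For concreteness, the debug statement looks roughly like this. It's a sketch of what I added locally, not code that exists in the repo, and the object and method names here are made up:

```scala
import org.apache.spark.sql.execution.SparkPlan
import org.slf4j.LoggerFactory

// Rough sketch of the local debug helper: when the rule can't convert an
// operator, log its display name and concrete class so the unsupported
// node is easy to identify.
object UnsupportedNodeLogger {
  private val logger = LoggerFactory.getLogger(getClass)

  def logUnsupported(op: SparkPlan): SparkPlan = {
    logger.warn(
      s"Unsupported op node name: ${op.nodeName}, Full class path: ${op.getClass.getName}")
    op // return the plan unchanged so the rule falls through cleanly
  }
}
```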
When a constant is used alongside a Parquet table, things seem to work fine:
```
spark-sql (default)> EXPLAIN EXTENDED SELECT trim(col), trim('123 ') FROM qq;
== Parsed Logical Plan ==
'Project [unresolvedalias('trim('col), None), unresolvedalias('trim(123 ), None)]
+- 'UnresolvedRelation [qq], [], false

== Analyzed Logical Plan ==
trim(col): string, trim(123 ): string
Project [trim(col#15, None) AS trim(col)#26, trim(123 , None) AS trim(123 )#27]
+- SubqueryAlias spark_catalog.default.qq
   +- Relation spark_catalog.default.qq[col#15] parquet

== Optimized Logical Plan ==
Project [trim(col#15, None) AS trim(col)#26, 123 AS trim(123 )#27]
+- Relation spark_catalog.default.qq[col#15] parquet

== Physical Plan ==
*(1) ColumnarToRow
+- CometProject [trim(col)#26, trim(123 )#27], [trim(col#15, None) AS trim(col)#26, 123 AS trim(123 )#27]
   +- CometScan parquet spark_catalog.default.qq[col#15] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/Users/thauck/personal/code/github.com/tshauck/arrow-datafusion-c..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<col:string>

Time taken: 0.065 seconds, Fetched 1 row(s)
```
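If anyone wants to reproduce the comparison, `qq` is just a small Parquet-backed table with one string column. The snippet below is a hypothetical reconstruction based on the `ReadSchema: struct<col:string>` in the plan above, not the exact setup I used:

```scala
// From spark-shell: recreate a table shaped like `qq` (one string column,
// Parquet format) and run the same EXPLAIN EXTENDED as above.
spark.sql("CREATE TABLE qq (col STRING) USING parquet")
spark.sql("INSERT INTO qq VALUES ('  abc  ')")
spark.sql("EXPLAIN EXTENDED SELECT trim(col), trim('123 ') FROM qq").show(truncate = false)
```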
Sounds like it makes sense to add the native node, so I'll give it a shot over the next week or so. Please let me know if you have any immediate thoughts, though.
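The shape of the change I have in mind is roughly the following. This is only a sketch: `CometOneRowRelationExec` is a name I'm inventing here, and a real version would plan a native DataFusion operator rather than build a JVM RDD:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.expressions.Attribute
import org.apache.spark.sql.execution.{LeafExecNode, RDDScanExec, SparkPlan}

// Hypothetical stand-in for the native node; it just emits a single empty
// row, which is all OneRowRelation needs.
case class CometOneRowRelationExec(output: Seq[Attribute]) extends LeafExecNode {
  override protected def doExecute(): RDD[InternalRow] =
    sparkContext.parallelize(Seq(InternalRow.empty), numSlices = 1)
}

object OneRowRelationConversion {
  // Sketch of a match that could live in CometExecRule's transform: swap out
  // the RDDScanExec that Spark plans for OneRowRelation.
  def convert(plan: SparkPlan): SparkPlan = plan.transformUp {
    case scan: RDDScanExec if scan.name == "OneRowRelation" =>
      CometOneRowRelationExec(scan.output)
  }
}
```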