szehon-ho commented on code in PR #49840:
URL: https://github.com/apache/spark/pull/49840#discussion_r1950141582


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala:
##########
@@ -265,6 +267,23 @@ object Literal {
       s"Literal must have a corresponding value to ${dataType.catalogString}, 
" +
       s"but class ${Utils.getSimpleName(value.getClass)} found.")
   }
+
+  def fromSQL(sql: String): Expression = {
+    CatalystSqlParser.parseExpression(sql).transformUp {
+      case u: UnresolvedFunction =>
+        assert(u.nameParts.length == 1)
+        assert(!u.isDistinct)
+        assert(u.filter.isEmpty)
+        assert(!u.ignoreNulls)
+        assert(u.orderingWithinGroup.isEmpty)
+        assert(!u.isInternal)
+        
FunctionRegistry.builtin.lookupFunction(FunctionIdentifier(u.nameParts.head), 
u.arguments)
+    } match {
+      case c: Cast if c.needsTimeZone =>

Review Comment:
   @cloud-fan  sure, let me do that.
   
   BTW, I looked a little bit and couldnt reproduce a failure with the current 
implementation using a following unit test with a nested cast:
   ```
     test("SPARK-51119: array of timestamp should have timezone if default 
values castable") {
       withTable("t") {
         sql(s"CREATE TABLE t(key int, c ARRAY<STRING> DEFAULT " +
           s"ARRAY(CAST(timestamp '2018-11-17' AS STRING))) " +
           s"USING parquet")
         sql("INSERT INTO t (key) VALUES(1)")
         checkAnswer(sql("select * from t"), Row(1, Array("2018-11-17 
00:00:00")))
       }
     }
     ```
   
   Unlike the failing case of top-level cast:
   ```
     test("SPARK-46958: timestamp should have timezone for resolvable if 
default values castable") {
       val defaults = Seq("timestamp '2018-11-17'", "CAST(timestamp 
'2018-11-17' AS STRING)")
       defaults.foreach { default =>
         withTable("t") {
           sql(s"CREATE TABLE t(key int, c STRING DEFAULT $default) " +
             s"USING parquet")
           sql("INSERT INTO t (key) VALUES(1)")
           checkAnswer(sql("select * from t"), Row(1, "2018-11-17 00:00:00"))
         }
       }
     }
   ```
   
   EXISTS_DEFAULT is saved without a cast in the first case:  
`ARRAY('2018-11-17 00:00:00')` (looks like it got evaluated)
   and with a cast in the second case:  `CAST(TIMESTAMP '2018-11-17 00:00:00' 
AS STRING)`
   
   So I think in this particular scenario, it doesnt matter.  But agree that it 
is better to have it, as we are making  a generic method.  



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to