dejankrak-db commented on code in PR #49772:
URL: https://github.com/apache/spark/pull/49772#discussion_r1952626714
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveDDLCommandStringTypes.scala:
##########
@@ -155,22 +123,22 @@ object ResolveDefaultStringTypes extends Rule[LogicalPlan] {
     dataType.existsRecursively(isDefaultStringType)

   private def isDefaultStringType(dataType: DataType): Boolean = {
+    // STRING (without explicit collation) is considered default string type.
+    // STRING COLLATE <collation_name> (with explicit collation) is not considered
+    // default string type even when explicit collation is UTF8_BINARY (default collation).
     dataType match {
-      case st: StringType =>
-        // should only return true for StringType object and not StringType("UTF8_BINARY")
-        st.eq(StringType) || st.isInstanceOf[TemporaryStringType]
+      // should only return true for StringType object and not for StringType("UTF8_BINARY")
+      case st: StringType => st.eq(StringType)

Review Comment:
   Actually, putting `case StringType => true` here instead breaks the following scenario:

   CREATE TABLE foo (c1 STRING COLLATE UTF8_BINARY) DEFAULT COLLATION UNICODE

   In that case, the collation of c1 is set to UNICODE: it is replaced by the table-level collation, because isDefaultStringType would return true with the proposed change. The correct behavior, as confirmed with Serge, is to keep the UTF8_BINARY collation for this column, as explicitly specified in the DDL command, i.e. to skip replacing its type with the table-level collation. That is achieved by the reference check as originally implemented, hence the need for it.
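   To illustrate the difference, here is a minimal, self-contained sketch. It does not use Spark's classes: `SketchStringType`, `isDefaultByReference` and `isDefaultByEquality` are hypothetical names, and the sketch assumes that string type equality compares only the collation id, so that an explicitly constructed UTF8_BINARY instance is equal to the singleton without being the same object.

```scala
object CollationSketch {
  // Hypothetical stand-in for a collation-aware string type.
  // Assumption: equality is based on the collation id only.
  class SketchStringType(val collationId: Int) {
    override def equals(other: Any): Boolean = other match {
      case s: SketchStringType => s.collationId == collationId
      case _ => false
    }
    override def hashCode(): Int = collationId.hashCode
  }

  // Singleton standing in for plain STRING (no explicit collation).
  object SketchStringType extends SketchStringType(0)

  // Reference check: true only for the singleton itself.
  def isDefaultByReference(dt: SketchStringType): Boolean =
    dt.eq(SketchStringType)

  // Stable-identifier pattern: compares with ==, so any instance that is
  // equal to the singleton matches as well.
  def isDefaultByEquality(dt: SketchStringType): Boolean = dt match {
    case SketchStringType => true
    case _ => false
  }

  def main(args: Array[String]): Unit = {
    val implicitString = SketchStringType              // STRING
    val explicitUtf8Binary = new SketchStringType(0)   // STRING COLLATE UTF8_BINARY

    println(isDefaultByReference(implicitString))      // true
    println(isDefaultByReference(explicitUtf8Binary))  // false
    println(isDefaultByEquality(implicitString))       // true
    println(isDefaultByEquality(explicitUtf8Binary))   // true
  }
}
```

   Under these assumptions, only the reference check can tell the explicit UTF8_BINARY column apart from an unannotated STRING column, which is what the DDL scenario above relies on.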