huaxingao commented on code in PR #1830:
URL: https://github.com/apache/datafusion-comet/pull/1830#discussion_r2127844183


##########
common/src/main/java/org/apache/comet/parquet/TypeUtil.java:
##########
@@ -74,7 +74,7 @@ public static ColumnDescriptor convertToParquet(StructField field) {
       builder = Types.primitive(PrimitiveType.PrimitiveTypeName.INT64, repetition);
     } else if (type == DataTypes.BinaryType) {
       builder = Types.primitive(PrimitiveType.PrimitiveTypeName.BINARY, repetition);
-    } else if (type == DataTypes.StringType) {
+    } else if (type == DataTypes.StringType || type.sameType(DataTypes.StringType)) {

Review Comment:
   This is to support string collation in Spark 4.0 (e.g. [test("Check order by on table with collated string column")](https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/collation/CollationSuite.scala#L1117)).
   Without string collation, the code path goes to 
https://github.com/apache/spark/blob/master/sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala#L89 
which uses the singleton `DataTypes.StringType`, so `type == DataTypes.StringType` holds. 
With string collation, it goes to 
https://github.com/apache/spark/blob/master/sql/api/src/main/scala/org/apache/spark/sql/catalyst/parser/DataTypeAstBuilder.scala#L94 
instead, so the reference comparison `type == DataTypes.StringType` fails. I added 
`type.sameType(DataTypes.StringType)` to let collated strings pass. Actually, I 
think it should be:
   ```
   else if (
     type == DataTypes.StringType ||
     (type.sameType(DataTypes.StringType) && isSpark40Plus())
   )
   ```
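
To illustrate why the `==` check alone misses collated strings, here is a minimal Spark-free sketch. Everything in it is hypothetical: `StringTypeSketch` stands in for Spark's string type, the `isSpark40Plus` parameter stands in for the version check, and the `instanceof` test stands in for `type.sameType(DataTypes.StringType)`. It does not model how Spark's `sameType` actually treats individual collations.

```java
// Hypothetical, Spark-free sketch: StringTypeSketch and isSpark40Plus are
// illustrative stand-ins, not Spark or Comet APIs.
final class CollatedStringCheck {

  /** Stand-in for a string data type that carries a collation id. */
  static final class StringTypeSketch {
    /** Plays the role of the singleton DataTypes.StringType. */
    static final StringTypeSketch DEFAULT = new StringTypeSketch(0);

    final int collationId;

    StringTypeSketch(int collationId) {
      this.collationId = collationId;
    }
  }

  /**
   * Mirrors the proposed condition. The instanceof test stands in for
   * type.sameType(DataTypes.StringType); that equivalence is an assumption.
   */
  static boolean isString(Object type, boolean isSpark40Plus) {
    return type == StringTypeSketch.DEFAULT
        || (isSpark40Plus && type instanceof StringTypeSketch);
  }

  public static void main(String[] args) {
    // A collated type is a fresh instance, so it is never the singleton.
    Object collated = new StringTypeSketch(1);

    System.out.println(collated == StringTypeSketch.DEFAULT); // false
    System.out.println(isString(StringTypeSketch.DEFAULT, false)); // true
    System.out.println(isString(collated, false)); // false
    System.out.println(isString(collated, true)); // true
  }
}
```

In this sketch the singleton still takes the fast reference-equality path, and the broader check only applies when the version flag is set, so behavior on older Spark versions would stay unchanged.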



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

