cloud-fan commented on code in PR #50937:
URL: https://github.com/apache/spark/pull/50937#discussion_r2102705961


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala:
##########
@@ -91,6 +94,50 @@ object ApplyDefaultCollationToStringType extends Rule[LogicalPlan] {
     }
   }
+  /**
+   * Determines the default collation for an object in the following order:
+   * 1. Use the object's explicitly defined default collation, if available.
+   * 2. Otherwise, use the default collation defined by the object's schema.
+   * 3. If not defined in the schema, use the default collation from the object's catalog.
+   *
+   * If none of these collations are specified, None will be persisted as the default collation,
+   * which means the system default collation `UTF8_BINARY` will be used and the plan will not be
+   * changed.
+   * This function applies to DDL commands. An object's default collation is persisted at the moment
+   * of its creation, and altering the schema or catalog collation will not affect existing objects.
+   */
+  def resolveDefaultCollation(plan: LogicalPlan): LogicalPlan = {
+    try {
+      plan match {
+        case createTable@CreateTable(
+        ResolvedIdentifier(catalog: SupportsNamespaces, identifier), _, _, tableSpec, _)

Review Comment:
   This requires other rules to resolve the identifier first. To avoid tricky rule order issues, let's also wait for the identifier to be resolved in `def fetchDefaultCollation` of this rule as well.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/ApplyDefaultCollationToStringType.scala:
##########
@@ -91,6 +94,50 @@ object ApplyDefaultCollationToStringType extends Rule[LogicalPlan] {
     }
   }
+  /**
+   * Determines the default collation for an object in the following order:
+   * 1. Use the object's explicitly defined default collation, if available.
+   * 2. Otherwise, use the default collation defined by the object's schema.
+   * 3. If not defined in the schema, use the default collation from the object's catalog.
+   *
+   * If none of these collations are specified, None will be persisted as the default collation,
+   * which means the system default collation `UTF8_BINARY` will be used and the plan will not be
+   * changed.
+   * This function applies to DDL commands. An object's default collation is persisted at the moment
+   * of its creation, and altering the schema or catalog collation will not affect existing objects.
+   */
+  def resolveDefaultCollation(plan: LogicalPlan): LogicalPlan = {
+    try {
+      plan match {
+        case createTable@CreateTable(
+        ResolvedIdentifier(catalog: SupportsNamespaces, identifier), _, _, tableSpec, _)

Review Comment:
   Furthermore, we can directly match the resolved `TableSpec` here to avoid adding a new method in `TableSpecBase`. And similarly, to avoid rule order issues, `def fetchDefaultCollation` should return the collation only when the tableSpec of `CreateTable`/`ReplaceTable` is resolved.
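
   The two suggestions above could look roughly like the sketch below. It is only an illustration of the matching idea, not the PR's code: every type here is a simplified, hypothetical stand-in for the Catalyst class of the same name (the real `CreateTable`, `ReplaceTable`, `ResolvedIdentifier` and `TableSpec` carry more fields), and the `Option[String]` return type of `fetchDefaultCollation` is an assumption.

```scala
// Hypothetical, simplified stand-ins for Catalyst nodes. These are NOT the
// real Spark classes; they only model "resolved" vs "unresolved" states.
sealed trait Identifier
case class UnresolvedIdentifier(nameParts: Seq[String]) extends Identifier
case class ResolvedIdentifier(catalog: String, nameParts: Seq[String]) extends Identifier

sealed trait TableSpecBase
case class UnresolvedTableSpec(collation: Option[String]) extends TableSpecBase
case class TableSpec(collation: Option[String]) extends TableSpecBase

sealed trait Plan
case class CreateTable(name: Identifier, tableSpec: TableSpecBase) extends Plan
case class ReplaceTable(name: Identifier, tableSpec: TableSpecBase) extends Plan
case class OtherPlan(description: String) extends Plan

object DefaultCollationSketch {
  // Returns a collation only when BOTH the identifier and the table spec are
  // resolved. Matching the resolved TableSpec type directly means no extra
  // accessor is needed on TableSpecBase. Any other plan shape yields None,
  // so the rule simply does nothing until other rules have resolved the plan.
  def fetchDefaultCollation(plan: Plan): Option[String] = plan match {
    case CreateTable(_: ResolvedIdentifier, spec: TableSpec) => spec.collation
    case ReplaceTable(_: ResolvedIdentifier, spec: TableSpec) => spec.collation
    case _ => None
  }
}
```

   With this shape the outcome does not depend on rule order: while either part is still unresolved the function returns `None`, and a later analyzer pass picks up the collation once resolution has happened.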