twalthr commented on code in PR #27603:
URL: https://github.com/apache/flink/pull/27603#discussion_r2811298675


##########
flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/utils/LogicalTypeCasts.java:
##########
@@ -293,6 +293,88 @@ public static boolean supportsExplicitCast(LogicalType sourceType, LogicalType t
         return supportsCasting(sourceType, targetType, true);
     }
 
+    /**
+     * Returns whether the cast from source type to target type is injective
+     * (uniqueness-preserving).
+     *
+     * <p>An injective cast guarantees that every distinct input value maps to a distinct output
+     * value. This property is useful for upsert key tracking through projections: if a key column
+     * is cast using an injective conversion, the uniqueness of the key is preserved.
+     *
+     * <p>This method returns {@code true} for casts that are either:
+     *
+     * <ul>
+     *   <li>Implicit casts (handled by {@link #supportsImplicitCast}), which are always safe type
+     *       widenings
+     *   <li>Explicit casts to STRING from types with deterministic string representations: TINYINT,
+     *       SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, BOOLEAN, DATE, and TIMESTAMP variants
+     * </ul>
+     *
+     * <p>Note: BYTES to STRING is explicitly excluded in this first version because they are a
+     * special case and need additional testing (i.e., invalid UTF-8 bytes are replaced with the
+     * Unicode replacement character).
+     *
+     * @param sourceType the source type
+     * @param targetType the target type
+     * @return {@code true} if the cast preserves uniqueness
+     */
+    public static boolean supportsInjectiveCast(LogicalType sourceType, LogicalType targetType) {
+        // Implicit casts are always injective (safe widening)
+        if (supportsImplicitCast(sourceType, targetType)) {
+            return true;
+        }
+
+        // Handle DISTINCT types by unwrapping
+        final LogicalTypeRoot sourceRoot = sourceType.getTypeRoot();
+        final LogicalTypeRoot targetRoot = targetType.getTypeRoot();
+
+        if (sourceRoot == DISTINCT_TYPE) {
+            return supportsInjectiveCast(((DistinctType) sourceType).getSourceType(), targetType);
+        }
+        if (targetRoot == DISTINCT_TYPE) {
+            return supportsInjectiveCast(sourceType, ((DistinctType) targetType).getSourceType());
+        }
+
+        // Check explicit injective casts (primarily to STRING)
+        return isInjectiveExplicitCast(sourceRoot, targetRoot);

Review Comment:
   This method should also support "constructed types" (e.g. ROW) whose nested field casts are all injective.
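
One way to picture the constructed-type case is a recursive check: a ROW-to-ROW cast preserves uniqueness when the field counts match and every field-wise cast does. The sketch below uses a hypothetical miniature type model (`Atomic`, `Row`, `isInjectiveAtomic`) rather than Flink's `LogicalType` classes, and its atomic rules are illustrative assumptions only.

```java
import java.util.Arrays;
import java.util.List;

// Toy model, NOT Flink code: all type and method names here are hypothetical.
class RowInjectivitySketch {

    interface Type {}

    enum Atomic implements Type { INT, BIGINT, STRING, BYTES }

    static final class Row implements Type {
        final List<Type> fields;

        Row(Type... fields) {
            this.fields = Arrays.asList(fields);
        }
    }

    // Illustrative atomic rules: identity, INT -> BIGINT widening,
    // and INT/BIGINT -> STRING. BYTES -> STRING is deliberately excluded.
    static boolean isInjectiveAtomic(Atomic source, Atomic target) {
        if (source == target) {
            return true;
        }
        if (source == Atomic.INT && target == Atomic.BIGINT) {
            return true;
        }
        if (target == Atomic.STRING) {
            return source == Atomic.INT || source == Atomic.BIGINT;
        }
        return false;
    }

    // A row cast is treated as injective only if every field cast is.
    static boolean supportsInjectiveCast(Type source, Type target) {
        if (source instanceof Row && target instanceof Row) {
            List<Type> sourceFields = ((Row) source).fields;
            List<Type> targetFields = ((Row) target).fields;
            if (sourceFields.size() != targetFields.size()) {
                return false;
            }
            for (int i = 0; i < sourceFields.size(); i++) {
                if (!supportsInjectiveCast(sourceFields.get(i), targetFields.get(i))) {
                    return false;
                }
            }
            return true;
        }
        if (source instanceof Atomic && target instanceof Atomic) {
            return isInjectiveAtomic((Atomic) source, (Atomic) target);
        }
        return false;
    }
}
```

Because the check recurses, nested rows are covered as well; a single non-injective field (such as BYTES to STRING in this toy model) makes the whole row cast non-injective.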



##########
flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/utils/LogicalTypeCasts.java:
##########
@@ -293,6 +293,88 @@ public static boolean supportsExplicitCast(LogicalType sourceType, LogicalType t
         return supportsCasting(sourceType, targetType, true);
     }
 
+    /**
+     * Returns whether the cast from source type to target type is injective
+     * (uniqueness-preserving).
+     *
+     * <p>An injective cast guarantees that every distinct input value maps to a distinct output
+     * value. This property is useful for upsert key tracking through projections: if a key column
+     * is cast using an injective conversion, the uniqueness of the key is preserved.
+     *
+     * <p>This method returns {@code true} for casts that are either:
+     *
+     * <ul>
+     *   <li>Implicit casts (handled by {@link #supportsImplicitCast}), which are always safe type
+     *       widenings
+     *   <li>Explicit casts to STRING from types with deterministic string representations: TINYINT,
+     *       SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, BOOLEAN, DATE, and TIMESTAMP variants
+     * </ul>
+     *
+     * <p>Note: BYTES to STRING is explicitly excluded in this first version because they are a
+     * special case and need additional testing (i.e., invalid UTF-8 bytes are replaced with the
+     * Unicode replacement character).
+     *
+     * @param sourceType the source type
+     * @param targetType the target type
+     * @return {@code true} if the cast preserves uniqueness
+     */
+    public static boolean supportsInjectiveCast(LogicalType sourceType, LogicalType targetType) {
+        // Implicit casts are always injective (safe widening)
+        if (supportsImplicitCast(sourceType, targetType)) {

Review Comment:
   We should not rely on `supportsImplicitCast` here; injectivity needs its own dedicated rules. For example:
   ```
   castTo(DATE)
                   .implicitFrom(DATE, TIMESTAMP_WITHOUT_TIME_ZONE)
   ```
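   is not correct for an injective cast: TIMESTAMP to DATE drops the time of day, so distinct timestamps on the same day map to the same DATE.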
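
The collision is easy to reproduce outside Flink. The sketch below uses plain `java.time` values as stand-ins for TIMESTAMP_WITHOUT_TIME_ZONE and DATE; the class and the `castToDate` helper are hypothetical and only mimic the truncation such a cast performs.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Illustration only: java.time stand-ins, not Flink's cast implementation.
class TimestampToDateDemo {

    // Mimics a TIMESTAMP -> DATE cast by discarding the time of day.
    static LocalDate castToDate(LocalDateTime timestamp) {
        return timestamp.toLocalDate();
    }

    public static void main(String[] args) {
        LocalDateTime morning = LocalDateTime.of(2024, 1, 1, 8, 0);
        LocalDateTime evening = LocalDateTime.of(2024, 1, 1, 20, 0);

        // Two distinct key values before the cast: prints false.
        System.out.println(morning.equals(evening));

        // The same value after the cast, so uniqueness is lost: prints true.
        System.out.println(castToDate(morning).equals(castToDate(evening)));
    }
}
```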
   
   We can add new rules to the builder pattern above:
   ```
   castTo(DATE)
                   .implicitFrom(DATE, TIMESTAMP_WITHOUT_TIME_ZONE)
                   .injectiveFrom(DATE)
   ```
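
To make the suggestion concrete, here is a minimal, self-contained sketch of how a separate `injectiveFrom` rule set could sit next to `implicitFrom` in such a builder. It is a toy model with its own `Root` enum and rule tables, not the actual `LogicalTypeCasts` builder; every name in it, and the INTEGER/BIGINT rule, is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Toy model, NOT Flink's LogicalTypeCasts: every name here is hypothetical.
class InjectiveRulesSketch {

    enum Root { DATE, TIMESTAMP_WITHOUT_TIME_ZONE, INTEGER, BIGINT }

    // Rule tables keyed by target root; values are the allowed source roots.
    private static final Map<Root, Set<Root>> IMPLICIT = new EnumMap<>(Root.class);
    private static final Map<Root, Set<Root>> INJECTIVE = new EnumMap<>(Root.class);

    static final class Builder {
        private final Root target;
        private final Set<Root> implicitSources = EnumSet.noneOf(Root.class);
        private final Set<Root> injectiveSources = EnumSet.noneOf(Root.class);

        Builder(Root target) {
            this.target = target;
        }

        Builder implicitFrom(Root... sources) {
            implicitSources.addAll(Arrays.asList(sources));
            return this;
        }

        // Injectivity is declared on its own, independent of implicit casting.
        Builder injectiveFrom(Root... sources) {
            injectiveSources.addAll(Arrays.asList(sources));
            return this;
        }

        void build() {
            IMPLICIT.put(target, implicitSources);
            INJECTIVE.put(target, injectiveSources);
        }
    }

    static Builder castTo(Root target) {
        return new Builder(target);
    }

    static {
        castTo(Root.DATE)
                .implicitFrom(Root.DATE, Root.TIMESTAMP_WITHOUT_TIME_ZONE)
                .injectiveFrom(Root.DATE)
                .build();
        castTo(Root.BIGINT)
                .implicitFrom(Root.INTEGER, Root.BIGINT)
                .injectiveFrom(Root.INTEGER, Root.BIGINT)
                .build();
    }

    static boolean supportsImplicitCast(Root source, Root target) {
        Set<Root> sources = IMPLICIT.get(target);
        return sources != null && sources.contains(source);
    }

    static boolean supportsInjectiveCast(Root source, Root target) {
        Set<Root> sources = INJECTIVE.get(target);
        return sources != null && sources.contains(source);
    }
}
```

With the two rule sets kept apart, TIMESTAMP_WITHOUT_TIME_ZONE to DATE remains an implicit cast in this model while `supportsInjectiveCast` rejects it, which is the separation the comment asks for.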


