github-actions[bot] commented on code in PR #61317:
URL: https://github.com/apache/doris/pull/61317#discussion_r2937836214


##########
fe/fe-core/src/main/java/org/apache/doris/job/util/StreamingJobUtils.java:
##########
@@ -355,7 +372,8 @@ public static List<CreateTableCommand> 
generateCreateTableCmds(String targetDb,
                     ImmutableList.of() // clusterKeyColumnNames
             );
             CreateTableCommand createtblCmd = new 
CreateTableCommand(Optional.empty(), createtblInfo);
-            createtblCmds.add(createtblCmd);
+            // Key: source (PG/MySQL) table name; Value: command that creates 
the Doris target table
+            createtblCmds.put(table, createtblCmd);
         }

Review Comment:
   [Medium] **Multi-table merge schema safety**: When two source tables map to 
the same Doris target table, this `LinkedHashMap` will contain two entries with 
different keys but both `CreateTableCommand` values targeting the same Doris 
table name. In `createTableIfNotExists()`, the second entry's 
`CreateTableCommand` is silently skipped (because the table already exists 
after the first entry creates it).
   
   This means the Doris target table is created with **only the first source 
table's schema**. If the two source tables have different columns, data from 
the second source may fail at stream-load time.
   
   Consider either:
   1. Validating that all source tables mapping to the same target have 
compatible schemas (at least same column names and compatible types), or
   2. Documenting this limitation clearly, or
   3. Merging column definitions (union of columns) when building the 
`CreateTableCommand` for shared targets.
   
   The current test works because both PG source tables have identical schemas 
`(id int, name varchar(200))`.



##########
fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/DataSourceConfigValidator.java:
##########
@@ -43,11 +43,33 @@ public class DataSourceConfigValidator {
             DataSourceConfigKeys.SSL_ROOTCERT
     );
 
+    // Known suffixes for per-table config keys (format: 
"table.<tableName>.<suffix>")
+    private static final Set<String> ALLOW_TABLE_LEVEL_SUFFIXES = 
Sets.newHashSet(
+            DataSourceConfigKeys.TABLE_TARGET_TABLE_SUFFIX
+    );
+
+    private static final String TABLE_LEVEL_PREFIX = 
DataSourceConfigKeys.TABLE + ".";
+
     public static void validateSource(Map<String, String> input) throws 
IllegalArgumentException {
         for (Map.Entry<String, String> entry : input.entrySet()) {
             String key = entry.getKey();
             String value = entry.getValue();
 
+            if (key.startsWith(TABLE_LEVEL_PREFIX)) {
+                // per-table config key must be exactly: 
table.<tableName>.<suffix>
+                // reject malformed keys like "table.exclude_columns" (missing 
tableName)
+                String[] parts = key.split("\\.", -1);
+                if (parts.length != 3 || parts[1].isEmpty()) {
+                    throw new IllegalArgumentException("Malformed per-table 
config key: '" + key

Review Comment:
   [Low] The `split("\\.", -1)` + `parts.length != 3` approach assumes table 
names never contain dots. While this is typically true for PG/MySQL, note that 
`ConfigUtil.parseAllTargetTableMappings` on the cdc_client side uses a 
prefix/suffix substring approach that correctly handles dots in table names. 
The two modules use **inconsistent parsing strategies** for the same key format.
   
   This is a minor issue in practice (PG/MySQL identifiers rarely contain 
dots), but worth noting as a design inconsistency. Consider aligning both sides 
to use the same parsing strategy — the prefix/suffix approach is strictly more 
correct.



##########
fe/fe-common/src/main/java/org/apache/doris/job/cdc/DataSourceConfigKeys.java:
##########
@@ -38,6 +38,11 @@ public class DataSourceConfigKeys {
     public static final String SSL_MODE = "ssl_mode";
     public static final String SSL_ROOTCERT = "ssl_rootcert";
 
+    // per-table config: key format is "table.<tableName>.<suffix>"
+    public static final String TABLE = "table";
+    public static final String TABLE_EXCLUDE_COLUMNS_SUFFIX = 
"exclude_columns";
+    public static final String TABLE_TARGET_TABLE_SUFFIX = "target_table";

Review Comment:
   [Medium] `TABLE_EXCLUDE_COLUMNS_SUFFIX` is defined here but is **not** added 
to `ALLOW_TABLE_LEVEL_SUFFIXES` in `DataSourceConfigValidator`, meaning any 
config key like `table.foo.exclude_columns` would be rejected at validation 
time with "Unknown per-table config key". If this is a planned future feature, 
consider adding a `// TODO` comment here and removing this constant until it's 
actually supported. Shipping an unused public constant that looks usable but 
gets rejected at runtime is confusing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to