Re: [PR] Normalize sequence values to machine format before templating [shardingsphere]

via GitHub Mon, 22 Dec 2025 09:39:32 -0800


Copilot commented on code in PR #37465:
URL: https://github.com/apache/shardingsphere/pull/37465#discussion_r2640676274



##########
kernel/data-pipeline/dialect/postgresql/src/main/java/org/apache/shardingsphere/data/pipeline/postgresql/sqlbuilder/ddl/column/PostgreSQLColumnPropertiesAppender.java:
##########
@@ -147,6 +150,33 @@ private void columnFormatter(final Map<String, Object> 
column, final Collection<
         column.put("cltype", parseTypeName(column.get("cltype").toString()));
     }
     
+    private void normalizeSequenceValues(final Map<String, Object> column) {
+        normalizeSequenceValue(column, "seqincrement");
+        normalizeSequenceValue(column, "seqstart");
+        normalizeSequenceValue(column, "seqmin");
+        normalizeSequenceValue(column, "seqmax");
+        normalizeSequenceValue(column, "seqcache");
+    }
+    
+    private void normalizeSequenceValue(final Map<String, Object> column, 
final String key) {
+        if (!column.containsKey(key)) {
+            return;
+        }
+        Object value = column.get(key);
+        if (null == value || value instanceof Number) {
+            return;
+        }
+        String sanitized = 
NON_DIGIT_WITH_SIGN_PATTERN.matcher(value.toString()).replaceAll("");
+        if (sanitized.isEmpty()) {

Review Comment:
   When the sanitized value is empty after removing non-digit and non-sign 
characters, the method returns early without modifying the column map. This 
means invalid values containing only formatting characters (like spaces, 
commas) will remain as strings in the map, which could cause template rendering 
issues. Consider removing the key from the map or setting it to null in this 
case to ensure consistent handling.
   ```suggestion
           if (sanitized.isEmpty()) {
               column.put(key, null);
   ```



##########
kernel/data-pipeline/dialect/postgresql/src/main/java/org/apache/shardingsphere/data/pipeline/postgresql/sqlbuilder/ddl/column/PostgreSQLColumnPropertiesAppender.java:
##########
@@ -147,6 +150,33 @@ private void columnFormatter(final Map<String, Object> 
column, final Collection<
         column.put("cltype", parseTypeName(column.get("cltype").toString()));
     }
     
+    private void normalizeSequenceValues(final Map<String, Object> column) {
+        normalizeSequenceValue(column, "seqincrement");
+        normalizeSequenceValue(column, "seqstart");
+        normalizeSequenceValue(column, "seqmin");
+        normalizeSequenceValue(column, "seqmax");
+        normalizeSequenceValue(column, "seqcache");
+    }
+    
+    private void normalizeSequenceValue(final Map<String, Object> column, 
final String key) {
+        if (!column.containsKey(key)) {
+            return;
+        }
+        Object value = column.get(key);
+        if (null == value || value instanceof Number) {
+            return;
+        }
+        String sanitized = 
NON_DIGIT_WITH_SIGN_PATTERN.matcher(value.toString()).replaceAll("");
+        if (sanitized.isEmpty()) {
+            return;
+        }
+        try {
+            column.put(key, Long.parseLong(sanitized));
+        } catch (final NumberFormatException ignored) {
+            column.put(key, sanitized);

Review Comment:
   The fallback behavior on NumberFormatException stores the sanitized string 
back into the column map. This could lead to inconsistent types in the map 
where some sequence values are Long and others are String. The FreeMarker 
templates use `?c?number` which expects numeric types, and having string values 
could cause template rendering failures. Consider either removing the key when 
parsing fails or logging a warning about the invalid value.
   ```suggestion
               column.remove(key);
   ```



##########
kernel/data-pipeline/dialect/postgresql/src/main/java/org/apache/shardingsphere/data/pipeline/postgresql/sqlbuilder/ddl/column/PostgreSQLColumnPropertiesAppender.java:
##########
@@ -147,6 +150,33 @@ private void columnFormatter(final Map<String, Object> 
column, final Collection<
         column.put("cltype", parseTypeName(column.get("cltype").toString()));
     }
     
+    private void normalizeSequenceValues(final Map<String, Object> column) {
+        normalizeSequenceValue(column, "seqincrement");
+        normalizeSequenceValue(column, "seqstart");
+        normalizeSequenceValue(column, "seqmin");
+        normalizeSequenceValue(column, "seqmax");
+        normalizeSequenceValue(column, "seqcache");
+    }
+    
+    private void normalizeSequenceValue(final Map<String, Object> column, 
final String key) {
+        if (!column.containsKey(key)) {
+            return;
+        }
+        Object value = column.get(key);
+        if (null == value || value instanceof Number) {
+            return;
+        }
+        String sanitized = 
NON_DIGIT_WITH_SIGN_PATTERN.matcher(value.toString()).replaceAll("");
+        if (sanitized.isEmpty()) {
+            return;
+        }
+        try {
+            column.put(key, Long.parseLong(sanitized));

Review Comment:
   The regex pattern allows both plus and minus signs anywhere in the string, 
which could result in invalid number formats like "+-123", "--456", or 
"12-34+5" being passed to Long.parseLong, which will throw 
NumberFormatException. While the exception is caught and the sanitized string 
is stored as-is, this could still cause issues in template processing. Consider 
using a more restrictive pattern that only allows an optional leading sign, 
such as "^[+-]?\\d+$" with replaceAll("[^0-9]", "") for the numeric part, or 
validate the sanitized string format before parsing.



##########
kernel/data-pipeline/dialect/postgresql/src/test/java/org/apache/shardingsphere/data/pipeline/postgresql/sqlbuilder/template/PostgreSQLPipelineFreemarkerManagerTest.java:
##########
@@ -20,28 +20,47 @@
 import org.junit.jupiter.api.Test;
 
 import java.util.Collections;
-import java.util.HashMap;
+import java.util.LinkedHashMap;
 import java.util.Map;
 
-import static org.hamcrest.CoreMatchers.is;
 import static org.hamcrest.MatcherAssert.assertThat;
-import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.hamcrest.CoreMatchers.is;
 
 class PostgreSQLPipelineFreemarkerManagerTest {
     
     @Test
-    void assertGetSQLByDefaultVersion() {
-        String actual = 
PostgreSQLPipelineFreemarkerManager.getSQLByVersion(Collections.singletonMap("databaseName",
 "foo_db"), "component/table/%s/get_database_id.ftl", 10, 0);
-        String expected = "\n" + "SELECT oid AS did, datlastsysoid FROM 
pg_catalog.pg_database WHERE datname = 'foo_db';" + "\n";
-        assertThat(actual, is(expected));
-    }
-    
-    @Test
-    void assertGetSQLByVersion() {
-        Map<String, Object> dataModel = new HashMap<>(2, 1F);
-        dataModel.put("tid", 1);
-        dataModel.put("tname", "foo_tb l");
-        String actual = 
PostgreSQLPipelineFreemarkerManager.getSQLByVersion(dataModel, 
"component/table/%s/get_columns_for_table.ftl", 10, 0);
-        assertNotNull(actual);
+    void assertCreateTableTemplateHandlesHumanFormattedSequenceNumbers() {
+        Map<String, Object> column = new LinkedHashMap<>(16, 1F);
+        column.put("name", "id");
+        column.put("cltype", "int4");
+        column.put("displaytypname", "integer");
+        column.put("attnotnull", true);
+        column.put("attidentity", "a");
+        column.put("colconstype", "i");
+        column.put("seqincrement", 1L);
+        column.put("seqstart", 1L);
+        column.put("seqmin", 1L);
+        column.put("seqmax", 2147483647L);
+        column.put("seqcache", 1L);
+        column.put("seqcycle", false);
+        Map<String, Object> dataModel = new LinkedHashMap<>(8, 1F);
+        dataModel.put("schema", "public");
+        dataModel.put("name", "t_order");
+        dataModel.put("columns", Collections.singletonList(column));
+        dataModel.put("primary_key", Collections.emptyList());
+        dataModel.put("unique_constraint", Collections.emptyList());
+        dataModel.put("foreign_key", Collections.emptyList());
+        dataModel.put("check_constraint", Collections.emptyList());
+        dataModel.put("exclude_constraint", Collections.emptyList());
+        dataModel.put("coll_inherits", Collections.emptyList());
+        dataModel.put("autovacuum_enabled", "x");
+        dataModel.put("toast_autovacuum_enabled", "x");
+        dataModel.put("autovacuum_custom", false);
+        dataModel.put("toast_autovacuum", false);
+        dataModel.put("add_vacuum_settings_in_sql", false);
+        String sql = 
PostgreSQLPipelineFreemarkerManager.getSQLByVersion(dataModel, 
"component/table/%s/create.ftl", 12, 0);
+        String compactSql = sql.replaceAll("\\s+", "");
+        String expectedSql = 
"CREATETABLEIFNOTEXISTSpublic.t_order(idintegerNOTNULLGENERATEDALWAYSASIDENTITY(INCREMENT1START1MINVALUE1MAXVALUE2147483647CACHE1))";
+        assertThat(compactSql, is(expectedSql));
     }

Review Comment:
   The test name suggests it tests handling of human-formatted sequence numbers 
(like "2,147,483,647" with grouping separators), but the test actually uses 
already-normalized Long values (1L, 2147483647L). This doesn't test the 
normalization logic at all - it only tests the template rendering with 
pre-normalized values. Consider either renaming the test to reflect what it 
actually tests (e.g., 
"assertCreateTableTemplateRendersSequenceNumbersCorrectly") or modifying the 
test to use human-formatted string values and verify they are properly handled 
through the full pipeline including normalization.



##########
kernel/data-pipeline/dialect/postgresql/src/test/java/org/apache/shardingsphere/data/pipeline/postgresql/sqlbuilder/ddl/column/PostgreSQLColumnPropertiesAppenderTest.java:
##########
@@ -237,6 +237,29 @@ void assertAppendCopiesInheritedFromTable() {
         assertThat(getSingleColumn(context).get("inheritedfromtable"), 
is("parent_table"));
     }
     
+    @Test
+    void assertNormalizeSequenceValuesRemovesGroupingSeparator() {
+        Map<String, Object> column = createTextColumnWithName("id");
+        column.put("attidentity", "a");
+        column.put("colconstype", "i");
+        column.put("seqincrement", "1");
+        column.put("seqstart", "1");
+        column.put("seqmin", "1");
+        column.put("seqmax", "2,147,483,647");
+        column.put("seqcache", "1");
+        Map<String, Object> context = new LinkedHashMap<>();
+        when(templateExecutor.executeByTemplate(context, 
"component/table/%s/get_columns_for_table.ftl")).thenReturn(Collections.emptyList());
+        when(templateExecutor.executeByTemplate(context, 
"component/columns/%s/properties.ftl")).thenReturn(Collections.singletonList(column));
+        when(templateExecutor.executeByTemplate(anyMap(), 
eq("component/columns/%s/edit_mode_types_multi.ftl"))).thenReturn(Collections.emptyList());
+        appender.append(context);
+        Map<String, Object> singleColumn = getSingleColumn(context);
+        assertThat(singleColumn.get("seqincrement"), is(1L));
+        assertThat(singleColumn.get("seqstart"), is(1L));
+        assertThat(singleColumn.get("seqmin"), is(1L));
+        assertThat(singleColumn.get("seqmax"), is(2147483647L));
+        assertThat(singleColumn.get("seqcache"), is(1L));
+    }

Review Comment:
   The test only covers the happy path with comma-separated numbers. Consider 
adding test cases for edge cases such as: values that are already Numbers 
(should not be modified), null values (should be ignored), empty strings 
(current behavior leaves them as-is), values with only non-numeric characters 
(current behavior leaves them as-is), and values with multiple signs like 
"+-123" (would fail parsing and be stored as sanitized string). These edge 
cases would help ensure robust handling of malformed input data.



##########
kernel/data-pipeline/dialect/postgresql/src/test/java/org/apache/shardingsphere/data/pipeline/postgresql/sqlbuilder/template/PostgreSQLPipelineFreemarkerManagerTest.java:
##########
@@ -20,28 +20,47 @@
 import org.junit.jupiter.api.Test;
 
 import java.util.Collections;
-import java.util.HashMap;
+import java.util.LinkedHashMap;
 import java.util.Map;
 
-import static org.hamcrest.CoreMatchers.is;
 import static org.hamcrest.MatcherAssert.assertThat;
-import static org.junit.jupiter.api.Assertions.assertNotNull;
+import static org.hamcrest.CoreMatchers.is;
 
 class PostgreSQLPipelineFreemarkerManagerTest {
     
     @Test
-    void assertGetSQLByDefaultVersion() {
-        String actual = 
PostgreSQLPipelineFreemarkerManager.getSQLByVersion(Collections.singletonMap("databaseName",
 "foo_db"), "component/table/%s/get_database_id.ftl", 10, 0);
-        String expected = "\n" + "SELECT oid AS did, datlastsysoid FROM 
pg_catalog.pg_database WHERE datname = 'foo_db';" + "\n";
-        assertThat(actual, is(expected));
-    }
-    
-    @Test
-    void assertGetSQLByVersion() {
-        Map<String, Object> dataModel = new HashMap<>(2, 1F);
-        dataModel.put("tid", 1);
-        dataModel.put("tname", "foo_tb l");
-        String actual = 
PostgreSQLPipelineFreemarkerManager.getSQLByVersion(dataModel, 
"component/table/%s/get_columns_for_table.ftl", 10, 0);
-        assertNotNull(actual);
+    void assertCreateTableTemplateHandlesHumanFormattedSequenceNumbers() {
+        Map<String, Object> column = new LinkedHashMap<>(16, 1F);
+        column.put("name", "id");
+        column.put("cltype", "int4");
+        column.put("displaytypname", "integer");
+        column.put("attnotnull", true);
+        column.put("attidentity", "a");
+        column.put("colconstype", "i");
+        column.put("seqincrement", 1L);
+        column.put("seqstart", 1L);
+        column.put("seqmin", 1L);
+        column.put("seqmax", 2147483647L);
+        column.put("seqcache", 1L);
+        column.put("seqcycle", false);
+        Map<String, Object> dataModel = new LinkedHashMap<>(8, 1F);
+        dataModel.put("schema", "public");
+        dataModel.put("name", "t_order");
+        dataModel.put("columns", Collections.singletonList(column));
+        dataModel.put("primary_key", Collections.emptyList());
+        dataModel.put("unique_constraint", Collections.emptyList());
+        dataModel.put("foreign_key", Collections.emptyList());
+        dataModel.put("check_constraint", Collections.emptyList());
+        dataModel.put("exclude_constraint", Collections.emptyList());
+        dataModel.put("coll_inherits", Collections.emptyList());
+        dataModel.put("autovacuum_enabled", "x");
+        dataModel.put("toast_autovacuum_enabled", "x");
+        dataModel.put("autovacuum_custom", false);
+        dataModel.put("toast_autovacuum", false);
+        dataModel.put("add_vacuum_settings_in_sql", false);
+        String sql = 
PostgreSQLPipelineFreemarkerManager.getSQLByVersion(dataModel, 
"component/table/%s/create.ftl", 12, 0);
+        String compactSql = sql.replaceAll("\\s+", "");
+        String expectedSql = 
"CREATETABLEIFNOTEXISTSpublic.t_order(idintegerNOTNULLGENERATEDALWAYSASIDENTITY(INCREMENT1START1MINVALUE1MAXVALUE2147483647CACHE1))";
+        assertThat(compactSql, is(expectedSql));
     }

Review Comment:
   The test replaces two existing tests (assertGetSQLByDefaultVersion and 
assertGetSQLByVersion) that tested different aspects of the SQL generation. The 
removed tests covered scenarios like getting database ID and getting columns 
for a table with special characters in the name. While the new test focuses on 
sequence value handling, it doesn't cover these other scenarios. Consider 
whether the removed test coverage should be preserved in separate test methods 
to maintain comprehensive testing of the PostgreSQLPipelineFreemarkerManager 
functionality.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] Normalize sequence values to machine format before templating [shardingsphere]

Reply via email to