[
https://issues.apache.org/jira/browse/FLINK-38196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18063194#comment-18063194
]
Leonard Xu commented on FLINK-38196:
------------------------------------
Thanks [~loserwang1024] for the detailed analysis; +1 to throwing an exception
instead of emitting an incorrect result. We may need to support a variable
DECIMAL type for this use case in the future.
> IndexOutOfBoundsException when processing PostgreSQL tables with numeric(0)
> fields in Flink CDC
> -----------------------------------------------------------------------------------------------
>
> Key: FLINK-38196
> URL: https://issues.apache.org/jira/browse/FLINK-38196
> Project: Flink
> Issue Type: Bug
> Components: Flink CDC
> Affects Versions: cdc-3.5.0
> Environment: - **Database**: PostgreSQL with tables containing
> `numeric(0)` columns
> - **Connector**: PostgreSQL CDC Connector
> - **Configuration**: Any `debezium.decimal.handling.mode` setting
> Reporter: tchivs
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2025-08-06-16-14-20-192.png,
> image-2025-08-06-18-01-05-115.png, image-2026-03-05-16-28-07-968.png,
> image-2026-03-05-16-51-11-585.png
>
>
> h3. Problem Statement
> Flink CDC fails with {{IndexOutOfBoundsException}} when processing PostgreSQL
> tables containing {{numeric(0)}} fields, particularly when these fields have
> NULL values. This causes complete pipeline failure during binary decimal data
> deserialization.
> h3. Error Details
> {{Caused by: java.lang.IndexOutOfBoundsException: pos: -2130706416, length: 48, index: -2130706432, offset: 0
> at org.apache.flink.core.memory.MemorySegment.get(MemorySegment.java:467)
> at org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:131)
> at org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.readDecimalData(BinarySegmentUtils.java:1003)
> at org.apache.flink.cdc.common.data.binary.BinaryRecordData.getDecimal(BinaryRecordData.java:163)
> at org.apache.flink.cdc.common.data.RecordData.lambda$createFieldGetter$7b8ca8ef$1(RecordData.java:195)}}
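One plausible reading of the numbers in that trace: a raw 8-byte decimal payload is being misinterpreted as an offset-and-size pointer. A minimal sketch, assuming Flink's usual binary-row layout for variable-length fields (offset in the high 32 bits of a long, length in the low 32 bits); the class name and the constant below are illustrative, not taken from a real segment:

```java
public class OffsetAndSizeDecode {
    // Hypothetical decode, following Flink's usual binary-row layout:
    // a variable-length field stores its offset in the high 32 bits of
    // a long and its length in the low 32 bits.
    static int offsetOf(long offsetAndSize) {
        return (int) (offsetAndSize >> 32);
    }

    static int sizeOf(long offsetAndSize) {
        return (int) offsetAndSize;
    }

    public static void main(String[] args) {
        // Illustrative constant only: an 8-byte non-pointer payload with
        // its top bit set decodes to the values seen in the trace above.
        long misread = 0x8100000000000030L;
        System.out.println("offset = " + offsetOf(misread)); // -2130706432
        System.out.println("length = " + sizeOf(misread));   // 48
    }
}
```

The decoded offset (-2130706432) and length (48) match the {{index}} and {{length}} in the reported exception, which is consistent with the root-cause analysis below.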
> h3. Steps to Reproduce
> # Create a PostgreSQL table with {{numeric(0)}} columns:
> {{CREATE TABLE test_table (
>   id SERIAL PRIMARY KEY,
>   amount numeric,  -- unconstrained numeric; this causes the issue
>   name VARCHAR(100)
> );}}
> # Insert data including NULL values:
> {{INSERT INTO test_table (amount, name) VALUES (NULL, 'test');
> INSERT INTO test_table (amount, name) VALUES (123, 'test2');}}
> # Configure Flink CDC pipeline:
> {{source:
>   type: postgres
>   hostname: localhost
>   port: 5432
>   username: postgres
>   password: password
>   database-name: testdb
>   schema-name: public
>   table-name: test_table
>   debezium.decimal.handling.mode: string}}
> # Run the pipeline - it will crash with IndexOutOfBoundsException
> h3. Expected Result
> * Pipeline should successfully process all rows including those with NULL
> {{numeric(0)}} values
> * No exceptions should be thrown during data processing
> h3. Actual Result
> * Pipeline crashes with {{IndexOutOfBoundsException}}
> * Processing stops completely, preventing any data from being synchronized
> h3. Root Cause Analysis
> # {*}Type Mapping Issue{*}: PostgreSQL {{numeric(0)}} fields are incorrectly
> mapped to {{DECIMAL}} with maximum precision in {{PostgresTypeUtils.java}}
> # {*}Binary Serialization Problem{*}: High-precision DECIMAL types create
> complex binary representations that are prone to corruption
> # {*}Missing Validation{*}: {{BinarySegmentUtils.readDecimalData()}} lacks
> bounds checking for invalid binary data, allowing negative memory offsets
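As a sketch of the missing validation in the third point: a defensive check of the following shape would fail fast on a corrupt offset-and-size word instead of reading out of bounds. All names here are hypothetical simplifications; the real {{BinarySegmentUtils.readDecimalData()}} operates on {{MemorySegment}}s, reduced to a plain byte array for illustration:

```java
import java.util.Arrays;

public class DecimalBoundsCheck {
    // Hypothetical simplification of BinarySegmentUtils.readDecimalData():
    // decode the offset-and-size long, validate it against the backing
    // storage, then copy the decimal's unscaled bytes.
    static byte[] readDecimalBytes(byte[] segment, long offsetAndSize) {
        final int size = (int) offsetAndSize;           // low 32 bits: length
        final int offset = (int) (offsetAndSize >> 32); // high 32 bits: offset
        if (offset < 0 || size < 0 || offset + (long) size > segment.length) {
            // Fail fast with a descriptive error instead of letting a
            // raw segment read throw IndexOutOfBoundsException.
            throw new IllegalStateException(
                    "Corrupt decimal field: offset=" + offset + ", size=" + size);
        }
        return Arrays.copyOfRange(segment, offset, offset + size);
    }

    public static void main(String[] args) {
        byte[] segment = new byte[64];
        long valid = (8L << 32) | 4L; // offset 8, length 4: in bounds
        System.out.println(readDecimalBytes(segment, valid).length); // 4
        try {
            // Illustrative corrupt word: decodes to a negative offset.
            readDecimalBytes(segment, 0x8100000000000030L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Note that the comment thread on this issue prefers surfacing such corruption as an explicit error rather than continuing with incorrect results.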
> h3. Impact Assessment
> * {*}Severity{*}: Major - Complete pipeline failure
> * {*}Scope{*}: Affects any PostgreSQL database using {{numeric(0)}} columns
> * {*}Workaround{*}: None short of modifying the table schema
> h3. Proposed Solution
> # Map PostgreSQL {{numeric(0)}} fields to {{BIGINT}} instead of {{DECIMAL}}
> to avoid complex binary serialization
> # Add defensive bounds checking in {{BinarySegmentUtils.readDecimalData()}}
> # Handle edge cases gracefully by returning zero values instead of crashing
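A minimal sketch of the first point, under the assumption that an unconstrained PostgreSQL {{numeric}} column surfaces with a reported precision of 0. The method name and surrounding API are hypothetical (the real change would live in {{PostgresTypeUtils}}), and the comment thread favors failing fast over silently substituting a type:

```java
public class NumericTypeMappingSketch {
    /**
     * Hypothetical mapping rule: a reported precision of 0 means the
     * column is an unconstrained numeric, so no valid DECIMAL(p, s)
     * exists for it. The report proposes BIGINT; mapping it to a
     * maximum-precision DECIMAL is what corrupts the binary layout.
     */
    static String mapNumeric(int precision, int scale) {
        if (precision <= 0) {
            return "BIGINT";
        }
        return "DECIMAL(" + precision + ", " + scale + ")";
    }

    public static void main(String[] args) {
        System.out.println(mapNumeric(0, 0));  // BIGINT
        System.out.println(mapNumeric(10, 2)); // DECIMAL(10, 2)
    }
}
```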
> h3. Additional Context
> * This issue commonly occurs in PostgreSQL databases where {{numeric(0)}} is
> used to store whole numbers without decimal places
> * The problem is exacerbated when fields are nullable, as NULL values create
> invalid binary offset calculations
> * Different {{debezium.decimal.handling.mode}} settings (string, double,
> precise) all exhibit the same issue
> h3. Test Case Requirements
> * Unit tests for {{PostgresTypeUtils}} covering {{numeric(0)}} mapping
> * Unit tests for {{BinarySegmentUtils}} defensive bounds checking
> * Integration test with actual PostgreSQL table containing {{numeric(0)}}
> fields
> h3. Documentation Impact
> * Update connector documentation to clarify {{numeric(0)}} field handling
> * Add troubleshooting section for IndexOutOfBoundsException issues
--
This message was sent by Atlassian Jira
(v8.20.10#820010)