tchivs created FLINK-38196:
------------------------------

             Summary: IndexOutOfBoundsException when processing PostgreSQL 
tables with numeric(0) fields in Flink CDC
                 Key: FLINK-38196
                 URL: https://issues.apache.org/jira/browse/FLINK-38196
             Project: Flink
          Issue Type: Bug
          Components: Flink CDC
    Affects Versions: cdc-3.4.0
         Environment: - **Database**: PostgreSQL with tables containing 
`numeric(0)` columns
- **Connector**: PostgreSQL CDC Connector
- **Configuration**: Any `debezium.decimal.handling.mode` setting
            Reporter: tchivs


### Problem Statement
Flink CDC fails with `IndexOutOfBoundsException` when processing PostgreSQL 
tables containing `numeric(0)` fields, particularly when these fields have NULL 
values. This causes complete pipeline failure during binary decimal data 
deserialization.

### Error Details
```
Caused by: java.lang.IndexOutOfBoundsException: pos: -2130706416, length: 48, index: -2130706432, offset: 0
    at org.apache.flink.core.memory.MemorySegment.get(MemorySegment.java:467)
    at org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.copyToBytes(BinarySegmentUtils.java:131)
    at org.apache.flink.cdc.common.data.binary.BinarySegmentUtils.readDecimalData(BinarySegmentUtils.java:1003)
    at org.apache.flink.cdc.common.data.binary.BinaryRecordData.getDecimal(BinaryRecordData.java:163)
    at org.apache.flink.cdc.common.data.RecordData.lambda$createFieldGetter$7b8ca8ef$1(RecordData.java:195)
```

### Steps to Reproduce
1. Create a PostgreSQL table with `numeric(0)` columns:
   ```sql
   CREATE TABLE test_table (
       id SERIAL PRIMARY KEY,
       amount numeric(0),  -- This causes the issue
       name VARCHAR(100)
   );
   ```

2. Insert data including NULL values:
   ```sql
   INSERT INTO test_table (amount, name) VALUES (NULL, 'test');
   INSERT INTO test_table (amount, name) VALUES (123, 'test2');
   ```

3. Configure Flink CDC pipeline:
   ```yaml
   source:
     type: postgres
     hostname: localhost
     port: 5432
     username: postgres
     password: password
     database-name: testdb
     schema-name: public
     table-name: test_table
     debezium.decimal.handling.mode: string
   ```

4. Run the pipeline. It crashes with `IndexOutOfBoundsException`.

### Expected Result
- Pipeline should successfully process all rows including those with NULL 
`numeric(0)` values
- No exceptions should be thrown during data processing

### Actual Result
- Pipeline crashes with `IndexOutOfBoundsException`
- Processing stops completely, preventing any data from being synchronized

### Root Cause Analysis
1. **Type Mapping Issue**: PostgreSQL `numeric(0)` fields are incorrectly 
mapped to `DECIMAL` with maximum precision in `PostgresTypeUtils.java`
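2. **Binary Serialization Problem**: High-precision DECIMAL values use a 
variable-length binary representation that is read via an offset and size (the 
`readDecimalData` -> `copyToBytes` path in the stack trace), and this path 
fails when the offset is invalid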
3. **Missing Validation**: `BinarySegmentUtils.readDecimalData()` lacks bounds 
checking for invalid binary data, allowing negative memory offsets

### Impact Assessment
- **Severity**: Major - Complete pipeline failure
- **Scope**: Affects any PostgreSQL database using `numeric(0)` columns
- **Workaround**: None available other than modifying the table schema

### Proposed Solution
1. Map PostgreSQL `numeric(0)` fields to `BIGINT` instead of `DECIMAL` to avoid 
complex binary serialization
2. Add defensive bounds checking in `BinarySegmentUtils.readDecimalData()` 
3. Handle edge cases gracefully by returning zero values instead of crashing
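
Item 1 could look roughly like the sketch below. It is only an illustration: `NumericMappingSketch` and `mapNumeric` are hypothetical names, the type is returned as a string rather than a Flink CDC `DataType`, and the limits (38 as the maximum decimal precision, 18 digits as the range that always fits in a 64-bit integer) are stated assumptions rather than values taken from `PostgresTypeUtils.java`.

```java
/** Hypothetical mapping sketch for illustration only; not part of Flink CDC. */
final class NumericMappingSketch {

    // Assumption: maximum precision supported by the target DECIMAL type.
    static final int MAX_DECIMAL_PRECISION = 38;

    // Any integer with at most 18 decimal digits fits in a signed 64-bit long.
    static final int MAX_BIGINT_SAFE_DIGITS = 18;

    /** Maps a PostgreSQL numeric column to a target type name. */
    static String mapNumeric(int precision, int scale) {
        if (precision < 0 || scale < 0) {
            throw new IllegalArgumentException("precision and scale must be non-negative");
        }
        // Whole-number columns, including the precision-0 case from this issue,
        // go to BIGINT instead of a maximum-precision DECIMAL.
        if (scale == 0 && precision <= MAX_BIGINT_SAFE_DIGITS) {
            return "BIGINT";
        }
        // Everything else stays DECIMAL, clamped to the supported range.
        int p = Math.min(Math.max(precision, 1), MAX_DECIMAL_PRECISION);
        int s = Math.min(scale, p);
        return "DECIMAL(" + p + ", " + s + ")";
    }
}
```

Under these assumptions `mapNumeric(0, 0)` yields `BIGINT`, while `mapNumeric(30, 0)` stays `DECIMAL(30, 0)` because 30 digits do not fit in a 64-bit integer. A real fix would also need to decide what happens when a precision-0 column holds a value outside the `long` range.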

### Additional Context
- This issue commonly occurs in PostgreSQL databases where `numeric(0)` is used 
to store whole numbers without decimal places
- The problem is exacerbated when fields are nullable, as NULL values create 
invalid binary offset calculations
- Different `debezium.decimal.handling.mode` settings (string, double, precise) 
all exhibit the same issue

### Test Case Requirements
- Unit tests for `PostgresTypeUtils` covering `numeric(0)` mapping
- Unit tests for `BinarySegmentUtils` defensive bounds checking  
- Integration test with actual PostgreSQL table containing `numeric(0)` fields

### Documentation Impact
- Update connector documentation to clarify `numeric(0)` field handling
- Add a troubleshooting section for `IndexOutOfBoundsException` issues



--
This message was sent by Atlassian Jira
(v8.20.10#820010)