[ https://issues.apache.org/jira/browse/FLINK-38839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
tchivs updated FLINK-38839:
---------------------------
Description:
Problem
The transform rule option table-options currently uses a comma to separate
key/value pairs:
table-options: key1=value1,key2=value2
This format cannot represent options whose values contain commas, e.g.
multi-value options:
table-options: sequence.field=gxsj,jjsj,file-index.range-bitmap.columns=jjsj
The parser splits sequence.field=gxsj,jjsj at the comma inside the value, so it
either fails or produces wrong options.
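The failure mode can be illustrated with a minimal sketch. It assumes the current parser splits on every comma; the class and method names (CommaSplitDemo, naiveSplit) are hypothetical and are not taken from the Flink CDC code base.

```java
import java.util.Arrays;
import java.util.List;

final class CommaSplitDemo {

    /** Splits on every comma, as a comma-only parser is assumed to do. */
    static List<String> naiveSplit(String raw) {
        return Arrays.asList(raw.split(","));
    }

    public static void main(String[] args) {
        // The value "gxsj,jjsj" is cut in half: the second fragment, "jjsj",
        // has no '=' and therefore cannot be read as a key/value pair.
        List<String> parts =
                naiveSplit("sequence.field=gxsj,jjsj,file-index.range-bitmap.columns=jjsj");
        for (String part : parts) {
            System.out.println(part);
        }
    }
}
```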
Reproduction
Use a transform rule with table-options where a value contains a comma:
transform:
  - source-table: mydb.my_table
    projection: "*"
    table-options: sequence.field=gxsj,jjsj,file-index.bloom-filter.columns=jjdbh
Expected: sequence.field keeps value gxsj,jjsj.
Actual: parsing splits by comma and breaks the option value.
Proposal
Support semicolon (;) as an additional delimiter between key/value pairs while
keeping the existing comma format for backward compatibility:
- If the string contains ;, split pairs by ;
- Otherwise split pairs by , (existing behavior)
- Split each key=value pair at the first = only (split("=", 2)), so that values
  may themselves contain =
Example with comma-in-value:
table-options: sequence.field=gxsj,jjsj;file-index.range-bitmap.columns=jjsj;file-index.bloom-filter.columns=jjdbh
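The three rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Flink CDC implementation: the class name TableOptionsSketch, the parse method, the trimming of whitespace, and the choice to throw on a pair without = are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TableOptionsSketch {

    /** Parses "k1=v1;k2=v2" or legacy "k1=v1,k2=v2" into an insertion-ordered map. */
    static Map<String, String> parse(String raw) {
        Map<String, String> options = new LinkedHashMap<>();
        if (raw == null || raw.trim().isEmpty()) {
            return options;
        }
        // Semicolon takes precedence when present; otherwise use the legacy comma format.
        String delimiter = raw.contains(";") ? ";" : ",";
        for (String pair : raw.split(delimiter)) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate empty fragments, e.g. from a doubled delimiter
            }
            // Limit 2: only the first '=' separates key from value.
            String[] kv = trimmed.split("=", 2);
            if (kv.length != 2 || kv[0].trim().isEmpty()) {
                // Assumption: malformed pairs are rejected rather than skipped.
                throw new IllegalArgumentException("Invalid table option: " + trimmed);
            }
            options.put(kv[0].trim(), kv[1].trim());
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> parsed = parse(
                "sequence.field=gxsj,jjsj;file-index.range-bitmap.columns=jjsj;"
                        + "file-index.bloom-filter.columns=jjdbh");
        parsed.forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

One consequence of the "contains ;" rule worth covering in tests: a legacy comma-delimited string that happens to carry a semicolon inside a value would be split by ; under this sketch.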
Compatibility
Backward compatible: existing key1=value1,key2=value2 keeps working.
New format is opt-in: users only need ; when option values include commas.
Tests
Add unit tests for comma delimiter, semicolon delimiter, comma-in-value, and =
in value.
Docs
Update transform docs (EN/ZH) to document semicolon delimiter when values
contain commas.
was:
## Problem
The transform rule option `table-options` currently uses a comma to separate
key/value pairs:
```yaml
table-options: key1=value1,key2=value2
```
This format cannot represent options whose **values contain commas**, e.g.
multi-value options:
```yaml
table-options: sequence.field=gxsj,jjsj,file-index.range-bitmap.columns=jjsj
```
The parser will incorrectly split `sequence.field=gxsj,jjsj` into two parts and
fail or produce wrong options.
## Reproduction
Use a transform rule with `table-options` where a value contains a comma:
```yaml
transform:
  - source-table: mydb.my_table
    projection: "*"
    table-options: sequence.field=gxsj,jjsj,file-index.bloom-filter.columns=jjdbh
```
Expected: `sequence.field` keeps value `gxsj,jjsj`.
Actual: parsing splits by comma and breaks the option value.
## Proposal
Support **semicolon (`;`)** as an additional delimiter between key/value pairs
while keeping the existing comma format for backward compatibility:
- If the string contains `;`, split pairs by `;`
- Otherwise split pairs by `,` (existing behavior)
- Split `key=value` by the **first** `=` only (`split("=", 2)`)
Example with comma-in-value:
```yaml
table-options: sequence.field=gxsj,jjsj;file-index.range-bitmap.columns=jjsj;file-index.bloom-filter.columns=jjdbh
```
## Compatibility
- Backward compatible: existing `key1=value1,key2=value2` keeps working.
- New format is opt-in: users only need `;` when option values include commas.
## Tests
- Add unit tests for comma delimiter, semicolon delimiter, comma-in-value, and
`=` in value.
## Docs
- Update transform docs (EN/ZH) to document semicolon delimiter when values
contain commas.
> Support semicolon delimiter for transform table-options
> -------------------------------------------------------
>
> Key: FLINK-38839
> URL: https://issues.apache.org/jira/browse/FLINK-38839
> Project: Flink
> Issue Type: Improvement
> Components: Flink CDC
> Reporter: tchivs
> Priority: Major