[ 
https://issues.apache.org/jira/browse/FLINK-38839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tchivs updated FLINK-38839:
---------------------------
    Description: 
Problem
The transform rule table-options currently uses a comma to separate key/value 
pairs:

table-options: key1=value1,key2=value2

This format cannot represent options whose values contain commas, e.g. 
multi-value options:

table-options: sequence.field=gxsj,jjsj,file-index.range-bitmap.columns=jjsj

The parser incorrectly splits sequence.field=gxsj,jjsj into two parts and 
either fails or produces wrong options.

Reproduction
Use a transform rule with table-options where a value contains a comma:

transform:
  - source-table: mydb.my_table
    projection: "*"
    table-options: sequence.field=gxsj,jjsj,file-index.bloom-filter.columns=jjdbh

Expected: sequence.field keeps the value gxsj,jjsj.
Actual: parsing splits on the comma and breaks the option value.

Proposal
Support semicolon (;) as an additional delimiter between key/value pairs while 
keeping the existing comma format for backward compatibility:

- If the string contains ;, split pairs by ;
- Otherwise split pairs by , (existing behavior)
- Split key=value by the first = only (split("=", 2))

Example with comma-in-value:

table-options: sequence.field=gxsj,jjsj;file-index.range-bitmap.columns=jjsj;file-index.bloom-filter.columns=jjdbh

Compatibility
- Backward compatible: existing key1=value1,key2=value2 keeps working.
- New format is opt-in: users only need ; when option values include commas.

Tests
- Add unit tests for comma delimiter, semicolon delimiter, comma-in-value, and = 
in value.

Docs
- Update transform docs (EN/ZH) to document semicolon delimiter when values 
contain commas.

  was:

## Problem
The transform rule `table-options` currently uses a comma to separate key/value 
pairs:

```yaml
table-options: key1=value1,key2=value2
```

This format cannot represent options whose **values contain commas**, e.g. 
multi-value options:

```yaml
table-options: sequence.field=gxsj,jjsj,file-index.range-bitmap.columns=jjsj
```

The parser will incorrectly split `sequence.field=gxsj,jjsj` into two parts and 
fail or produce wrong options.

## Reproduction
Use a transform rule with `table-options` where a value contains a comma:

```yaml
transform:
  - source-table: mydb.my_table
    projection: "*"
    table-options: sequence.field=gxsj,jjsj,file-index.bloom-filter.columns=jjdbh
```

Expected: `sequence.field` keeps value `gxsj,jjsj`.
Actual: parsing splits by comma and breaks the option value.
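
The failure mode can be illustrated with a minimal sketch. It assumes the existing parser splits pairs on `,` and then splits each pair on `=`; the class `NaiveTableOptionsSplit` and the method `parseCommaOnly` are hypothetical names for illustration, not the actual Flink CDC code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for comma-only parsing; not the actual Flink CDC implementation.
class NaiveTableOptionsSplit {

    // Splits pairs on ',' and each pair on '=' (assumed current behavior).
    static Map<String, String> parseCommaOnly(String tableOptions) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String pair : tableOptions.split(",")) {
            String[] kv = pair.split("=");
            if (kv.length != 2) {
                throw new IllegalArgumentException(
                        "Malformed table option: '" + pair + "'");
            }
            options.put(kv[0].trim(), kv[1].trim());
        }
        return options;
    }

    public static void main(String[] args) {
        String raw = "sequence.field=gxsj,jjsj,file-index.bloom-filter.columns=jjdbh";
        try {
            System.out.println(parseCommaOnly(raw));
        } catch (IllegalArgumentException e) {
            // The fragment "jjsj" carries no '=', so it cannot be read as a pair.
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the second fragment, `jjsj`, is rejected because it is not a `key=value` pair. A more lenient parser could instead silently keep `sequence.field=gxsj` and drop the rest, which matches the "wrong options" outcome described above.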

## Proposal
Support **semicolon (`;`)** as an additional delimiter between key/value pairs 
while keeping the existing comma format for backward compatibility:

- If the string contains `;`, split pairs by `;`
- Otherwise split pairs by `,` (existing behavior)
- Split `key=value` by the **first** `=` only (`split("=", 2)`)

Example with comma-in-value:

```yaml
table-options: sequence.field=gxsj,jjsj;file-index.range-bitmap.columns=jjsj;file-index.bloom-filter.columns=jjdbh
```
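
The three rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `TableOptionsParser` and `parse` are hypothetical names, and trimming whitespace and skipping blank segments are extra choices that the proposal does not specify.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the proposed delimiter rules; names are illustrative only.
class TableOptionsParser {

    static Map<String, String> parse(String tableOptions) {
        Map<String, String> options = new LinkedHashMap<>();
        if (tableOptions == null || tableOptions.trim().isEmpty()) {
            return options;
        }
        // Rule 1 and 2: use ';' if it appears anywhere, otherwise fall back to ','.
        String delimiter = tableOptions.contains(";") ? ";" : ",";
        for (String pair : tableOptions.split(delimiter)) {
            if (pair.trim().isEmpty()) {
                continue; // assumption: blank segments are ignored
            }
            // Rule 3: split on the first '=' only, so values may contain '='.
            String[] kv = pair.split("=", 2);
            if (kv.length != 2) {
                throw new IllegalArgumentException(
                        "Malformed table option: '" + pair.trim() + "'");
            }
            options.put(kv[0].trim(), kv[1].trim());
        }
        return options;
    }

    public static void main(String[] args) {
        System.out.println(parse("key1=value1,key2=value2"));
        System.out.println(parse(
                "sequence.field=gxsj,jjsj;file-index.range-bitmap.columns=jjsj"));
        System.out.println(parse("some.key=a=b"));
    }
}
```

One consequence of choosing the delimiter per string is that a value containing `;` cannot be expressed in either format; the proposal does not address that case, and this sketch does not either.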

## Compatibility
- Backward compatible: existing `key1=value1,key2=value2` keeps working.
- New format is opt-in: users only need `;` when option values include commas.

## Tests
- Add unit tests for comma delimiter, semicolon delimiter, comma-in-value, and 
`=` in value.

## Docs
- Update transform docs (EN/ZH) to document semicolon delimiter when values 
contain commas.




> Support semicolon delimiter for transform table-options
> -------------------------------------------------------
>
>                 Key: FLINK-38839
>                 URL: https://issues.apache.org/jira/browse/FLINK-38839
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>            Reporter: tchivs
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
