nzw921rx commented on issue #10596: URL: https://github.com/apache/seatunnel/issues/10596#issuecomment-4056083773
Hi @davidzollo, I'd like to take on this issue.

### About me

I've been working with SeaTunnel in production, building custom Transform and Sink plugins for our data pipeline, including:

- A multi-table-aware **DataSnapshot Transform** (extends `AbstractMultiCatalogMapTransform`)
- A **DataDiff Sink** (implements `SupportMultiTableSink`)
- **KMS encryption** and **data guard** transforms on top of CDC sources (`MySQL-CDC`)

### My understanding of this issue

Add a `sample-sharding.enable` (CDC) / `split.sample-sharding.enable` (JDBC) boolean option (default `true`) to let users explicitly disable sampling-based sharding. When set to `false`, the system falls back to unevenly-sized chunk splitting regardless of shard count.

Key changes would be in:

| Module | File | Change |
|--------|------|--------|
| CDC | `JdbcSourceOptions` / `BaseSourceConfig` | Add the new option |
| JDBC | `JdbcSourceOptions` / `JdbcSourceConfig` | Add the new option |
| CDC | `AbstractJdbcSourceChunkSplitter.evenlyColumnSplitChunks()` | Guard the sampling path |
| JDBC | `DynamicChunkSplitter.evenlyColumnSplitChunks()` | Same guard |

I'll also add corresponding **unit tests** and update the **documentation**.

Could you please assign this issue to me? Thanks!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
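For reference, the guard proposed in the comment above could look roughly like the sketch below. It is a minimal illustration, not SeaTunnel's actual code: the class `ChunkStrategySketch`, the `Strategy` enum, and the `choose` method are hypothetical names, and the comparison against a sharding threshold is an assumption about how the sampling path is currently selected.

```java
// Hypothetical sketch of the proposed guard; none of these names are SeaTunnel APIs.
final class ChunkStrategySketch {

    /** Splitting strategies for a column that is not evenly distributed. */
    enum Strategy { SAMPLE_SHARDING, UNEVENLY_SIZED }

    /**
     * Picks the splitting strategy.
     *
     * @param sampleShardingEnabled value of the proposed boolean option (default true)
     * @param shardCount            estimated number of shards for the table
     * @param threshold             assumed shard-count threshold above which sampling is used
     */
    static Strategy choose(boolean sampleShardingEnabled, long shardCount, long threshold) {
        // The new option short-circuits the sampling path regardless of shard count.
        if (sampleShardingEnabled && shardCount > threshold) {
            return Strategy.SAMPLE_SHARDING;
        }
        return Strategy.UNEVENLY_SIZED;
    }
}
```

With the option left at its default, a table with 5000 estimated shards and a threshold of 1000 would still take the sampling path in this sketch; setting the option to `false` would select unevenly-sized chunk splitting for the same table.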
