This is an automated email from the ASF dual-hosted git repository.

hope pushed a commit to branch release-1.4
in repository https://gitbox.apache.org/repos/asf/paimon.git

commit 94222cffafa7c3681df3dfcccd3ac318899108b5
Author: JingsongLi <[email protected]>
AuthorDate: Thu Mar 26 18:05:50 2026 +0800

    [doc] Modify scenario-guide to changelog-producer
---
 docs/content/learn-paimon/scenario-guide.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/content/learn-paimon/scenario-guide.md b/docs/content/learn-paimon/scenario-guide.md
index 9cb7c76bb6..9fe0f13dff 100644
--- a/docs/content/learn-paimon/scenario-guide.md
+++ b/docs/content/learn-paimon/scenario-guide.md
@@ -83,8 +83,10 @@ CREATE TABLE orders (
   This mode gives you the best balance of write and read performance. Compared to the default MOR mode, MOW avoids merging
   at read time, which greatly improves OLAP query performance.
 - **`changelog-producer = lookup`**: Generates a complete [changelog]({{< ref "primary-key-table/changelog-producer#lookup" >}})
-  for downstream streaming consumers. If no downstream streaming read is needed, you can omit this to save
-  compaction resources.
+  for downstream streaming consumers. If your CDC source is directly connected to a database (e.g., MySQL CDC, Postgres CDC),
+  you can use `changelog-producer = input` instead, since the database CDC stream already provides a complete changelog.
+  However, if your CDC source comes from Kafka (or other message queues), `input` may not be reliable — use `lookup` to
+  ensure changelog correctness. If no downstream streaming read is needed, you can omit this to save compaction resources.
 - **`sequence.field = update_time`**: Guarantees correct update ordering even when data arrives out of order.
 - **Bucketing**: Use the default Dynamic Bucket (`bucket = -1`). The system automatically adjusts bucket count based on data
   volume. If you are sensitive to data visibility latency, set a fixed bucket number (e.g. `'bucket' = '5'`)
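
For readers of this notification, the options discussed in the hunk above might be combined roughly as follows. This is a minimal Flink SQL sketch and not part of the commit: the column list is hypothetical, and only the option keys and values named in the diff (`changelog-producer`, `sequence.field`, `bucket`) are taken from it.

```sql
-- Hypothetical columns for illustration; the real `orders` schema is not shown in this diff.
CREATE TABLE orders (
    order_id    BIGINT,
    amount      DECIMAL(10, 2),
    update_time TIMESTAMP(3),
    PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
    -- 'lookup' generates a complete changelog for downstream streaming readers.
    -- Per the updated guide text, 'input' is an alternative when the CDC source
    -- is connected directly to a database (e.g., MySQL CDC, Postgres CDC).
    -- The option can be omitted when no downstream streaming read is needed.
    'changelog-producer' = 'lookup',
    -- Orders updates by update_time even when records arrive out of order.
    'sequence.field' = 'update_time',
    -- Default Dynamic Bucket; a fixed number such as '5' is the alternative
    -- the guide mentions for latency-sensitive cases.
    'bucket' = '-1'
);
```

Following the guide's wording, the choice between `lookup` and `input` depends on whether the upstream already delivers a complete changelog, which is why a message-queue source such as Kafka falls back to `lookup`.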
