This is an automated email from the ASF dual-hosted git repository.
lzljs3620320 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/paimon.git
The following commit(s) were added to refs/heads/master by this push:
new 20217886a2 [doc] Modify scenario-guide to changelog-producer
20217886a2 is described below
commit 20217886a2f1ac2931b6b8d4736fc6eafb22d74a
Author: JingsongLi <[email protected]>
AuthorDate: Thu Mar 26 18:05:50 2026 +0800
[doc] Modify scenario-guide to changelog-producer
---
docs/content/learn-paimon/scenario-guide.md | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/docs/content/learn-paimon/scenario-guide.md b/docs/content/learn-paimon/scenario-guide.md
index 9cb7c76bb6..9fe0f13dff 100644
--- a/docs/content/learn-paimon/scenario-guide.md
+++ b/docs/content/learn-paimon/scenario-guide.md
@@ -83,8 +83,10 @@ CREATE TABLE orders (
This mode gives you the best balance of write and read performance. Compared to the default MOR mode, MOW
avoids merging at read time, which greatly improves OLAP query performance.
- **`changelog-producer = lookup`**: Generates a complete [changelog]({{< ref "primary-key-table/changelog-producer#lookup" >}})
- for downstream streaming consumers. If no downstream streaming read is needed, you can omit this to save
- compaction resources.
+ for downstream streaming consumers. If your CDC source is directly connected to a database (e.g., MySQL CDC, Postgres CDC),
+ you can use `changelog-producer = input` instead, since the database CDC stream already provides a complete changelog.
+ However, if your CDC source comes from Kafka (or other message queues),
`input` may not be reliable — use `lookup` to
+ ensure changelog correctness. If no downstream streaming read is needed, you can omit this to save compaction resources.
- **`sequence.field = update_time`**: Guarantees correct update ordering even when data arrives out of order.
- **Bucketing**: Use the default Dynamic Bucket (`bucket = -1`). The system automatically adjusts bucket count based
on data volume. If you are sensitive to data visibility latency, set a fixed bucket number (e.g. `'bucket' = '5'`)