This is an automated email from the ASF dual-hosted git repository. hope pushed a commit to branch release-1.4 in repository https://gitbox.apache.org/repos/asf/paimon.git
commit fcf405e54d6aef98b52c73b50d0a4c3a327fb7da Author: JingsongLi <[email protected]> AuthorDate: Wed Mar 25 23:10:11 2026 +0800 [doc] Update Documentations for Append table --- docs/content/append-table/bucketed.md | 89 ++++++------ docs/content/append-table/data-evolution.md | 2 +- .../content/append-table/incremental-clustering.md | 2 +- docs/content/append-table/overview.md | 157 ++++++++++++++++++++- docs/content/append-table/query-performance.md | 102 ------------- docs/content/append-table/streaming.md | 104 -------------- docs/content/append-table/update.md | 41 ------ docs/content/learn-paimon/understand-files.md | 2 +- 8 files changed, 198 insertions(+), 301 deletions(-) diff --git a/docs/content/append-table/bucketed.md b/docs/content/append-table/bucketed.md index 5643da0c05..04dc30699b 100644 --- a/docs/content/append-table/bucketed.md +++ b/docs/content/append-table/bucketed.md @@ -1,6 +1,6 @@ --- title: "Bucketed" -weight: 5 +weight: 3 type: docs aliases: - /append-table/bucketed.html @@ -46,7 +46,46 @@ CREATE TABLE my_table ( {{< /tab >}} {{< /tabs >}} -## Streaming +## Data Skipping + +The primary and most significant advantage of a bucketed append table is **data skipping**. When queries contain +equality (`=`) or `IN` filter conditions on the `bucket-key`, Paimon can efficiently push these predicates down to +skip irrelevant bucket files entirely. This means a large number of files that do not match the filter are pruned +before reading, drastically reducing I/O and accelerating queries. + +For example, if `bucket-key` is `product_id` and you query: + +```sql +SELECT * FROM my_table WHERE product_id = 12345; + +SELECT * FROM my_table WHERE product_id IN (1, 2, 3); +``` + +Paimon will only read the bucket that contains the matching `product_id` values, filtering out all other bucket files. +This is extremely effective when the table has many buckets and you are querying a small subset of bucket-key values. 
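Bucket pruning composes with ordinary partition pruning. A minimal sketch of this, assuming a hypothetical `orders` table (table name, column names, and values here are illustrative, not from the commit):

```sql
-- Hypothetical partitioned, bucketed append table.
CREATE TABLE orders (
    product_id BIGINT,
    price DOUBLE,
    dt STRING
) PARTITIONED BY (dt) TBLPROPERTIES (
    'bucket' = '16',
    'bucket-key' = 'product_id'
);

-- Partition pruning (dt) and bucket pruning (product_id) combine:
-- only the matching bucket inside the matching partition is read.
SELECT * FROM orders WHERE dt = '20240101' AND product_id = 12345;
```

Note that non-equality predicates on the bucket key (for example `product_id > 100`) cannot be mapped to specific buckets, so they do not benefit from bucket pruning.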
+
+## Bucketed Join
+
+Bucketed tables can also be used to accelerate join queries by avoiding costly shuffle operations in batch processing.
+For example, you can use the following Spark SQL to read a Paimon table:
+
+```sql
+SET spark.sql.sources.v2.bucketing.enabled = true;
+
+CREATE TABLE FACT_TABLE (order_id INT, f1 STRING) TBLPROPERTIES ('bucket'='10', 'bucket-key' = 'order_id');
+
+CREATE TABLE DIM_TABLE (order_id INT, f2 STRING) TBLPROPERTIES ('bucket'='10', 'primary-key' = 'order_id');
+
+SELECT * FROM FACT_TABLE JOIN DIM_TABLE ON FACT_TABLE.order_id = DIM_TABLE.order_id;
+```
+
+The `spark.sql.sources.v2.bucketing.enabled` config is used to enable bucketing for V2 data sources. When turned on,
+Spark will recognize the specific distribution reported by a V2 data source through `SupportsReportPartitioning`, and
+will try to avoid shuffle if necessary.
+
+The costly join shuffle will be avoided if the two tables have the same bucketing strategy and the same number of buckets.
+
+## Bucketed Streaming
 
 An ordinary Append table has no strict ordering guarantees for its streaming writes and reads, but there are some cases
 where you need to define a key similar to Kafka's.
@@ -57,43 +96,7 @@ bucket as a queue.
 
 {{< img src="/img/for-queue.png">}}
 
-### Compaction in Bucket
-
-By default, the sink node will automatically perform compaction to control the number of files. The following options
-control the strategy of compaction:
-
-<table class="configuration table table-bordered">
-    <thead>
-    <tr>
-        <th class="text-left" style="width: 20%">Key</th>
-        <th class="text-left" style="width: 15%">Default</th>
-        <th class="text-left" style="width: 10%">Type</th>
-        <th class="text-left" style="width: 55%">Description</th>
-    </tr>
-    </thead>
-    <tbody>
-    <tr>
-        <td><h5>write-only</h5></td>
-        <td style="word-wrap: break-word;">false</td>
-        <td>Boolean</td>
-        <td>If set to true, compactions and snapshot expiration will be skipped.
This option is used along with dedicated compact jobs.</td>
-    </tr>
-    <tr>
-        <td><h5>compaction.min.file-num</h5></td>
-        <td style="word-wrap: break-word;">5</td>
-        <td>Integer</td>
-        <td>For file set [f_0,...,f_N], the minimum file number to trigger a compaction for append table.</td>
-    </tr>
-    <tr>
-        <td><h5>full-compaction.delta-commits</h5></td>
-        <td style="word-wrap: break-word;">(none)</td>
-        <td>Integer</td>
-        <td>Full compaction will be constantly triggered after delta commits.</td>
-    </tr>
-    </tbody>
-</table>
-
-### Streaming Read Order
+**Streaming Read Order**
 
 For streaming reads, records are produced in the following order:
 
@@ -103,7 +106,7 @@ For streaming reads, records are produced in the following order:
 * For any two records from the same partition and the same bucket, the first written record will be produced first.
 * For any two records from the same partition but two different buckets, different buckets are processed by different tasks, there is no order guarantee between them.
 
-### Watermark Definition
+**Watermark Definition**
 
 You can define watermark for reading Paimon tables:
 
@@ -148,7 +151,7 @@ which will make sure no sources/splits/shards/partitions increase their watermar
 </tbody>
 </table>
 
-### Bounded Stream
+**Bounded Stream**
 
 Streaming source can also be bounded: you can specify 'scan.bounded.watermark' to define the end condition for bounded
 streaming mode; stream reading will end when a snapshot with a larger watermark is encountered.
@@ -170,7 +173,3 @@ INSERT INTO paimon_table SELECT * FROM kafka_table;
 
 -- launch a bounded streaming job to read paimon_table
 SELECT * FROM paimon_table /*+ OPTIONS('scan.bounded.watermark'='...') */;
 ```
-
-## Bucketed Join
-
-Bucketed table can be used to avoid shuffle if necessary in batch query, see [Bucketed Join]({{< ref "append-table/query-performance#bucketed-join" >}}).
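As a concrete sketch of the bounded-stream pattern above (the watermark value below is an illustrative epoch-milliseconds timestamp, not taken from the commit):

```sql
-- Stop the streaming read once a snapshot whose watermark exceeds
-- this value (epoch millis, illustrative) is encountered.
SELECT * FROM paimon_table
/*+ OPTIONS('scan.bounded.watermark' = '1704067200000') */;
```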
diff --git a/docs/content/append-table/data-evolution.md b/docs/content/append-table/data-evolution.md
index fb3b91da5f..2356ef365d 100644
--- a/docs/content/append-table/data-evolution.md
+++ b/docs/content/append-table/data-evolution.md
@@ -84,7 +84,7 @@ This statement updates only the `b` column in the target table `target_table` ba
 `source_table`. The `id` column and `c` column remain unchanged, and new records are inserted with the specified values.
 The difference between this and tables that do not have data evolution enabled is that only the `b` column data is written to new files.
 Note that:
-* Data Evolution Table does not support 'Delete', 'Update', or 'Compact' statement yet.
+* Data Evolution Table does not support 'Delete' and 'Update' statements yet.
 * Merge Into for Data Evolution Table does not support 'WHEN NOT MATCHED BY SOURCE' clause.
 
 ### Flink
diff --git a/docs/content/append-table/incremental-clustering.md b/docs/content/append-table/incremental-clustering.md
index 5358c040d1..232ee13d7c 100644
--- a/docs/content/append-table/incremental-clustering.md
+++ b/docs/content/append-table/incremental-clustering.md
@@ -1,6 +1,6 @@
 ---
 title: "Incremental Clustering"
-weight: 4
+weight: 2
 type: docs
 aliases:
 - /append-table/incremental-clustering.html
diff --git a/docs/content/append-table/overview.md b/docs/content/append-table/overview.md
index 67d063c584..8644106e83 100644
--- a/docs/content/append-table/overview.md
+++ b/docs/content/append-table/overview.md
@@ -50,9 +50,154 @@ CREATE TABLE my_table (
 Batch write and batch read in typical application scenarios, similar to a regular Hive partition table, but compared
 to the Hive table, it can bring:
 
-1. Object storage (S3, OSS) friendly
-2. Time Travel and Rollback
-3. DELETE / UPDATE with low cost
-4. Automatic small file merging in streaming sink
-5. Streaming read & write like a queue
-6. High performance query with order and index
+1.
Time travel enables reproducible queries that use exactly the same table snapshot, or lets users easily examine
+   changes. Version rollback allows users to quickly correct problems by resetting tables to a good state.
+2. Scan planning is fast — data files are pruned with partition and column-level stats, using table metadata. File
+   Index (BloomFilter, Bitmap, Range Bitmap) and aggregate push-down further accelerate queries.
+3. Schema evolution supports adding, dropping, updating, and renaming columns, with no side-effects.
+4. Rich ecosystem — tables are accessible from compute engines including Flink, Spark, Hive, Trino, Presto, StarRocks,
+   and Doris, working just like a SQL table.
+5. Incremental Clustering with z-order/hilbert/order sorting to optimize data layout at low cost.
+6. Streaming read & write like a queue; DELETE / UPDATE / MERGE INTO provide low-cost row-level operations.
+
+## Append Streaming
+
+You can stream write to the Append table in a very flexible way through Flink, or read the Append table through
+Flink, using it like a queue. The main difference from a message queue is that its latency is in minutes.
Its advantages are very low cost
+and the ability to push down filters and projection.
+
+**Pre small files merging**
+
+"Pre" means that this compaction occurs before committing files to the snapshot.
+
+If Flink's checkpoint interval is short (for example, 30 seconds), each snapshot may produce lots of small changelog
+files. Too many files may put a burden on the distributed storage cluster.
+
+In order to compact small changelog files into large ones, you can set the table option `precommit-compact = true`.
+The default value of this option is false. If true, a compact coordinator and worker operator will be added after the
+writer operator, which compact small changelog files into large ones.
+
+**Post small files merging**
+
+"Post" means that this compaction occurs after committing files to the snapshot.
+
+In a streaming write job without a bucket definition, there is no compaction in the writer; instead, a
+`Compact Coordinator` scans for small files and passes compaction tasks to a `Compact Worker`. In streaming mode, if you
+run an insert SQL in Flink, the topology will be like this:
+
+{{< img src="/img/unaware-bucket-topo.png">}}
+
+Do not worry about backpressure; compaction never causes backpressure.
+
+If you set `write-only` to true, the `Compact Coordinator` and `Compact Worker` will be removed from the topology.
+
+Auto compaction is only supported in Flink engine streaming mode. You can also start a compaction job in Flink by
+Flink action in Paimon and disable all the other compactions by setting `write-only`.
+
+**Streaming Query**
+
+You can stream the Append table and use it like a Message Queue. As with primary key tables, there are two options
+for streaming reads:
+1. By default, streaming read produces the latest snapshot on the table upon first startup, and continues to read the
+   latest incremental records.
+2. You can specify `scan.mode`, `scan.snapshot-id`, `scan.timestamp-millis` and/or `scan.file-creation-time-millis` to
+   stream read increments only.
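The two read modes above can be sketched as follows (the snapshot id and timestamp values are illustrative; only the option names come from the list above):

```sql
-- Default: consume the latest snapshot first, then read new increments.
SELECT * FROM my_table /*+ OPTIONS('scan.mode' = 'latest-full') */;

-- Read increments only, starting from an illustrative snapshot id.
SELECT * FROM my_table /*+ OPTIONS('scan.snapshot-id' = '5') */;

-- Or start from a point in time (epoch millis, illustrative).
SELECT * FROM my_table /*+ OPTIONS('scan.timestamp-millis' = '1704067200000') */;
```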
+
+Similar to Flink's Kafka connector, order is not guaranteed by default. If your data has an ordering requirement, you
+also need to consider defining a `bucket-key`; see [Bucketed Append]({{< ref "append-table/bucketed" >}}).
+
+## Aggregate push down
+
+Append Table supports aggregate push down:
+
+```sql
+SELECT COUNT(*) FROM TABLE WHERE DT = '20230101';
+```
+
+This query can be accelerated during compilation and returns very quickly.
+
+For Spark SQL, a table with the default `metadata.stats-mode` can be accelerated:
+
+```sql
+SELECT MIN(a), MAX(b) FROM TABLE WHERE DT = '20230101';
+
+SELECT * FROM TABLE ORDER BY a LIMIT 1;
+```
+
+Min/max and topN queries can also be accelerated during compilation and return very quickly.
+
+## Data Skipping By Order
+
+Paimon by default records the maximum and minimum values of each field in the manifest file.
+
+At query time, according to the `WHERE` condition of the query, together with the statistics in the manifest, we can
+perform file filtering. If the filtering effect is good, a query that would have cost minutes will be accelerated to
+milliseconds.
+
+The data distribution is not always ideal for filtering, so can we sort the data by the fields in the `WHERE` condition?
+You can take a look at [Flink COMPACT Action]({{< ref "maintenance/dedicated-compaction#sort-compact" >}}),
+[Flink COMPACT Procedure]({{< ref "flink/procedures" >}}) or [Spark COMPACT Procedure]({{< ref "spark/procedures" >}}).
+
+## Data Skipping By File Index
+
+You can also use a file index; it filters files by indexing on the reading side.
+
+Define `file-index.bitmap.columns` to enable it. A data file index is an external index file, and Paimon will create a
+corresponding index file for each data file. If the index file is too small, it will be stored directly in the manifest,
+otherwise in the directory of the data file.
Each data file corresponds to an index file, which has a separate file
+definition and can contain different types of indexes with multiple columns.
+
+Different file indexes may be efficient in different scenarios. For example, a bloom filter may speed up queries in
+point-lookup scenarios. Using a bitmap may consume more space but can result in greater accuracy.
+
+* [BloomFilter]({{< ref "concepts/spec/fileindex#index-bloomfilter" >}}): `file-index.bloom-filter.columns`.
+* [Bitmap]({{< ref "concepts/spec/fileindex#index-bitmap" >}}): `file-index.bitmap.columns`.
+* [Range Bitmap]({{< ref "concepts/spec/fileindex#index-range-bitmap" >}}): `file-index.range-bitmap.columns`.
+
+If you want to add a file index to an existing table without rewriting data, you can use the `rewrite_file_index`
+procedure. Before using the procedure, you should set the appropriate options on the target table: use an ALTER TABLE
+clause to configure `file-index.<filter-type>.columns` for the table.
+
+How to invoke: see [Flink procedures]({{< ref "flink/procedures#procedures" >}}).
+
+## Row Level Operations
+
+Currently, only Spark SQL supports DELETE & UPDATE & MERGE INTO; you can take a look at [Spark Write]({{< ref "spark/sql-write" >}}).
+
+Example:
+```sql
+DELETE FROM my_table WHERE currency = 'UNKNOWN';
+```
+
+Updating an append table has two modes:
+
+1. COW (Copy on Write): search for the hit files and then rewrite each file to remove the data that needs to be deleted
+   from the files. This operation is costly.
+2. MOW (Merge on Write): by specifying `'deletion-vectors.enabled' = 'true'`, the Deletion Vectors mode can be enabled.
+   It only marks certain records of the corresponding file for deletion and writes a deletion file, without rewriting the entire file.
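The MOW mode in item 2 can be sketched as follows. This is a hedged sketch, assuming `my_table` has a `currency` column as in the example above and uses Spark SQL's `SET TBLPROPERTIES` syntax; whether deletion vectors can be enabled on an already-existing table may depend on the engine and table state:

```sql
-- Enable Deletion Vectors (MOW) so row-level deletes write small
-- deletion files instead of rewriting whole data files.
ALTER TABLE my_table SET TBLPROPERTIES ('deletion-vectors.enabled' = 'true');

DELETE FROM my_table WHERE currency = 'UNKNOWN';
```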
diff --git a/docs/content/append-table/query-performance.md b/docs/content/append-table/query-performance.md deleted file mode 100644 index aad62636e5..0000000000 --- a/docs/content/append-table/query-performance.md +++ /dev/null @@ -1,102 +0,0 @@ ---- -title: "Query Performance" -weight: 3 -type: docs -aliases: -- /append-table/query-performance.html ---- -<!-- -Licensed to the Apache Software Foundation (ASF) under one -or more contributor license agreements. See the NOTICE file -distributed with this work for additional information -regarding copyright ownership. The ASF licenses this file -to you under the Apache License, Version 2.0 (the -"License"); you may not use this file except in compliance -with the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, -software distributed under the License is distributed on an -"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -KIND, either express or implied. See the License for the -specific language governing permissions and limitations -under the License. ---> - -# Query Performance - -## Aggregate push down - -Append Table supports aggregate push down: - -```sql -SELECT COUNT(*) FROM TABLE WHERE DT = '20230101'; -``` - -This query can be accelerated during compilation and returns very quickly. - -For Spark SQL, table with default `metadata.stats-mode` can be accelerated: - -```sql -SELECT MIN(a), MAX(b) FROM TABLE WHERE DT = '20230101'; - -SELECT * FROM TABLE ORDER BY a LIMIT 1; -``` - -Min max topN query can be also accelerated during compilation and returns very quickly. - -## Data Skipping By Order - -Paimon by default records the maximum and minimum values of each field in the manifest file. - -In the query, according to the `WHERE` condition of the query, together with the statistics in the manifest we can -perform file filtering. 
If the filtering effect is good, the query that would have cost minutes will be accelerated to -milliseconds to complete the execution. - -Often the data distribution is not always ideal for filtering, so can we sort the data by the field in `WHERE` condition? -You can take a look at [Flink COMPACT Action]({{< ref "maintenance/dedicated-compaction#sort-compact" >}}), -[Flink COMPACT Procedure]({{< ref "flink/procedures" >}}) or [Spark COMPACT Procedure]({{< ref "spark/procedures" >}}). - -## Data Skipping By File Index - -You can use file index too, it filters files by indexing on the reading side. - -Define `file-index.bitmap.columns`, Data file index is an external index file and Paimon will create its -corresponding index file for each file. If the index file is too small, it will be stored directly in the manifest, -otherwise in the directory of the data file. Each data file corresponds to an index file, which has a separate file -definition and can contain different types of indexes with multiple columns. - -Different file indexes may be efficient in different scenarios. For example bloom filter may speed up query in point lookup -scenario. Using a bitmap may consume more space but can result in greater accuracy. - -* [BloomFilter]({{< ref "concepts/spec/fileindex#index-bloomfilter" >}}): `file-index.bloom-filter.columns`. -* [Bitmap]({{< ref "concepts/spec/fileindex#index-bitmap" >}}): `file-index.bitmap.columns`. -* [Range Bitmap]({{< ref "concepts/spec/fileindex#index-range-bitmap" >}}): `file-index.range-bitmap.columns`. - -If you want to add file index to existing table, without any rewrite, you can use `rewrite_file_index` procedure. Before -we use the procedure, you should config appropriate configurations in target table. You can use ALTER clause to config -`file-index.<filter-type>.columns` to the table. 
- -How to invoke: see [flink procedures]({{< ref "flink/procedures#procedures" >}}) - -## Bucketed Join - -Bucketed table can be used to avoid shuffle if necessary in batch query, for example, you can use the following Spark -SQL to read a Paimon table: - -```sql -SET spark.sql.sources.v2.bucketing.enabled = true; - -CREATE TABLE FACT_TABLE (order_id INT, f1 STRING) TBLPROPERTIES ('bucket'='10', 'bucket-key' = 'order_id'); - -CREATE TABLE DIM_TABLE (order_id INT, f2 STRING) TBLPROPERTIES ('bucket'='10', 'primary-key' = 'order_id'); - -SELECT * FROM FACT_TABLE JOIN DIM_TABLE on t1.order_id = t4.order_id; -``` - -The `spark.sql.sources.v2.bucketing.enabled` config is used to enable bucketing for V2 data sources. When turned on, -Spark will recognize the specific distribution reported by a V2 data source through SupportsReportPartitioning, and -will try to avoid shuffle if necessary. - -The costly join shuffle will be avoided if two tables have the same bucketing strategy and same number of buckets. diff --git a/docs/content/append-table/streaming.md b/docs/content/append-table/streaming.md deleted file mode 100644 index 49f44a7f13..0000000000 --- a/docs/content/append-table/streaming.md +++ /dev/null @@ -1,104 +0,0 @@ ---- -title: "Streaming" -weight: 2 -type: docs -aliases: -- /append-table/streaming.html ---- -<!-- -Licensed to the Apache Software Foundation (ASF) under one -or more contributor license agreements. See the NOTICE file -distributed with this work for additional information -regarding copyright ownership. The ASF licenses this file -to you under the Apache License, Version 2.0 (the -"License"); you may not use this file except in compliance -with the License. 
You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, -software distributed under the License is distributed on an -"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -KIND, either express or implied. See the License for the -specific language governing permissions and limitations -under the License. ---> - -# Streaming - -You can stream write to the Append table in a very flexible way through Flink, or read the Append table through -Flink, using it like a queue. The only difference is that its latency is in minutes. Its advantages are very low cost -and the ability to push down filters and projection. - -## Pre small files merging - -"Pre" means that this compact occurs before committing files to the snapshot. - -If Flink's checkpoint interval is short (for example, 30 seconds), each snapshot may produce lots of small changelog -files. Too many files may put a burden on the distributed storage cluster. - -In order to compact small changelog files into large ones, you can set the table option `precommit-compact = true`. -Default value of this option is false, if true, it will add a compact coordinator and worker operator after the writer -operator, which copies changelog files into large ones. - -## Post small files merging - -"Post" means that this compact occurs after committing files to the snapshot. - -In streaming write job, without bucket definition, there is no compaction in writer, instead, will use -`Compact Coordinator` to scan the small files and pass compaction task to `Compact Worker`. In streaming mode, if you -run insert sql in flink, the topology will be like this: - -{{< img src="/img/unaware-bucket-topo.png">}} - -Do not worry about backpressure, compaction never backpressure. - -If you set `write-only` to true, the `Compact Coordinator` and `Compact Worker` will be removed in the topology. 
- -The auto compaction is only supported in Flink engine streaming mode. You can also start a compaction job in Flink by -Flink action in Paimon and disable all the other compactions by setting `write-only`. - -The following options control the strategy of compaction: - -<table class="configuration table table-bordered"> - <thead> - <tr> - <th class="text-left" style="width: 20%">Key</th> - <th class="text-left" style="width: 15%">Default</th> - <th class="text-left" style="width: 10%">Type</th> - <th class="text-left" style="width: 55%">Description</th> - </tr> - </thead> - <tbody> - <tr> - <td><h5>write-only</h5></td> - <td style="word-wrap: break-word;">false</td> - <td>Boolean</td> - <td>If set to true, compactions and snapshot expiration will be skipped. This option is used along with dedicated compact jobs.</td> - </tr> - <tr> - <td><h5>compaction.min.file-num</h5></td> - <td style="word-wrap: break-word;">5</td> - <td>Integer</td> - <td>For file set [f_0,...,f_N], the minimum file number to trigger a compaction for append table.</td> - </tr> - <tr> - <td><h5>compaction.delete-ratio-threshold</h5></td> - <td style="word-wrap: break-word;">(none)</td> - <td>Double</td> - <td>Ratio of the deleted rows in a data file to be forced compacted.</td> - </tr> - </tbody> -</table> - -## Streaming Query - -You can stream the Append table and use it like a Message Queue. As with primary key tables, there are two options -for streaming reads: -1. By default, Streaming read produces the latest snapshot on the table upon first startup, and continue to read the - latest incremental records. -2. You can specify `scan.mode`, `scan.snapshot-id`, `scan.timestamp-millis` and/or `scan.file-creation-time-millis` to - stream read incremental only. 
- -Similar to flink-kafka, order is not guaranteed by default, if your data has some sort of order requirement, you also -need to consider defining a `bucket-key`, see [Bucketed Append]({{< ref "append-table/bucketed" >}}) diff --git a/docs/content/append-table/update.md b/docs/content/append-table/update.md deleted file mode 100644 index 5e373cbd69..0000000000 --- a/docs/content/append-table/update.md +++ /dev/null @@ -1,41 +0,0 @@ ---- -title: "Update" -weight: 4 -type: docs -aliases: -- /append-table/update.html ---- -<!-- -Licensed to the Apache Software Foundation (ASF) under one -or more contributor license agreements. See the NOTICE file -distributed with this work for additional information -regarding copyright ownership. The ASF licenses this file -to you under the Apache License, Version 2.0 (the -"License"); you may not use this file except in compliance -with the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, -software distributed under the License is distributed on an -"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY -KIND, either express or implied. See the License for the -specific language governing permissions and limitations -under the License. ---> - -# Update - -Now, only Spark SQL supports DELETE & UPDATE, you can take a look at [Spark Write]({{< ref "spark/sql-write" >}}). - -Example: -```sql -DELETE FROM my_table WHERE currency = 'UNKNOWN'; -``` - -Update append table has two modes: - -1. COW (Copy on Write): search for the hit files and then rewrite each file to remove the data that needs to be deleted - from the files. This operation is costly. -2. MOW (Merge on Write): By specifying `'deletion-vectors.enabled' = 'true'`, the Deletion Vectors mode can be enabled. - Only marks certain records of the corresponding file for deletion and writes the deletion file, without rewriting the entire file. 
diff --git a/docs/content/learn-paimon/understand-files.md b/docs/content/learn-paimon/understand-files.md index a8a9ac52ce..58fed51b84 100644 --- a/docs/content/learn-paimon/understand-files.md +++ b/docs/content/learn-paimon/understand-files.md @@ -488,7 +488,7 @@ this means that there are at least 5 files in a bucket. If you want to reduce th By default, Append also does automatic compaction to reduce the number of small files. However, for Bucketed Append table, it will only compact the files within the Bucket for sequential -purposes, which may keep more small files. See [Bucketed Append]({{< ref "append-table/streaming#bucketed-append" >}}). +purposes, which may keep more small files. See [Bucketed Append]({{< ref "append-table/bucketed" >}}). ### Understand Full-Compaction
