This is an automated email from the ASF dual-hosted git repository. gavinchou pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push: new 89c06a106df [doc](typo) fix some typo in table-design section (#1240) 89c06a106df is described below commit 89c06a106df7e50e2b26f5f4cbcba8663ab059bf Author: yagagagaga <zhangminkefromflyd...@gmail.com> AuthorDate: Wed Oct 30 19:47:58 2024 +0800 [doc](typo) fix some typo in table-design section (#1240) # Versions - [x] dev - [x] 3.0 - [x] 2.1 - [x] 2.0 # Languages - [x] Chinese - [x] English --- docs/table-design/auto-increment.md | 2 +- docs/table-design/index/inverted-index.md | 2 +- docs/table-design/index/ngram-bloomfilter-index.md | 10 +- docs/table-design/schema-change.md | 144 +++++++++++++++---- .../current/table-design/auto-increment.md | 2 +- .../current/table-design/index/inverted-index.md | 2 +- .../table-design/index/ngram-bloomfilter-index.md | 8 ++ .../current/table-design/index/prefix-index.md | 4 +- .../current/table-design/schema-change.md | 142 +++++++++++++++---- .../version-1.2/data-table/index/prefix-index.md | 6 +- .../table-design/index/ngram-bloomfilter-index.md | 8 ++ .../version-2.0/table-design/index/prefix-index.md | 4 +- .../version-2.0/table-design/schema-change.md | 144 +++++++++++++++---- .../version-2.1/table-design/auto-increment.md | 2 +- .../table-design/index/ngram-bloomfilter-index.md | 8 ++ .../version-2.1/table-design/index/prefix-index.md | 4 +- .../version-2.1/table-design/schema-change.md | 142 +++++++++++++++---- .../version-3.0/table-design/auto-increment.md | 2 +- .../table-design/index/inverted-index.md | 2 +- .../table-design/index/ngram-bloomfilter-index.md | 8 ++ .../version-3.0/table-design/index/prefix-index.md | 4 +- .../version-3.0/table-design/schema-change.md | 142 +++++++++++++++---- .../table-design/index/ngram-bloomfilter-index.md | 10 +- .../version-2.0/table-design/schema-change.md | 146 ++++++++++++++++---- .../version-2.1/table-design/auto-increment.md | 2 +- .../table-design/index/ngram-bloomfilter-index.md | 10 +- .../version-2.1/table-design/schema-change.md | 152 +++++++++++++++++---- .../version-3.0/table-design/auto-increment.md | 2 +- .../table-design/index/inverted-index.md | 2 +- .../table-design/index/ngram-bloomfilter-index.md | 10 +- .../version-3.0/table-design/schema-change.md | 152 +++++++++++++++++---- 31 files changed, 1047 insertions(+), 231 deletions(-) diff --git a/docs/table-design/auto-increment.md b/docs/table-design/auto-increment.md index be2ea78d71c..1e1eae38451 100644 --- a/docs/table-design/auto-increment.md +++ b/docs/table-design/auto-increment.md @@ -456,7 +456,7 @@ select * from records_tbl order by `key`, `name` limit 100; Fetching the data for the second page can be accomplished by: ```sql -select * from records_tbl order by `key`, `name` limit 100, offset 100; +select * from records_tbl order by `key`, `name` limit 100 offset 100; ``` However, when performing deep pagination queries (with large offsets), even if the actual required data rows are few, this method still reads all data into memory for full sorting before subsequent processing, which is quite inefficient. Using an auto-incrementa column assigns a unique value to each row, allowing the use of where `unique_value` > x limit y to filter a significant amount of data beforehand, making pagination more efficient. diff --git a/docs/table-design/index/inverted-index.md b/docs/table-design/index/inverted-index.md index f54078ae846..e7f37511857 100644 --- a/docs/table-design/index/inverted-index.md +++ b/docs/table-design/index/inverted-index.md @@ -234,7 +234,7 @@ CANCEL BUILD INDEX ON table_name (job_id1, job_id2, ...); `BUILD INDEX` creates an asynchronous task executed by multiple threads on each BE. The number of threads can be set using the BE config `alter_index_worker_count`, with a default value of 3. -In versions before 2.0.12 and 2.1.4, `BUILD INDEX` would keep retrying until it succeeded. Starting from these versions, failure and timeout mechanisms prevent endless retries. +In versions before 2.0.12 and 2.1.4, `BUILD INDEX` would keep retrying until it succeeded. Starting from these versions, failure and timeout mechanisms prevent endless retries. 3.0 (Cloud Mode) does not support this command as this moment. 1. If the majority of replicas for a tablet fail to `BUILD INDEX`, the entire `BUILD INDEX` operation fails. 2. If the time exceeds `alter_table_timeout_second`, the `BUILD INDEX` operation times out. diff --git a/docs/table-design/index/ngram-bloomfilter-index.md b/docs/table-design/index/ngram-bloomfilter-index.md index c0a422a891c..71368ead8c0 100644 --- a/docs/table-design/index/ngram-bloomfilter-index.md +++ b/docs/table-design/index/ngram-bloomfilter-index.md @@ -142,6 +142,14 @@ curl --location-trusted -u root: -T amazon_reviews_2014.snappy.parquet -H "forma curl --location-trusted -u root: -T amazon_reviews_2015.snappy.parquet -H "format:parquet" http://127.0.0.1:8030/api/${DB}/amazon_reviews/_stream_load ``` +:::info +The data file may exceed 10 GB, and you may need to adjust the streaming_road_max_mb in be.conf to prevent exceeding the upload size limit of the stream load. You can dynamically adjust it by following the steps below: +```bash +curl -X POST http://{be_ip}:{be_http_port}/api/update_config?streaming_load_max_mb=32768 +``` +Every BE needs to execute the above command. +::: + **Run a count query to confirm successful data import:** ```sql @@ -208,4 +216,4 @@ ALTER TABLE amazon_reviews ADD INDEX review_body_ngram_idx(review_body) USING NG | B00KWVZ750 | Angry Birds Epic RPG | 5 | 6 | +------------+------------------------------------------+--------------------+-------+ 5 rows in set (0.93 sec) -``` \ No newline at end of file +``` diff --git a/docs/table-design/schema-change.md b/docs/table-design/schema-change.md index 608b2a5bbca..5f60d70a445 100644 --- a/docs/table-design/schema-change.md +++ b/docs/table-design/schema-change.md @@ -97,7 +97,27 @@ ALTER TABLE table_name ADD COLUMN column_name column_type [KEY | agg_type] [DEFA ### Examples -**1. Adding a key column `new_col` (non-aggregate model) after col1 to `example_rollup_index`** +#### non-aggregate model + +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int, + col5 int +) DUPLICATE KEY(col1, col2, col3) +DISTRIBUTED BY RANDOM BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**1. Adding a key column `new_col` after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -105,7 +125,7 @@ ADD COLUMN new_col INT KEY DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**2. Adding a value column `new_col` (non-aggregate model) with a default value of 0 after col1 to `example_rollup_index`** +**2. Adding a value column `new_col` with a default value of 0 after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -113,7 +133,27 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**3. Adding a Key column `new_col` (aggregate model) after col1 to `example_rollup_index`** +#### aggregate model + +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col1, col2, col3) +DISTRIBUTED BY HASH(col1) BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**3. Adding a Key column `new_col` after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -121,7 +161,7 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**4. Adding a value column `new_col` with SUM aggregation type (aggregate model) after col1 to `example_rollup_index`** +**4. Adding a value column `new_col` with SUM aggregation type after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -151,7 +191,7 @@ Adding multiple columns (aggregate model) to `example_rollup_index`: ```sql ALTER TABLE example_db.my_table -ADD COLUMN (col1 INT DEFAULT "1", col2 FLOAT SUM DEFAULT "2.3") +ADD COLUMN (c1 INT DEFAULT "1", c2 FLOAT SUM DEFAULT "0") TO example_rollup_index; ``` @@ -174,7 +214,7 @@ Removing column col2 from `example_rollup_index`: ```sql ALTER TABLE example_db.my_table -DROP COLUMN col2 +DROP COLUMN col3 FROM example_rollup_index; ``` @@ -221,6 +261,22 @@ ALTER TABLE table_name MODIFY COLUMN column_name column_type [KEY | agg_type] [N ### Examples +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col0 int, + col1 int DEFAULT "1", + col2 int, + col3 varchar(32), + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col0, col1, col2, col3) +DISTRIBUTED BY HASH(col0) BUCKETS 1 +PROPERTIES ( + "replication_num" = "1" +) +``` + **1. Modifying the column type of Key column col1 to BIGINT in the base index and moving it after column col2** ```sql @@ -233,8 +289,8 @@ Note: whether modifying a key column or a value column, the complete column info **2. Modifying the maximum length of column val1 in the Base Index. The original val1 is (val1 VARCHAR(32) REPLACE DEFAULT "abc")** ```sql -ALTER TABLE example_db.my_table -MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc" +ALTER TABLE example_db.my_table +MODIFY COLUMN col5 VARCHAR(64) REPLACE DEFAULT "abc"; ``` Note: only the column type can be modified while keeping the other properties of the column unchanged. @@ -242,7 +298,8 @@ Note: only the column type can be modified while keeping the other properties of **3. Modifying the length of a field in the key column of a duplicate key table** ```sql -alter table example_tbl modify column k3 varchar(50) key null comment 'to 50' +ALTER TABLE example_db.my_table +MODIFY COLUMN col3 varchar(50) KEY NULL comment 'to 50' ``` ## Reorder columns for a specified index @@ -261,6 +318,24 @@ ALTER TABLE table_name ORDER BY (column_name1, column_name2, ...) ### Example +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + v2 int MAX, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k2, k3, v1, v2) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + Reorder columns in the index `example_rollup_index` (assuming the original column order is: k1, k2, k3, v1, v2). ```sql @@ -277,19 +352,20 @@ Schema change can modify multiple indexes in a single job. Source Schema: -```Plain -+-----------+-------+------+------+------+---------+-------+ -| IndexName | Field | Type | Null | Key | Default | Extra | -+-----------+-------+------+------+------+---------+-------+ -| tbl1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -| | k3 | INT | No | true | N/A | | -| | | | | | | | -| rollup2 | k2 | INT | No | true | N/A | | -| | | | | | | | -| rollup1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -+-----------+-------+------+------+------+---------+-------+ +```sql +CREATE TABLE IF NOT EXISTS example_db.tbl1( + k1 int, + k2 int, + k3 int +) AGGREGATE KEY(k1, k2, k3) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + rollup1 (k1, k2), + rollup2 (k2) +) +PROPERTIES ( + "replication_num" = "1" +) ``` You can use the following command to add a column k4 to rollup1 and rollup2, and add an additional column k5 to rollup2: @@ -329,9 +405,27 @@ Additionally, it is not allowed to add columns to a rollup that already exist in ### Example 2 +table's DDL +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k3, k2, v1) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + ```sql ALTER TABLE example_db.my_table -ADD COLUMN v2 INT MAX DEFAULT "0" AFTER k2 TO example_rollup_index, +ADD COLUMN v2 INT MAX DEFAULT "0" TO example_rollup_index, ORDER BY (k3,k1,k2,v2,v1) FROM example_rollup_index; ``` @@ -350,7 +444,7 @@ The creation of a schema change is an asynchronous process. After a job is succe `SHOW ALTER TABLE COLUMN ` allows you to view the currently executing or completed schema Change jobs. When a schema change job involves multiple indexes, the command will display multiple rows, with each row corresponding to an index. For example: ```sql -mysql SHOW ALTER TABLE COLUMN\G; +mysql > SHOW ALTER TABLE COLUMN\G; *************************** 1. row *************************** JobId: 20021 TableName: tbl1 @@ -484,4 +578,4 @@ ADMIN SET FRONTEND CONFIG ("disable_balance" = "true"); ## More Details -For more detailed syntax and best practices regarding Schema Change, please refer to the [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) command manual. You can also enter `HELP ALTER TABLE COLUMN `in the MySQL client command line for more help information. \ No newline at end of file +For more detailed syntax and best practices regarding Schema Change, please refer to the [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) command manual. diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/auto-increment.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/auto-increment.md index a1477bddeed..fb348308995 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/auto-increment.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/auto-increment.md @@ -452,7 +452,7 @@ select * from records_tbl order by user_id, name limit 100; 获取第 2 页的数据可以使用如下 sql 进行查询: ```sql -select * from records_tbl order by user_id, name limit 100, offset 100; +select * from records_tbl order by user_id, name limit 100 offset 100; ``` 然而,当进行深分页查询时 (offset 很大时),即使实际需要需要的数据行很少,该方法依然会将全部数据读取到内存中进行全量排序后再进行后续处理,这种方法比较低效。可以通过自增列给每行数据一个唯一值,在查询时就可以通过记录之前页面`unique_value`列的最大值`max_value`,然后使用 `where unique_value > max_value limit rows_per_page` 的方式通过提下推谓词提前过滤大量数据,从而更高效地实现分页。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/inverted-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/inverted-index.md index 0fd8c2a7b3c..f18c2f8285c 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/inverted-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/inverted-index.md @@ -237,7 +237,7 @@ CANCEL BUILD INDEX ON table_name (job_id1,jobid_2,...); `BUILD INDEX` 会生成一个异步任务执行,在每个 BE 上有多个线程执行索引构建任务,通过 BE 参数 `alter_index_worker_count` 可以设置,默认值是3。 -2.0.12 和 2.1.4 之前的版本 `BUILD INDEX` 会一直重试直到成功,从这两个版本开始通过失败和超时机制避免一直重试。 +2.0.12 和 2.1.4 之前的版本 `BUILD INDEX` 会一直重试直到成功,从这两个版本开始通过失败和超时机制避免一直重试。3.0 存算分离模式暂不支持此命令。 1. 一个 tablet 的多数副本 `BUILD INDEX` 失败后,整个 `BUILD INDEX` 失败结束 2. 时间超过 `alter_table_timeout_second` (),`BUILD INDEX` 超时结束 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/ngram-bloomfilter-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/ngram-bloomfilter-index.md index be1e09610e9..2aaca37b112 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/ngram-bloomfilter-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/ngram-bloomfilter-index.md @@ -151,6 +151,14 @@ curl --location-trusted -u root: -T amazon_reviews_2014.snappy.parquet -H "forma curl --location-trusted -u root: -T amazon_reviews_2015.snappy.parquet -H "format:parquet" http://127.0.0.1:8030/api/${DB}/amazon_reviews/_stream_load ``` +:::info +上面的文件可能超过 10 GB,您可能需要调整 be.conf 的 streaming_load_max_mb 防止超过 stream load 文件上传大小的限制,可以通过下面方式动态调整 +```bash +curl -X POST http://{be_ip}:{be_http_port}/api/update_config?streaming_load_max_mb=32768 +``` +需要每台 be 都执行上述命令。 +::: + **SQL 运行 count() 确认导入数据成功** ``` mysql> SELECT COUNT() FROM amazon_reviews; diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/prefix-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/prefix-index.md index 63888a6cb6b..f138bcf35ac 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/prefix-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/index/prefix-index.md @@ -79,13 +79,13 @@ Apache Doris 的数据存储在类似 SSTable(Sorted String Table)的数据 - 当我们的查询条件,是前缀索引的前缀时,可以极大地加快查询速度。比如在第一个例子中,执行如下查询: ```sql -SELECT * FROM table WHERE user_id=1829239 and age=20; +SELECT * FROM table WHERE user_id=1829239 and age=20; ``` 该查询的效率会远高于如下查询: ```sql -SELECT * FROM table WHERE age=20; +SELECT * FROM table WHERE age=20; ``` 所以在建表时,正确选择列顺序,能够极大地提高查询效率。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/schema-change.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/schema-change.md index f3fb6fda4ec..ae878902f7f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/schema-change.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/table-design/schema-change.md @@ -97,7 +97,27 @@ ALTER TABLE table_name ADD COLUMN column_name column_type [KEY | agg_type] [DEFA ### 示例 -**1. 向 `example_rollup_index ` 的 col1 后添加一个 Key 列 `new_col` (非聚合模型)** +#### 非聚合模型 + +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int, + col5 int +) DUPLICATE KEY(col1, col2, col3) +DISTRIBUTED BY RANDOM BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**1. 向 `example_rollup_index ` 的 col1 后添加一个 Key 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -105,7 +125,7 @@ ADD COLUMN new_col INT KEY DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**2. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_col` (非聚合模型)** +**2. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -113,7 +133,27 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**3. 向 `example_rollup_index` 的 col1 后添加一个 Key 列 `new_col` (聚合模型)** +#### 聚合模型 + +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col1, col2, col3) +DISTRIBUTED BY HASH(col1) BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**3. 向 `example_rollup_index` 的 col1 后添加一个 Key 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -121,7 +161,7 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**4. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_co``l` SUM 聚合类型 (聚合模型)** +**4. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_co``l` SUM 聚合类型** ```sql ALTER TABLE example_db.my_table @@ -151,7 +191,7 @@ ALTER TABLE table_name ADD COLUMN (column_name1 column_type [KEY | agg_type] DEF ```sql ALTER TABLE example_db.my_table -ADD COLUMN (col1 INT DEFAULT "1", col2 FLOAT SUM DEFAULT "2.3") +ADD COLUMN (c1 INT DEFAULT "1", c2 FLOAT SUM DEFAULT "0") TO example_rollup_index; ``` @@ -174,7 +214,7 @@ ALTER TABLE table_name DROP COLUMN column_name ```sql ALTER TABLE example_db.my_table -DROP COLUMN col2 +DROP COLUMN col3 FROM example_rollup_index; ``` @@ -221,6 +261,22 @@ ALTER TABLE table_name MODIFY COLUMN column_name column_type [KEY | agg_type] [N ### 示例 +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col0 int, + col1 int DEFAULT "1", + col2 int, + col3 varchar(32), + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col0, col1, col2, col3) +DISTRIBUTED BY HASH(col0) BUCKETS 1 +PROPERTIES ( + "replication_num" = "1" +) +``` + **1. 修改 Base Index 的 Key 列 col1 的类型为 BIGINT,并移动到 col2 列后面。** ```sql @@ -234,7 +290,7 @@ MODIFY COLUMN col1 BIGINT KEY DEFAULT "1" AFTER col2; ```sql ALTER TABLE example_db.my_table -MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc"; +MODIFY COLUMN col5 VARCHAR(64) REPLACE DEFAULT "abc"; ``` 注意:只能修改列的类型,列的其他属性维持原样 @@ -242,7 +298,8 @@ MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc"; **3. 修改 Duplicate Key 表 Key 列的某个字段的长度** ```sql -alter table example_tbl modify column k3 varchar(50) key null comment 'to 50' +ALTER TABLE example_db.my_table +MODIFY COLUMN col3 varchar(50) KEY NULL comment 'to 50' ``` ## 对指定 Index 的列进行重新排序 @@ -261,6 +318,24 @@ ALTER TABLE table_name ORDER BY (column_name1, column_name2, ...) ### 示例 +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + v2 int MAX, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k2, k3, v1, v2) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + 重新排序 `example_rollup_index` 中的列(设原列顺序为:k1, k2, k3, v1, v2) ```sql @@ -277,19 +352,20 @@ Schema Change 可以在一个作业中,对多个 Index 进行不同的修改 源 Schema: -```Plain -+-----------+-------+------+------+------+---------+-------+ -| IndexName | Field | Type | Null | Key | Default | Extra | -+-----------+-------+------+------+------+---------+-------+ -| tbl1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -| | k3 | INT | No | true | N/A | | -| | | | | | | | -| rollup2 | k2 | INT | No | true | N/A | | -| | | | | | | | -| rollup1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -+-----------+-------+------+------+------+---------+-------+ +```sql +CREATE TABLE IF NOT EXISTS example_db.tbl1( + k1 int, + k2 int, + k3 int +) AGGREGATE KEY(k1, k2, k3) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + rollup1 (k1, k2), + rollup2 (k2) +) +PROPERTIES ( + "replication_num" = "1" +) ``` 可以通过以下命令给 rollup1 和 rollup2 都加入一列 k4,并且再给 rollup2 加入一列 k5: @@ -329,9 +405,27 @@ ADD COLUMN k5 INT default "1" to rollup2; ### 示例 2 +建表语句 +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k3, k2, v1) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + ```sql ALTER TABLE example_db.my_table -ADD COLUMN v2 INT MAX DEFAULT "0" AFTER k2 TO example_rollup_index, +ADD COLUMN v2 INT MAX DEFAULT "0" TO example_rollup_index, ORDER BY (k3,k1,k2,v2,v1) FROM example_rollup_index; ``` @@ -350,7 +444,7 @@ Schema Change 的创建是一个异步过程,作业提交成功后,用户需 `SHOW ALTER TABLE COLUMN` 可以查看当前正在执行或已经完成的 Schema Change 作业。当一次 Schema Change 作业涉及到多个 Index 时,该命令会显示多行,每行对应一个 Index。举例如下: ```sql -mysql SHOW ALTER TABLE COLUMN\G; +mysql > SHOW ALTER TABLE COLUMN\G; *************************** 1. row *************************** JobId: 20021 TableName: tbl1 @@ -486,4 +580,4 @@ ADMIN SET FRONTEND CONFIG ("disable_balance" = "true"); ## 更多参考 -关于 Schema Change 使用的更多详细语法及最佳实践,请参阅 [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 命令手册,你也可以在 MySQL 客户端命令行下输入 `HELP ALTER TABLE COLUMN` 获取更多帮助信息。 \ No newline at end of file +关于 Schema Change 使用的更多详细语法及最佳实践,请参阅 [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 命令手册。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/data-table/index/prefix-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/data-table/index/prefix-index.md index 3bcc07e1e53..7a08faba48f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/data-table/index/prefix-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-1.2/data-table/index/prefix-index.md @@ -70,17 +70,17 @@ under the License. 当我们的查询条件,是**前缀索引的前缀**时,可以极大的加快查询速度。比如在第一个例子中,我们执行如下查询: ```sql -SELECT * FROM table WHERE user_id=1829239 and age=20; +SELECT * FROM table WHERE user_id=1829239 and age=20; ``` 该查询的效率会**远高于**如下查询: ```sql -SELECT * FROM table WHERE age=20; +SELECT * FROM table WHERE age=20; ``` 所以在建表时,**正确的选择列顺序,能够极大地提高查询效率**。 ## 通过 Rollup 来调整前缀索引 -因为建表时已经指定了列顺序,所以一个表只有一种前缀索引。这对于使用其他不能命中前缀索引的列作为条件进行的查询来说,效率上可能无法满足需求。因此,我们可以通过创建 ROLLUP 来人为的调整列顺序。详情可参考 [ROLLUP](../hit-the-rollup.md)。 \ No newline at end of file +因为建表时已经指定了列顺序,所以一个表只有一种前缀索引。这对于使用其他不能命中前缀索引的列作为条件进行的查询来说,效率上可能无法满足需求。因此,我们可以通过创建 ROLLUP 来人为的调整列顺序。详情可参考 [ROLLUP](../hit-the-rollup.md)。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/index/ngram-bloomfilter-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/index/ngram-bloomfilter-index.md index be1e09610e9..2aaca37b112 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/index/ngram-bloomfilter-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/index/ngram-bloomfilter-index.md @@ -151,6 +151,14 @@ curl --location-trusted -u root: -T amazon_reviews_2014.snappy.parquet -H "forma curl --location-trusted -u root: -T amazon_reviews_2015.snappy.parquet -H "format:parquet" http://127.0.0.1:8030/api/${DB}/amazon_reviews/_stream_load ``` +:::info +上面的文件可能超过 10 GB,您可能需要调整 be.conf 的 streaming_load_max_mb 防止超过 stream load 文件上传大小的限制,可以通过下面方式动态调整 +```bash +curl -X POST http://{be_ip}:{be_http_port}/api/update_config?streaming_load_max_mb=32768 +``` +需要每台 be 都执行上述命令。 +::: + **SQL 运行 count() 确认导入数据成功** ``` mysql> SELECT COUNT() FROM amazon_reviews; diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/index/prefix-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/index/prefix-index.md index ab3d1a25b61..c32e8d845c1 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/index/prefix-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/index/prefix-index.md @@ -79,13 +79,13 @@ Apache Doris 的数据存储在类似 SSTable(Sorted String Table)的数据 - 当我们的查询条件,是前缀索引的前缀时,可以极大地加快查询速度。比如在第一个例子中,执行如下查询: ```sql -SELECT * FROM table WHERE user_id=1829239 and age=20; +SELECT * FROM table WHERE user_id=1829239 and age=20; ``` 该查询的效率会远高于如下查询: ```sql -SELECT * FROM table WHERE age=20; +SELECT * FROM table WHERE age=20; ``` 所以在建表时,正确选择列顺序,能够极大地提高查询效率。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/schema-change.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/schema-change.md index 84c771c646a..42baebb380f 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/schema-change.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.0/table-design/schema-change.md @@ -76,7 +76,7 @@ under the License. 在开始转换历史数据之前,Doris 会获取一个最新的 Transaction ID。并等待这个 Transaction ID 之前的所有导入事务完成。这个 Transaction ID 成为分水岭。意思是,Doris 保证在分水岭之后的所有导入任务,都会同时为原表 /Index 和新表 /Index 生成数据。这样当历史数据转换完成后,可以保证新的表中的数据是完整的。 -创建 Schema Change 的具体语法可以查看帮助 [ALTER TABLE COLUMN](../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 中 Schema Change 部分的说明。 +创建 Schema Change 的具体语法可以查看帮助 [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 中 Schema Change 部分的说明。 ## 向指定 Index 的指定位置添加一列 @@ -97,7 +97,27 @@ ALTER TABLE table_name ADD COLUMN column_name column_type [KEY | agg_type] [DEFA ### 示例 -**1. 向 `example_rollup_index ` 的 col1 后添加一个 Key 列 `new_col` (非聚合模型)** +#### 非聚合模型 + +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int, + col5 int +) DUPLICATE KEY(col1, col2, col3) +DISTRIBUTED BY RANDOM BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**1. 向 `example_rollup_index ` 的 col1 后添加一个 Key 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -105,7 +125,7 @@ ADD COLUMN new_col INT KEY DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**2. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_col` (非聚合模型)** +**2. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -113,7 +133,27 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**3. 向 `example_rollup_index` 的 col1 后添加一个 Key 列 `new_col` (聚合模型)** +#### 聚合模型 + +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col1, col2, col3) +DISTRIBUTED BY HASH(col1) BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**3. 向 `example_rollup_index` 的 col1 后添加一个 Key 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -121,7 +161,7 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**4. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_co``l` SUM 聚合类型 (聚合模型)** +**4. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_co``l` SUM 聚合类型** ```sql ALTER TABLE example_db.my_table @@ -151,7 +191,7 @@ ALTER TABLE table_name ADD COLUMN (column_name1 column_type [KEY | agg_type] DEF ```sql ALTER TABLE example_db.my_table -ADD COLUMN (col1 INT DEFAULT "1", col2 FLOAT SUM DEFAULT "2.3") +ADD COLUMN (c1 INT DEFAULT "1", c2 FLOAT SUM DEFAULT "0") TO example_rollup_index; ``` @@ -174,7 +214,7 @@ ALTER TABLE table_name DROP COLUMN column_name ```sql ALTER TABLE example_db.my_table -DROP COLUMN col2 +DROP COLUMN col3 FROM example_rollup_index; ``` @@ -221,6 +261,22 @@ ALTER TABLE table_name MODIFY COLUMN column_name column_type [KEY | agg_type] [N ### 示例 +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col0 int, + col1 int DEFAULT "1", + col2 int, + col3 varchar(32), + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col0, col1, col2, col3) +DISTRIBUTED BY HASH(col0) BUCKETS 1 +PROPERTIES ( + "replication_num" = "1" +) +``` + **1. 修改 Base Index 的 Key 列 col1 的类型为 BIGINT,并移动到 col2 列后面。** ```sql @@ -234,7 +290,7 @@ MODIFY COLUMN col1 BIGINT KEY DEFAULT "1" AFTER col2; ```sql ALTER TABLE example_db.my_table -MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc"; +MODIFY COLUMN col5 VARCHAR(64) REPLACE DEFAULT "abc"; ``` 注意:只能修改列的类型,列的其他属性维持原样 @@ -242,7 +298,8 @@ MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc"; **3. 修改 Duplicate Key 表 Key 列的某个字段的长度** ```sql -alter table example_tbl modify column k3 varchar(50) key null comment 'to 50' +ALTER TABLE example_db.my_table +MODIFY COLUMN col3 varchar(50) KEY NULL comment 'to 50' ``` ## 对指定 Index 的列进行重新排序 @@ -261,6 +318,24 @@ ALTER TABLE table_name ORDER BY (column_name1, column_name2, ...) ### 示例 +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + v2 int MAX, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k2, k3, v1, v2) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + 重新排序 `example_rollup_index` 中的列(设原列顺序为:k1, k2, k3, v1, v2) ```sql @@ -277,19 +352,20 @@ Schema Change 可以在一个作业中,对多个 Index 进行不同的修改 源 Schema: -```Plain -+-----------+-------+------+------+------+---------+-------+ -| IndexName | Field | Type | Null | Key | Default | Extra | -+-----------+-------+------+------+------+---------+-------+ -| tbl1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -| | k3 | INT | No | true | N/A | | -| | | | | | | | -| rollup2 | k2 | INT | No | true | N/A | | -| | | | | | | | -| rollup1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -+-----------+-------+------+------+------+---------+-------+ +```sql +CREATE TABLE IF NOT EXISTS example_db.tbl1( + k1 int, + k2 int, + k3 int +) AGGREGATE KEY(k1, k2, k3) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + rollup1 (k1, k2), + rollup2 (k2) +) +PROPERTIES ( + "replication_num" = "1" +) ``` 可以通过以下命令给 rollup1 和 rollup2 都加入一列 k4,并且再给 rollup2 加入一列 k5: @@ -329,9 +405,27 @@ ADD COLUMN k5 INT default "1" to rollup2; ### 示例 2 +建表语句 +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k3, k2, v1) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + ```sql ALTER TABLE example_db.my_table -ADD COLUMN v2 INT MAX DEFAULT "0" AFTER k2 TO example_rollup_index, +ADD COLUMN v2 INT MAX DEFAULT "0" TO example_rollup_index, ORDER BY (k3,k1,k2,v2,v1) FROM example_rollup_index; ``` @@ -350,7 +444,7 @@ Schema Change 的创建是一个异步过程,作业提交成功后,用户需 `SHOW ALTER TABLE COLUMN` 可以查看当前正在执行或已经完成的 Schema Change 作业。当一次 Schema Change 作业涉及到多个 Index 时,该命令会显示多行,每行对应一个 Index。举例如下: ```sql -mysql SHOW ALTER TABLE COLUMN\G; +mysql > SHOW ALTER TABLE COLUMN\G; *************************** 1. row *************************** JobId: 20021 TableName: tbl1 @@ -486,4 +580,4 @@ ADMIN SET FRONTEND CONFIG ("disable_balance" = "true"); ## 更多参考 -关于 Schema Change 使用的更多详细语法及最佳实践,请参阅 [ALTER TABLE COLUMN](../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 命令手册,你也可以在 MySQL 客户端命令行下输入 `HELP ALTER TABLE COLUMN` 获取更多帮助信息。 \ No newline at end of file +关于 Schema Change 使用的更多详细语法及最佳实践,请参阅 [ALTER TABLE COLUMN](../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 命令手册,你也可以在 MySQL 客户端命令行下输入 `HELP ALTER TABLE COLUMN` 获取更多帮助信息。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/auto-increment.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/auto-increment.md index a1477bddeed..fb348308995 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/auto-increment.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/auto-increment.md @@ -452,7 +452,7 @@ select * from records_tbl order by user_id, name limit 100; 获取第 2 页的数据可以使用如下 sql 进行查询: ```sql -select * from records_tbl order by user_id, name limit 100, offset 100; +select * from records_tbl order by user_id, name limit 100 offset 100; ``` 然而,当进行深分页查询时 (offset 很大时),即使实际需要需要的数据行很少,该方法依然会将全部数据读取到内存中进行全量排序后再进行后续处理,这种方法比较低效。可以通过自增列给每行数据一个唯一值,在查询时就可以通过记录之前页面`unique_value`列的最大值`max_value`,然后使用 `where unique_value > max_value limit rows_per_page` 的方式通过提下推谓词提前过滤大量数据,从而更高效地实现分页。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/ngram-bloomfilter-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/ngram-bloomfilter-index.md index be1e09610e9..2aaca37b112 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/ngram-bloomfilter-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/ngram-bloomfilter-index.md @@ -151,6 +151,14 @@ curl --location-trusted -u root: -T amazon_reviews_2014.snappy.parquet -H "forma curl --location-trusted -u root: -T amazon_reviews_2015.snappy.parquet -H "format:parquet" http://127.0.0.1:8030/api/${DB}/amazon_reviews/_stream_load ``` +:::info +上面的文件可能超过 10 GB,您可能需要调整 be.conf 的 streaming_load_max_mb 防止超过 stream load 文件上传大小的限制,可以通过下面方式动态调整 +```bash +curl -X POST http://{be_ip}:{be_http_port}/api/update_config?streaming_load_max_mb=32768 +``` +需要每台 be 都执行上述命令。 +::: + **SQL 运行 count() 确认导入数据成功** ``` mysql> SELECT COUNT() FROM amazon_reviews; diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/prefix-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/prefix-index.md index ab3d1a25b61..c32e8d845c1 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/prefix-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/index/prefix-index.md @@ -79,13 +79,13 @@ Apache Doris 的数据存储在类似 SSTable(Sorted String Table)的数据 - 当我们的查询条件,是前缀索引的前缀时,可以极大地加快查询速度。比如在第一个例子中,执行如下查询: ```sql -SELECT * FROM table WHERE user_id=1829239 and age=20; +SELECT * FROM table WHERE user_id=1829239 and age=20; ``` 该查询的效率会远高于如下查询: ```sql -SELECT * FROM table WHERE age=20; +SELECT * FROM table WHERE age=20; ``` 所以在建表时,正确选择列顺序,能够极大地提高查询效率。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/schema-change.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/schema-change.md index f3fb6fda4ec..1723f31f253 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/schema-change.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/table-design/schema-change.md @@ -97,7 +97,27 @@ ALTER TABLE table_name ADD COLUMN column_name column_type [KEY | agg_type] [DEFA ### 示例 -**1. 向 `example_rollup_index ` 的 col1 后添加一个 Key 列 `new_col` (非聚合模型)** +#### 非聚合模型 + +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int, + col5 int +) DUPLICATE KEY(col1, col2, col3) + DISTRIBUTED BY RANDOM BUCKETS 1 + ROLLUP ( + example_rollup_index (col1, col3, col4, col5) + ) + PROPERTIES ( + "replication_num" = "1" + ) +``` + +**1. 向 `example_rollup_index ` 的 col1 后添加一个 Key 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -105,7 +125,7 @@ ADD COLUMN new_col INT KEY DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**2. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_col` (非聚合模型)** +**2. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -113,7 +133,27 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**3. 向 `example_rollup_index` 的 col1 后添加一个 Key 列 `new_col` (聚合模型)** +#### 聚合模型 + +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col1, col2, col3) +DISTRIBUTED BY HASH(col1) BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**3. 向 `example_rollup_index` 的 col1 后添加一个 Key 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -121,7 +161,7 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**4. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_co``l` SUM 聚合类型 (聚合模型)** +**4. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_co``l` SUM 聚合类型** ```sql ALTER TABLE example_db.my_table @@ -151,7 +191,7 @@ ALTER TABLE table_name ADD COLUMN (column_name1 column_type [KEY | agg_type] DEF ```sql ALTER TABLE example_db.my_table -ADD COLUMN (col1 INT DEFAULT "1", col2 FLOAT SUM DEFAULT "2.3") +ADD COLUMN (c1 INT DEFAULT "1", c2 FLOAT SUM DEFAULT "0") TO example_rollup_index; ``` @@ -174,7 +214,7 @@ ALTER TABLE table_name DROP COLUMN column_name ```sql ALTER TABLE example_db.my_table -DROP COLUMN col2 +DROP COLUMN col3 FROM example_rollup_index; ``` @@ -221,6 +261,22 @@ ALTER TABLE table_name MODIFY COLUMN column_name column_type [KEY | agg_type] [N ### 示例 +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col0 int, + col1 int DEFAULT "1", + col2 int, + col3 varchar(32), + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col0, col1, col2, col3) +DISTRIBUTED BY HASH(col0) BUCKETS 1 +PROPERTIES ( + "replication_num" = "1" +) +``` + **1. 修改 Base Index 的 Key 列 col1 的类型为 BIGINT,并移动到 col2 列后面。** ```sql @@ -234,7 +290,7 @@ MODIFY COLUMN col1 BIGINT KEY DEFAULT "1" AFTER col2; ```sql ALTER TABLE example_db.my_table -MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc"; +MODIFY COLUMN col5 VARCHAR(64) REPLACE DEFAULT "abc"; ``` 注意:只能修改列的类型,列的其他属性维持原样 @@ -242,7 +298,8 @@ MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc"; **3. 修改 Duplicate Key 表 Key 列的某个字段的长度** ```sql -alter table example_tbl modify column k3 varchar(50) key null comment 'to 50' +ALTER TABLE example_db.my_table +MODIFY COLUMN col3 varchar(50) KEY NULL comment 'to 50' ``` ## 对指定 Index 的列进行重新排序 @@ -261,6 +318,24 @@ ALTER TABLE table_name ORDER BY (column_name1, column_name2, ...) ### 示例 +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + v2 int MAX, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k2, k3, v1, v2) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + 重新排序 `example_rollup_index` 中的列(设原列顺序为:k1, k2, k3, v1, v2) ```sql @@ -277,19 +352,20 @@ Schema Change 可以在一个作业中,对多个 Index 进行不同的修改 源 Schema: -```Plain -+-----------+-------+------+------+------+---------+-------+ -| IndexName | Field | Type | Null | Key | Default | Extra | -+-----------+-------+------+------+------+---------+-------+ -| tbl1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -| | k3 | INT | No | true | N/A | | -| | | | | | | | -| rollup2 | k2 | INT | No | true | N/A | | -| | | | | | | | -| rollup1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -+-----------+-------+------+------+------+---------+-------+ +```sql +CREATE TABLE IF NOT EXISTS example_db.tbl1( + k1 int, + k2 int, + k3 int +) AGGREGATE KEY(k1, k2, k3) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + rollup1 (k1, k2), + rollup2 (k2) +) +PROPERTIES ( + "replication_num" = "1" +) ``` 可以通过以下命令给 rollup1 和 rollup2 都加入一列 k4,并且再给 rollup2 加入一列 k5: @@ -329,9 +405,27 @@ ADD COLUMN k5 INT default "1" to rollup2; ### 示例 2 +建表语句 +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + ) AGGREGATE KEY(k1, k2, k3, k4) + DISTRIBUTED BY HASH(k1) BUCKETS 1 + ROLLUP ( + example_rollup_index(k1, k3, k2, v1) + ) + PROPERTIES ( + "replication_num" = "1" + ) +``` + ```sql ALTER TABLE example_db.my_table -ADD COLUMN v2 INT MAX DEFAULT "0" AFTER k2 TO example_rollup_index, + ADD COLUMN v2 INT MAX DEFAULT "0" TO example_rollup_index, ORDER BY (k3,k1,k2,v2,v1) FROM example_rollup_index; ``` @@ -350,7 +444,7 @@ Schema Change 的创建是一个异步过程,作业提交成功后,用户需 `SHOW ALTER TABLE COLUMN` 可以查看当前正在执行或已经完成的 Schema Change 作业。当一次 Schema Change 作业涉及到多个 Index 时,该命令会显示多行,每行对应一个 Index。举例如下: ```sql -mysql SHOW ALTER TABLE COLUMN\G; +mysql > SHOW ALTER TABLE COLUMN\G; *************************** 1. row *************************** JobId: 20021 TableName: tbl1 @@ -486,4 +580,4 @@ ADMIN SET FRONTEND CONFIG ("disable_balance" = "true"); ## 更多参考 -关于 Schema Change 使用的更多详细语法及最佳实践,请参阅 [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 命令手册,你也可以在 MySQL 客户端命令行下输入 `HELP ALTER TABLE COLUMN` 获取更多帮助信息。 \ No newline at end of file +关于 Schema Change 使用的更多详细语法及最佳实践,请参阅 [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 命令手册,你也可以在 MySQL 客户端命令行下输入 `HELP ALTER TABLE COLUMN` 获取更多帮助信息。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/auto-increment.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/auto-increment.md index a1477bddeed..fb348308995 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/auto-increment.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/auto-increment.md @@ -452,7 +452,7 @@ select * from records_tbl order by user_id, name limit 100; 获取第 2 页的数据可以使用如下 sql 进行查询: ```sql -select * from records_tbl order by user_id, name limit 100, offset 100; +select * from records_tbl order by user_id, name limit 100 offset 100; ``` 然而,当进行深分页查询时 (offset 很大时),即使实际需要需要的数据行很少,该方法依然会将全部数据读取到内存中进行全量排序后再进行后续处理,这种方法比较低效。可以通过自增列给每行数据一个唯一值,在查询时就可以通过记录之前页面`unique_value`列的最大值`max_value`,然后使用 `where unique_value > max_value limit rows_per_page` 的方式通过提下推谓词提前过滤大量数据,从而更高效地实现分页。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/inverted-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/inverted-index.md index 0fd8c2a7b3c..f18c2f8285c 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/inverted-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/inverted-index.md @@ -237,7 +237,7 @@ CANCEL BUILD INDEX ON table_name (job_id1,jobid_2,...); `BUILD INDEX` 会生成一个异步任务执行,在每个 BE 上有多个线程执行索引构建任务,通过 BE 参数 `alter_index_worker_count` 可以设置,默认值是3。 -2.0.12 和 2.1.4 之前的版本 `BUILD INDEX` 会一直重试直到成功,从这两个版本开始通过失败和超时机制避免一直重试。 +2.0.12 和 2.1.4 之前的版本 `BUILD INDEX` 会一直重试直到成功,从这两个版本开始通过失败和超时机制避免一直重试。3.0 存算分离模式暂不支持此命令。 1. 一个 tablet 的多数副本 `BUILD INDEX` 失败后,整个 `BUILD INDEX` 失败结束 2. 时间超过 `alter_table_timeout_second` (),`BUILD INDEX` 超时结束 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/ngram-bloomfilter-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/ngram-bloomfilter-index.md index be1e09610e9..2aaca37b112 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/ngram-bloomfilter-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/ngram-bloomfilter-index.md @@ -151,6 +151,14 @@ curl --location-trusted -u root: -T amazon_reviews_2014.snappy.parquet -H "forma curl --location-trusted -u root: -T amazon_reviews_2015.snappy.parquet -H "format:parquet" http://127.0.0.1:8030/api/${DB}/amazon_reviews/_stream_load ``` +:::info +上面的文件可能超过 10 GB,您可能需要调整 be.conf 的 streaming_load_max_mb 防止超过 stream load 文件上传大小的限制,可以通过下面方式动态调整 +```bash +curl -X POST http://{be_ip}:{be_http_port}/api/update_config?streaming_load_max_mb=32768 +``` +需要每台 be 都执行上述命令。 +::: + **SQL 运行 count() 确认导入数据成功** ``` mysql> SELECT COUNT() FROM amazon_reviews; diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/prefix-index.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/prefix-index.md index 63888a6cb6b..f138bcf35ac 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/prefix-index.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/index/prefix-index.md @@ -79,13 +79,13 @@ Apache Doris 的数据存储在类似 SSTable(Sorted String Table)的数据 - 当我们的查询条件,是前缀索引的前缀时,可以极大地加快查询速度。比如在第一个例子中,执行如下查询: ```sql -SELECT * FROM table WHERE user_id=1829239 and age=20; +SELECT * FROM table WHERE user_id=1829239 and age=20; ``` 该查询的效率会远高于如下查询: ```sql -SELECT * FROM table WHERE age=20; +SELECT * FROM table WHERE age=20; ``` 所以在建表时,正确选择列顺序,能够极大地提高查询效率。 diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/schema-change.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/schema-change.md index f3fb6fda4ec..fef650680c0 100644 --- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/schema-change.md +++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/table-design/schema-change.md @@ -97,7 +97,27 @@ ALTER TABLE table_name ADD COLUMN column_name column_type [KEY | agg_type] [DEFA ### 示例 -**1. 向 `example_rollup_index ` 的 col1 后添加一个 Key 列 `new_col` (非聚合模型)** +#### 非聚合模型 + +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int, + col5 int +) DUPLICATE KEY(col1, col2, col3) + DISTRIBUTED BY RANDOM BUCKETS 1 + ROLLUP ( + example_rollup_index (col1, col3, col4, col5) + ) + PROPERTIES ( + "replication_num" = "1" + ) +``` + +**1. 向 `example_rollup_index ` 的 col1 后添加一个 Key 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -105,7 +125,7 @@ ADD COLUMN new_col INT KEY DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**2. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_col` (非聚合模型)** +**2. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -113,7 +133,27 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**3. 向 `example_rollup_index` 的 col1 后添加一个 Key 列 `new_col` (聚合模型)** +#### 聚合模型 + +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col1, col2, col3) +DISTRIBUTED BY HASH(col1) BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**3. 向 `example_rollup_index` 的 col1 后添加一个 Key 列 `new_col`** ```sql ALTER TABLE example_db.my_table @@ -121,7 +161,7 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**4. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_co``l` SUM 聚合类型 (聚合模型)** +**4. 向 `example_rollup_index` 的 col1 后添加一个 Value 列 `new_co``l` SUM 聚合类型** ```sql ALTER TABLE example_db.my_table @@ -151,7 +191,7 @@ ALTER TABLE table_name ADD COLUMN (column_name1 column_type [KEY | agg_type] DEF ```sql ALTER TABLE example_db.my_table -ADD COLUMN (col1 INT DEFAULT "1", col2 FLOAT SUM DEFAULT "2.3") +ADD COLUMN (c1 INT DEFAULT "1", c2 FLOAT SUM DEFAULT "0") TO example_rollup_index; ``` @@ -174,7 +214,7 @@ ALTER TABLE table_name DROP COLUMN column_name ```sql ALTER TABLE example_db.my_table -DROP COLUMN col2 +DROP COLUMN col3 FROM example_rollup_index; ``` @@ -221,6 +261,22 @@ ALTER TABLE table_name MODIFY COLUMN column_name column_type [KEY | agg_type] [N ### 示例 +建表语句: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col0 int, + col1 int DEFAULT "1", + col2 int, + col3 varchar(32), + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col0, col1, col2, col3) +DISTRIBUTED BY HASH(col0) BUCKETS 1 +PROPERTIES ( + "replication_num" = "1" +) +``` + **1. 修改 Base Index 的 Key 列 col1 的类型为 BIGINT,并移动到 col2 列后面。** ```sql @@ -234,7 +290,7 @@ MODIFY COLUMN col1 BIGINT KEY DEFAULT "1" AFTER col2; ```sql ALTER TABLE example_db.my_table -MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc"; +MODIFY COLUMN col5 VARCHAR(64) REPLACE DEFAULT "abc"; ``` 注意:只能修改列的类型,列的其他属性维持原样 @@ -242,7 +298,8 @@ MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc"; **3. 修改 Duplicate Key 表 Key 列的某个字段的长度** ```sql -alter table example_tbl modify column k3 varchar(50) key null comment 'to 50' +ALTER TABLE example_db.my_table +MODIFY COLUMN col3 varchar(50) KEY NULL comment 'to 50' ``` ## 对指定 Index 的列进行重新排序 @@ -261,6 +318,24 @@ ALTER TABLE table_name ORDER BY (column_name1, column_name2, ...) ### 示例 +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + v2 int MAX, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k2, k3, v1, v2) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + 重新排序 `example_rollup_index` 中的列(设原列顺序为:k1, k2, k3, v1, v2) ```sql @@ -277,19 +352,20 @@ Schema Change 可以在一个作业中,对多个 Index 进行不同的修改 源 Schema: -```Plain -+-----------+-------+------+------+------+---------+-------+ -| IndexName | Field | Type | Null | Key | Default | Extra | -+-----------+-------+------+------+------+---------+-------+ -| tbl1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -| | k3 | INT | No | true | N/A | | -| | | | | | | | -| rollup2 | k2 | INT | No | true | N/A | | -| | | | | | | | -| rollup1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -+-----------+-------+------+------+------+---------+-------+ +```sql +CREATE TABLE IF NOT EXISTS example_db.tbl1( + k1 int, + k2 int, + k3 int +) AGGREGATE KEY(k1, k2, k3) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + rollup1 (k1, k2), + rollup2 (k2) +) +PROPERTIES ( + "replication_num" = "1" +) ``` 可以通过以下命令给 rollup1 和 rollup2 都加入一列 k4,并且再给 rollup2 加入一列 k5: @@ -329,9 +405,27 @@ ADD COLUMN k5 INT default "1" to rollup2; ### 示例 2 +建表语句 +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + ) AGGREGATE KEY(k1, k2, k3, k4) + DISTRIBUTED BY HASH(k1) BUCKETS 1 + ROLLUP ( + example_rollup_index(k1, k3, k2, v1) + ) + PROPERTIES ( + "replication_num" = "1" + ) +``` + ```sql ALTER TABLE example_db.my_table -ADD COLUMN v2 INT MAX DEFAULT "0" AFTER k2 TO example_rollup_index, + ADD COLUMN v2 INT MAX DEFAULT "0" TO example_rollup_index, ORDER BY (k3,k1,k2,v2,v1) FROM example_rollup_index; ``` @@ -350,7 +444,7 @@ Schema Change 的创建是一个异步过程,作业提交成功后,用户需 `SHOW ALTER TABLE COLUMN` 可以查看当前正在执行或已经完成的 Schema Change 作业。当一次 Schema Change 作业涉及到多个 Index 时,该命令会显示多行,每行对应一个 Index。举例如下: ```sql -mysql SHOW ALTER TABLE COLUMN\G; +mysql > SHOW ALTER TABLE COLUMN\G; *************************** 1. row *************************** JobId: 20021 TableName: tbl1 @@ -486,4 +580,4 @@ ADMIN SET FRONTEND CONFIG ("disable_balance" = "true"); ## 更多参考 -关于 Schema Change 使用的更多详细语法及最佳实践,请参阅 [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 命令手册,你也可以在 MySQL 客户端命令行下输入 `HELP ALTER TABLE COLUMN` 获取更多帮助信息。 \ No newline at end of file +关于 Schema Change 使用的更多详细语法及最佳实践,请参阅 [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) 命令手册。 diff --git a/versioned_docs/version-2.0/table-design/index/ngram-bloomfilter-index.md b/versioned_docs/version-2.0/table-design/index/ngram-bloomfilter-index.md index c0a422a891c..71368ead8c0 100644 --- a/versioned_docs/version-2.0/table-design/index/ngram-bloomfilter-index.md +++ b/versioned_docs/version-2.0/table-design/index/ngram-bloomfilter-index.md @@ -142,6 +142,14 @@ curl --location-trusted -u root: -T amazon_reviews_2014.snappy.parquet -H "forma curl --location-trusted -u root: -T amazon_reviews_2015.snappy.parquet -H "format:parquet" http://127.0.0.1:8030/api/${DB}/amazon_reviews/_stream_load ``` +:::info +The data file may exceed 10 GB, and you may need to adjust the streaming_road_max_mb in be.conf to prevent exceeding the upload size limit of the stream load. You can dynamically adjust it by following the steps below: +```bash +curl -X POST http://{be_ip}:{be_http_port}/api/update_config?streaming_load_max_mb=32768 +``` +Every BE needs to execute the above command. +::: + **Run a count query to confirm successful data import:** ```sql @@ -208,4 +216,4 @@ ALTER TABLE amazon_reviews ADD INDEX review_body_ngram_idx(review_body) USING NG | B00KWVZ750 | Angry Birds Epic RPG | 5 | 6 | +------------+------------------------------------------+--------------------+-------+ 5 rows in set (0.93 sec) -``` \ No newline at end of file +``` diff --git a/versioned_docs/version-2.0/table-design/schema-change.md b/versioned_docs/version-2.0/table-design/schema-change.md index 07dfec60793..fd087c73b74 100644 --- a/versioned_docs/version-2.0/table-design/schema-change.md +++ b/versioned_docs/version-2.0/table-design/schema-change.md @@ -76,7 +76,7 @@ The basic process of executing schema change is to generate a new schema table f Before starting to convert historical data, Doris will obtain a latest transaction ID and wait for all import transactions before this transaction ID to complete. This transaction ID becomes a watershed. This means that Doris ensures that all import tasks after the watershed will generate data for the original table /Index and the new table /Index at the same time. This way, when the historical data conversion is completed, the data in the new table can be guaranteed to be complete. -The specific syntax for creating schema changes can be found in the schema change section of the help [ALTER TABLE COLUMN](../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) +The specific syntax for creating schema changes can be found in the schema change section of the help [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) ## Adding a column at a specified position to a specified index @@ -97,7 +97,27 @@ ALTER TABLE table_name ADD COLUMN column_name column_type [KEY | agg_type] [DEFA ### Examples -**1. Adding a key column `new_col` (non-aggregate model) after col1 to `example_rollup_index`** +#### non-aggregate model + +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int, + col5 int +) DUPLICATE KEY(col1, col2, col3) +DISTRIBUTED BY RANDOM BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**1. Adding a key column `new_col` after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -105,7 +125,7 @@ ADD COLUMN new_col INT KEY DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**2. Adding a value column `new_col` (non-aggregate model) with a default value of 0 after col1 to `example_rollup_index`** +**2. Adding a value column `new_col` with a default value of 0 after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -113,7 +133,27 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**3. Adding a Key column `new_col` (aggregate model) after col1 to `example_rollup_index`** +#### aggregate model + +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col1, col2, col3) +DISTRIBUTED BY HASH(col1) BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**3. Adding a Key column `new_col` after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -121,7 +161,7 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**4. Adding a value column `new_col` with SUM aggregation type (aggregate model) after col1 to `example_rollup_index`** +**4. Adding a value column `new_col` with SUM aggregation type after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -151,7 +191,7 @@ Adding multiple columns (aggregate model) to `example_rollup_index`: ```sql ALTER TABLE example_db.my_table -ADD COLUMN (col1 INT DEFAULT "1", col2 FLOAT SUM DEFAULT "2.3") +ADD COLUMN (c1 INT DEFAULT "1", c2 FLOAT SUM DEFAULT "0") TO example_rollup_index; ``` @@ -174,7 +214,7 @@ Removing column col2 from `example_rollup_index`: ```sql ALTER TABLE example_db.my_table -DROP COLUMN col2 +DROP COLUMN col3 FROM example_rollup_index; ``` @@ -221,6 +261,22 @@ ALTER TABLE table_name MODIFY COLUMN column_name column_type [KEY | agg_type] [N ### Examples +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col0 int, + col1 int DEFAULT "1", + col2 int, + col3 varchar(32), + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col0, col1, col2, col3) +DISTRIBUTED BY HASH(col0) BUCKETS 1 +PROPERTIES ( + "replication_num" = "1" +) +``` + **1. Modifying the column type of Key column col1 to BIGINT in the base index and moving it after column col2** ```sql @@ -233,8 +289,8 @@ Note: whether modifying a key column or a value column, the complete column info **2. Modifying the maximum length of column val1 in the Base Index. The original val1 is (val1 VARCHAR(32) REPLACE DEFAULT "abc")** ```sql -ALTER TABLE example_db.my_table -MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc" +ALTER TABLE example_db.my_table +MODIFY COLUMN col5 VARCHAR(64) REPLACE DEFAULT "abc"; ``` Note: only the column type can be modified while keeping the other properties of the column unchanged. @@ -242,7 +298,8 @@ Note: only the column type can be modified while keeping the other properties of **3. Modifying the length of a field in the key column of a duplicate key table** ```sql -alter table example_tbl modify column k3 varchar(50) key null comment 'to 50' +ALTER TABLE example_db.my_table +MODIFY COLUMN col3 varchar(50) KEY NULL comment 'to 50' ``` ## Reorder columns for a specified index @@ -261,6 +318,24 @@ ALTER TABLE table_name ORDER BY (column_name1, column_name2, ...) ### Example +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + v2 int MAX, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k2, k3, v1, v2) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + Reorder columns in the index `example_rollup_index` (assuming the original column order is: k1, k2, k3, v1, v2). ```sql @@ -277,19 +352,20 @@ Schema change can modify multiple indexes in a single job. Source Schema: -```Plain -+-----------+-------+------+------+------+---------+-------+ -| IndexName | Field | Type | Null | Key | Default | Extra | -+-----------+-------+------+------+------+---------+-------+ -| tbl1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -| | k3 | INT | No | true | N/A | | -| | | | | | | | -| rollup2 | k2 | INT | No | true | N/A | | -| | | | | | | | -| rollup1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -+-----------+-------+------+------+------+---------+-------+ +```sql +CREATE TABLE IF NOT EXISTS example_db.tbl1( + k1 int, + k2 int, + k3 int +) AGGREGATE KEY(k1, k2, k3) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + rollup1 (k1, k2), + rollup2 (k2) +) +PROPERTIES ( + "replication_num" = "1" +) ``` You can use the following command to add a column k4 to rollup1 and rollup2, and add an additional column k5 to rollup2: @@ -329,9 +405,27 @@ Additionally, it is not allowed to add columns to a rollup that already exist in ### Example 2 +table's DDL +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k3, k2, v1) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + ```sql ALTER TABLE example_db.my_table -ADD COLUMN v2 INT MAX DEFAULT "0" AFTER k2 TO example_rollup_index, +ADD COLUMN v2 INT MAX DEFAULT "0" TO example_rollup_index, ORDER BY (k3,k1,k2,v2,v1) FROM example_rollup_index; ``` @@ -350,7 +444,7 @@ The creation of a schema change is an asynchronous process. After a job is succe `SHOW ALTER TABLE COLUMN ` allows you to view the currently executing or completed schema Change jobs. When a schema change job involves multiple indexes, the command will display multiple rows, with each row corresponding to an index. For example: ```sql -mysql SHOW ALTER TABLE COLUMN\G; +mysql > SHOW ALTER TABLE COLUMN\G; *************************** 1. row *************************** JobId: 20021 TableName: tbl1 @@ -484,4 +578,4 @@ ADMIN SET FRONTEND CONFIG ("disable_balance" = "true"); ## More Details -For more detailed syntax and best practices regarding Schema Change, please refer to the [ALTER TABLE COLUMN](../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) command manual. You can also enter `HELP ALTER TABLE COLUMN `in the MySQL client command line for more help information. \ No newline at end of file +For more detailed syntax and best practices regarding Schema Change, please refer to the [ALTER TABLE COLUMN](../sql-manual/sql-reference/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) command manual. You can also enter `HELP ALTER TABLE COLUMN `in the MySQL client command line for more help information. diff --git a/versioned_docs/version-2.1/table-design/auto-increment.md b/versioned_docs/version-2.1/table-design/auto-increment.md index c10f0c29d6c..5c115ee031e 100644 --- a/versioned_docs/version-2.1/table-design/auto-increment.md +++ b/versioned_docs/version-2.1/table-design/auto-increment.md @@ -457,7 +457,7 @@ select * from records_tbl order by `key`, `name` limit 100; Fetching the data for the second page can be accomplished by: ```sql -select * from records_tbl order by `key`, `name` limit 100, offset 100; +select * from records_tbl order by `key`, `name` limit 100 offset 100; ``` However, when performing deep pagination queries (with large offsets), even if the actual required data rows are few, this method still reads all data into memory for full sorting before subsequent processing, which is quite inefficient. Using an auto-incrementa column assigns a unique value to each row, allowing the use of where `unique_value` > x limit y to filter a significant amount of data beforehand, making pagination more efficient. diff --git a/versioned_docs/version-2.1/table-design/index/ngram-bloomfilter-index.md b/versioned_docs/version-2.1/table-design/index/ngram-bloomfilter-index.md index c0a422a891c..71368ead8c0 100644 --- a/versioned_docs/version-2.1/table-design/index/ngram-bloomfilter-index.md +++ b/versioned_docs/version-2.1/table-design/index/ngram-bloomfilter-index.md @@ -142,6 +142,14 @@ curl --location-trusted -u root: -T amazon_reviews_2014.snappy.parquet -H "forma curl --location-trusted -u root: -T amazon_reviews_2015.snappy.parquet -H "format:parquet" http://127.0.0.1:8030/api/${DB}/amazon_reviews/_stream_load ``` +:::info +The data file may exceed 10 GB, and you may need to adjust the streaming_road_max_mb in be.conf to prevent exceeding the upload size limit of the stream load. You can dynamically adjust it by following the steps below: +```bash +curl -X POST http://{be_ip}:{be_http_port}/api/update_config?streaming_load_max_mb=32768 +``` +Every BE needs to execute the above command. +::: + **Run a count query to confirm successful data import:** ```sql @@ -208,4 +216,4 @@ ALTER TABLE amazon_reviews ADD INDEX review_body_ngram_idx(review_body) USING NG | B00KWVZ750 | Angry Birds Epic RPG | 5 | 6 | +------------+------------------------------------------+--------------------+-------+ 5 rows in set (0.93 sec) -``` \ No newline at end of file +``` diff --git a/versioned_docs/version-2.1/table-design/schema-change.md b/versioned_docs/version-2.1/table-design/schema-change.md index 608b2a5bbca..c4510e18929 100644 --- a/versioned_docs/version-2.1/table-design/schema-change.md +++ b/versioned_docs/version-2.1/table-design/schema-change.md @@ -36,7 +36,7 @@ Users can modify the schema of an existing table through the Schema Change opera - Transaction: Each import task is a transaction, and each transaction has a unique increasing transaction ID. -## Introduction +## Introduction **Light Schema Change** @@ -97,7 +97,27 @@ ALTER TABLE table_name ADD COLUMN column_name column_type [KEY | agg_type] [DEFA ### Examples -**1. Adding a key column `new_col` (non-aggregate model) after col1 to `example_rollup_index`** +#### non-aggregate model + +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int, + col5 int +) DUPLICATE KEY(col1, col2, col3) +DISTRIBUTED BY RANDOM BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**1. Adding a key column `new_col` after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -105,7 +125,7 @@ ADD COLUMN new_col INT KEY DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**2. Adding a value column `new_col` (non-aggregate model) with a default value of 0 after col1 to `example_rollup_index`** +**2. Adding a value column `new_col` with a default value of 0 after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -113,7 +133,27 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**3. Adding a Key column `new_col` (aggregate model) after col1 to `example_rollup_index`** +#### aggregate model + +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col1, col2, col3) +DISTRIBUTED BY HASH(col1) BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**3. Adding a Key column `new_col` after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -121,7 +161,7 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**4. Adding a value column `new_col` with SUM aggregation type (aggregate model) after col1 to `example_rollup_index`** +**4. Adding a value column `new_col` with SUM aggregation type after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -151,7 +191,7 @@ Adding multiple columns (aggregate model) to `example_rollup_index`: ```sql ALTER TABLE example_db.my_table -ADD COLUMN (col1 INT DEFAULT "1", col2 FLOAT SUM DEFAULT "2.3") +ADD COLUMN (c1 INT DEFAULT "1", c2 FLOAT SUM DEFAULT "0") TO example_rollup_index; ``` @@ -174,7 +214,7 @@ Removing column col2 from `example_rollup_index`: ```sql ALTER TABLE example_db.my_table -DROP COLUMN col2 +DROP COLUMN col3 FROM example_rollup_index; ``` @@ -221,6 +261,22 @@ ALTER TABLE table_name MODIFY COLUMN column_name column_type [KEY | agg_type] [N ### Examples +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col0 int, + col1 int DEFAULT "1", + col2 int, + col3 varchar(32), + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col0, col1, col2, col3) +DISTRIBUTED BY HASH(col0) BUCKETS 1 +PROPERTIES ( + "replication_num" = "1" +) +``` + **1. Modifying the column type of Key column col1 to BIGINT in the base index and moving it after column col2** ```sql @@ -233,8 +289,8 @@ Note: whether modifying a key column or a value column, the complete column info **2. Modifying the maximum length of column val1 in the Base Index. The original val1 is (val1 VARCHAR(32) REPLACE DEFAULT "abc")** ```sql -ALTER TABLE example_db.my_table -MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc" +ALTER TABLE example_db.my_table +MODIFY COLUMN col5 VARCHAR(64) REPLACE DEFAULT "abc"; ``` Note: only the column type can be modified while keeping the other properties of the column unchanged. @@ -242,7 +298,8 @@ Note: only the column type can be modified while keeping the other properties of **3. Modifying the length of a field in the key column of a duplicate key table** ```sql -alter table example_tbl modify column k3 varchar(50) key null comment 'to 50' +ALTER TABLE example_db.my_table +MODIFY COLUMN col3 varchar(50) KEY NULL comment 'to 50' ``` ## Reorder columns for a specified index @@ -261,6 +318,24 @@ ALTER TABLE table_name ORDER BY (column_name1, column_name2, ...) ### Example +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + v2 int MAX, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k2, k3, v1, v2) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + Reorder columns in the index `example_rollup_index` (assuming the original column order is: k1, k2, k3, v1, v2). ```sql @@ -277,19 +352,20 @@ Schema change can modify multiple indexes in a single job. Source Schema: -```Plain -+-----------+-------+------+------+------+---------+-------+ -| IndexName | Field | Type | Null | Key | Default | Extra | -+-----------+-------+------+------+------+---------+-------+ -| tbl1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -| | k3 | INT | No | true | N/A | | -| | | | | | | | -| rollup2 | k2 | INT | No | true | N/A | | -| | | | | | | | -| rollup1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -+-----------+-------+------+------+------+---------+-------+ +```sql +CREATE TABLE IF NOT EXISTS example_db.tbl1( + k1 int, + k2 int, + k3 int +) AGGREGATE KEY(k1, k2, k3) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + rollup1 (k1, k2), + rollup2 (k2) +) +PROPERTIES ( + "replication_num" = "1" +) ``` You can use the following command to add a column k4 to rollup1 and rollup2, and add an additional column k5 to rollup2: @@ -329,9 +405,27 @@ Additionally, it is not allowed to add columns to a rollup that already exist in ### Example 2 +table's DDL +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k3, k2, v1) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + ```sql ALTER TABLE example_db.my_table -ADD COLUMN v2 INT MAX DEFAULT "0" AFTER k2 TO example_rollup_index, +ADD COLUMN v2 INT MAX DEFAULT "0" TO example_rollup_index, ORDER BY (k3,k1,k2,v2,v1) FROM example_rollup_index; ``` @@ -350,7 +444,7 @@ The creation of a schema change is an asynchronous process. After a job is succe `SHOW ALTER TABLE COLUMN ` allows you to view the currently executing or completed schema Change jobs. When a schema change job involves multiple indexes, the command will display multiple rows, with each row corresponding to an index. For example: ```sql -mysql SHOW ALTER TABLE COLUMN\G; +mysql > SHOW ALTER TABLE COLUMN\G; *************************** 1. row *************************** JobId: 20021 TableName: tbl1 @@ -391,11 +485,11 @@ TransactionId: 10023 - PENDING: the job is waiting in the queue to be scheduled. - WAITING_TXN: waiting for import tasks before the boundary transaction ID to complete. - + - RUNNING: historical data transformation is in progress. - + - FINISHED: the job completed successfully. - + - CANCELLED: the job was cancelled. - Msg: if the job fails, this field displays the failure message. @@ -484,4 +578,4 @@ ADMIN SET FRONTEND CONFIG ("disable_balance" = "true"); ## More Details -For more detailed syntax and best practices regarding Schema Change, please refer to the [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) command manual. You can also enter `HELP ALTER TABLE COLUMN `in the MySQL client command line for more help information. \ No newline at end of file +For more detailed syntax and best practices regarding Schema Change, please refer to the [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) command manual. You can also enter `HELP ALTER TABLE COLUMN `in the MySQL client command line for more help information. diff --git a/versioned_docs/version-3.0/table-design/auto-increment.md b/versioned_docs/version-3.0/table-design/auto-increment.md index be2ea78d71c..1e1eae38451 100644 --- a/versioned_docs/version-3.0/table-design/auto-increment.md +++ b/versioned_docs/version-3.0/table-design/auto-increment.md @@ -456,7 +456,7 @@ select * from records_tbl order by `key`, `name` limit 100; Fetching the data for the second page can be accomplished by: ```sql -select * from records_tbl order by `key`, `name` limit 100, offset 100; +select * from records_tbl order by `key`, `name` limit 100 offset 100; ``` However, when performing deep pagination queries (with large offsets), even if the actual required data rows are few, this method still reads all data into memory for full sorting before subsequent processing, which is quite inefficient. Using an auto-incrementa column assigns a unique value to each row, allowing the use of where `unique_value` > x limit y to filter a significant amount of data beforehand, making pagination more efficient. diff --git a/versioned_docs/version-3.0/table-design/index/inverted-index.md b/versioned_docs/version-3.0/table-design/index/inverted-index.md index f54078ae846..e7f37511857 100644 --- a/versioned_docs/version-3.0/table-design/index/inverted-index.md +++ b/versioned_docs/version-3.0/table-design/index/inverted-index.md @@ -234,7 +234,7 @@ CANCEL BUILD INDEX ON table_name (job_id1, job_id2, ...); `BUILD INDEX` creates an asynchronous task executed by multiple threads on each BE. The number of threads can be set using the BE config `alter_index_worker_count`, with a default value of 3. -In versions before 2.0.12 and 2.1.4, `BUILD INDEX` would keep retrying until it succeeded. Starting from these versions, failure and timeout mechanisms prevent endless retries. +In versions before 2.0.12 and 2.1.4, `BUILD INDEX` would keep retrying until it succeeded. Starting from these versions, failure and timeout mechanisms prevent endless retries. 3.0 (Cloud Mode) does not support this command as this moment. 1. If the majority of replicas for a tablet fail to `BUILD INDEX`, the entire `BUILD INDEX` operation fails. 2. If the time exceeds `alter_table_timeout_second`, the `BUILD INDEX` operation times out. diff --git a/versioned_docs/version-3.0/table-design/index/ngram-bloomfilter-index.md b/versioned_docs/version-3.0/table-design/index/ngram-bloomfilter-index.md index c0a422a891c..71368ead8c0 100644 --- a/versioned_docs/version-3.0/table-design/index/ngram-bloomfilter-index.md +++ b/versioned_docs/version-3.0/table-design/index/ngram-bloomfilter-index.md @@ -142,6 +142,14 @@ curl --location-trusted -u root: -T amazon_reviews_2014.snappy.parquet -H "forma curl --location-trusted -u root: -T amazon_reviews_2015.snappy.parquet -H "format:parquet" http://127.0.0.1:8030/api/${DB}/amazon_reviews/_stream_load ``` +:::info +The data file may exceed 10 GB, and you may need to adjust the streaming_road_max_mb in be.conf to prevent exceeding the upload size limit of the stream load. You can dynamically adjust it by following the steps below: +```bash +curl -X POST http://{be_ip}:{be_http_port}/api/update_config?streaming_load_max_mb=32768 +``` +Every BE needs to execute the above command. +::: + **Run a count query to confirm successful data import:** ```sql @@ -208,4 +216,4 @@ ALTER TABLE amazon_reviews ADD INDEX review_body_ngram_idx(review_body) USING NG | B00KWVZ750 | Angry Birds Epic RPG | 5 | 6 | +------------+------------------------------------------+--------------------+-------+ 5 rows in set (0.93 sec) -``` \ No newline at end of file +``` diff --git a/versioned_docs/version-3.0/table-design/schema-change.md b/versioned_docs/version-3.0/table-design/schema-change.md index 608b2a5bbca..e5810e8a1a5 100644 --- a/versioned_docs/version-3.0/table-design/schema-change.md +++ b/versioned_docs/version-3.0/table-design/schema-change.md @@ -36,7 +36,7 @@ Users can modify the schema of an existing table through the Schema Change opera - Transaction: Each import task is a transaction, and each transaction has a unique increasing transaction ID. -## Introduction +## Introduction **Light Schema Change** @@ -97,7 +97,27 @@ ALTER TABLE table_name ADD COLUMN column_name column_type [KEY | agg_type] [DEFA ### Examples -**1. Adding a key column `new_col` (non-aggregate model) after col1 to `example_rollup_index`** +#### non-aggregate model + +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int, + col5 int +) DUPLICATE KEY(col1, col2, col3) +DISTRIBUTED BY RANDOM BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**1. Adding a key column `new_col` after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -105,7 +125,7 @@ ADD COLUMN new_col INT KEY DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**2. Adding a value column `new_col` (non-aggregate model) with a default value of 0 after col1 to `example_rollup_index`** +**2. Adding a value column `new_col` with a default value of 0 after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -113,7 +133,27 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**3. Adding a Key column `new_col` (aggregate model) after col1 to `example_rollup_index`** +#### aggregate model + +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col1 int, + col2 int, + col3 int, + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col1, col2, col3) +DISTRIBUTED BY HASH(col1) BUCKETS 1 +ROLLUP ( + example_rollup_index (col1, col3, col4, col5) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + +**3. Adding a Key column `new_col` after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -121,7 +161,7 @@ ADD COLUMN new_col INT DEFAULT "0" AFTER col1 TO example_rollup_index; ``` -**4. Adding a value column `new_col` with SUM aggregation type (aggregate model) after col1 to `example_rollup_index`** +**4. Adding a value column `new_col` with SUM aggregation type after col1 to `example_rollup_index`** ```sql ALTER TABLE example_db.my_table @@ -151,7 +191,7 @@ Adding multiple columns (aggregate model) to `example_rollup_index`: ```sql ALTER TABLE example_db.my_table -ADD COLUMN (col1 INT DEFAULT "1", col2 FLOAT SUM DEFAULT "2.3") +ADD COLUMN (c1 INT DEFAULT "1", c2 FLOAT SUM DEFAULT "0") TO example_rollup_index; ``` @@ -174,7 +214,7 @@ Removing column col2 from `example_rollup_index`: ```sql ALTER TABLE example_db.my_table -DROP COLUMN col2 +DROP COLUMN col3 FROM example_rollup_index; ``` @@ -221,6 +261,22 @@ ALTER TABLE table_name MODIFY COLUMN column_name column_type [KEY | agg_type] [N ### Examples +table's DDL: +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + col0 int, + col1 int DEFAULT "1", + col2 int, + col3 varchar(32), + col4 int SUM, + col5 varchar(32) REPLACE DEFAULT "abc" +) AGGREGATE KEY(col0, col1, col2, col3) +DISTRIBUTED BY HASH(col0) BUCKETS 1 +PROPERTIES ( + "replication_num" = "1" +) +``` + **1. Modifying the column type of Key column col1 to BIGINT in the base index and moving it after column col2** ```sql @@ -233,8 +289,8 @@ Note: whether modifying a key column or a value column, the complete column info **2. Modifying the maximum length of column val1 in the Base Index. The original val1 is (val1 VARCHAR(32) REPLACE DEFAULT "abc")** ```sql -ALTER TABLE example_db.my_table -MODIFY COLUMN val1 VARCHAR(64) REPLACE DEFAULT "abc" +ALTER TABLE example_db.my_table +MODIFY COLUMN col5 VARCHAR(64) REPLACE DEFAULT "abc"; ``` Note: only the column type can be modified while keeping the other properties of the column unchanged. @@ -242,7 +298,8 @@ Note: only the column type can be modified while keeping the other properties of **3. Modifying the length of a field in the key column of a duplicate key table** ```sql -alter table example_tbl modify column k3 varchar(50) key null comment 'to 50' +ALTER TABLE example_db.my_table +MODIFY COLUMN col3 varchar(50) KEY NULL comment 'to 50' ``` ## Reorder columns for a specified index @@ -261,6 +318,24 @@ ALTER TABLE table_name ORDER BY (column_name1, column_name2, ...) ### Example +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, + v2 int MAX, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k2, k3, v1, v2) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + Reorder columns in the index `example_rollup_index` (assuming the original column order is: k1, k2, k3, v1, v2). ```sql @@ -277,19 +352,20 @@ Schema change can modify multiple indexes in a single job. Source Schema: -```Plain -+-----------+-------+------+------+------+---------+-------+ -| IndexName | Field | Type | Null | Key | Default | Extra | -+-----------+-------+------+------+------+---------+-------+ -| tbl1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -| | k3 | INT | No | true | N/A | | -| | | | | | | | -| rollup2 | k2 | INT | No | true | N/A | | -| | | | | | | | -| rollup1 | k1 | INT | No | true | N/A | | -| | k2 | INT | No | true | N/A | | -+-----------+-------+------+------+------+---------+-------+ +```sql +CREATE TABLE IF NOT EXISTS example_db.tbl1( + k1 int, + k2 int, + k3 int +) AGGREGATE KEY(k1, k2, k3) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + rollup1 (k1, k2), + rollup2 (k2) +) +PROPERTIES ( + "replication_num" = "1" +) ``` You can use the following command to add a column k4 to rollup1 and rollup2, and add an additional column k5 to rollup2: @@ -329,9 +405,27 @@ Additionally, it is not allowed to add columns to a rollup that already exist in ### Example 2 +table's DDL +```sql +CREATE TABLE IF NOT EXISTS example_db.my_table( + k1 int DEFAULT "1", + k2 int, + k3 varchar(32), + k4 date, + v1 int SUM, +) AGGREGATE KEY(k1, k2, k3, k4) +DISTRIBUTED BY HASH(k1) BUCKETS 1 +ROLLUP ( + example_rollup_index(k1, k3, k2, v1) +) +PROPERTIES ( + "replication_num" = "1" +) +``` + ```sql ALTER TABLE example_db.my_table -ADD COLUMN v2 INT MAX DEFAULT "0" AFTER k2 TO example_rollup_index, +ADD COLUMN v2 INT MAX DEFAULT "0" TO example_rollup_index, ORDER BY (k3,k1,k2,v2,v1) FROM example_rollup_index; ``` @@ -350,7 +444,7 @@ The creation of a schema change is an asynchronous process. After a job is succe `SHOW ALTER TABLE COLUMN ` allows you to view the currently executing or completed schema Change jobs. When a schema change job involves multiple indexes, the command will display multiple rows, with each row corresponding to an index. For example: ```sql -mysql SHOW ALTER TABLE COLUMN\G; +mysql > SHOW ALTER TABLE COLUMN\G; *************************** 1. row *************************** JobId: 20021 TableName: tbl1 @@ -391,11 +485,11 @@ TransactionId: 10023 - PENDING: the job is waiting in the queue to be scheduled. - WAITING_TXN: waiting for import tasks before the boundary transaction ID to complete. - + - RUNNING: historical data transformation is in progress. - + - FINISHED: the job completed successfully. - + - CANCELLED: the job was cancelled. - Msg: if the job fails, this field displays the failure message. @@ -484,4 +578,4 @@ ADMIN SET FRONTEND CONFIG ("disable_balance" = "true"); ## More Details -For more detailed syntax and best practices regarding Schema Change, please refer to the [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) command manual. You can also enter `HELP ALTER TABLE COLUMN `in the MySQL client command line for more help information. \ No newline at end of file +For more detailed syntax and best practices regarding Schema Change, please refer to the [ALTER TABLE COLUMN](../sql-manual/sql-statements/Data-Definition-Statements/Alter/ALTER-TABLE-COLUMN) command manual. --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org For additional commands, e-mail: commits-h...@doris.apache.org