This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 27790a5100c [opt](export) add parquet int96/int64 doc (#2466)
27790a5100c is described below

commit 27790a5100c31f40e642a8bd9fcc40c7fab31330
Author: Mingyu Chen (Rayner) <[email protected]>
AuthorDate: Tue Jun 17 10:37:09 2025 +0800

    [opt](export) add parquet int96/int64 doc (#2466)
    
    ## Versions
    
    - [x] dev
    - [x] 3.0
    - [x] 2.1
    - [ ] 2.0
    
    ## Languages
    
    - [x] Chinese
    - [x] English
    
    ## Docs Checklist
    
    - [ ] Checked by AI
    - [ ] Test Cases Built
---
 docs/data-operate/export/export-overview.md        |  84 +++++++++++------
 docs/lakehouse/catalogs/hive-catalog.md            |   3 +
 docs/lakehouse/catalogs/iceberg-catalog.md         |   6 ++
 .../current/data-operate/export/export-overview.md |  84 +++++++++++------
 .../current/lakehouse/catalogs/hive-catalog.md     |   2 +
 .../current/lakehouse/catalogs/iceberg-catalog.md  |   6 ++
 .../data-operate/export/export-overview.md         | 103 ++++++++++++---------
 .../data-operate/export/export-overview.md         |  84 +++++++++++------
 .../data-operate/export/export-overview.md         | 102 +++++++++++---------
 .../data-operate/export/export-overview.md         |  84 +++++++++++------
 10 files changed, 363 insertions(+), 195 deletions(-)

diff --git a/docs/data-operate/export/export-overview.md b/docs/data-operate/export/export-overview.md
index fa64345a0c1..8721a2737f0 100644
--- a/docs/data-operate/export/export-overview.md
+++ b/docs/data-operate/export/export-overview.md
@@ -84,29 +84,61 @@ Suitable for the following scenarios:
 Parquet and ORC file formats have their own data types. Doris's export 
function can automatically map Doris's data types to the corresponding data 
types in Parquet and ORC file formats. The CSV format does not have types, all 
data is output as text.
 
 The following table shows the mapping between Doris data types and Parquet, 
ORC file format data types:
-| Doris Type | Arrow Type | Orc Type |
-| ---------- | ---------- | -------- |
-| boolean    | boolean | boolean |
-| tinyint    | int8 | tinyint |
-| smallint   | int16 | smallint |
-| int        | int32 | int |
-| bigint     | int64 | bigint |
-| largeInt   | utf8 | string |
-| date       | utf8 | string |
-| datev2     | Date32Type | string |
-| datetime   | utf8 | string |
-| datetimev2 | TimestampType | timestamp |
-| float      | float32 | float |
-| double     | float64 | double |
-| char / varchar / string| utf8 | string |
-| decimal    | decimal128 | decimal |
-| struct     | struct | struct |
-| map        | map | map |
-| array      | list | array |
-| json       | utf8 | string |
-| variant    | utf8 | string |
-| bitmap     | binary | binary |
-| quantile_state| binary | binary |
-| hll        | binary | binary |
-
-> Note: When Doris exports data to the Parquet file format, it first converts 
the in-memory data of Doris into the Arrow in-memory data format, and then 
writes it out to the Parquet file format via Arrow. 
+
+- ORC
+
+    | Doris Type | Orc Type |
+    | ---------- | -------- |
+    | boolean    | boolean |
+    | tinyint    | tinyint |
+    | smallint   | smallint |
+    | int        | int |
+    | bigint     | bigint |
+    | largeInt   | string |
+    | date       | string |
+    | datev2     | string |
+    | datetime   | string |
+    | datetimev2 | timestamp |
+    | float      | float |
+    | double     | double |
+    | char / varchar / string| string |
+    | decimal    | decimal |
+    | struct     | struct |
+    | map        | map |
+    | array      | array |
+    | json       | string |
+    | variant    | string |
+    | bitmap     | binary |
+    | quantile_state| binary |
+    | hll        | binary |
+
+- Parquet
+
+    When Doris exports data to the Parquet file format, it first converts its in-memory data to the Arrow in-memory format, and Arrow then writes the data out as a Parquet file.
+
+    | Doris Type | Arrow Type | Parquet Physical Type | Parquet Logical Type |
+    | ---------- | ---------- | -------- | ------- |
+    | boolean    | boolean | BOOLEAN | |
+    | tinyint    | int8 | INT32 | INT_8 |
+    | smallint   | int16 | INT32 | INT_16 |
+    | int        | int32 | INT32 | INT_32 |
+    | bigint     | int64 | INT64 | INT_64 |
+    | largeInt   | utf8 | BYTE_ARRAY | UTF8 |
+    | date       | utf8 | BYTE_ARRAY | UTF8 |
+    | datev2     | date32 | INT32 | DATE |
+    | datetime   | utf8 | BYTE_ARRAY | UTF8 |
+    | datetimev2 | timestamp | INT96/INT64 | TIMESTAMP(MICROS/MILLIS/SECONDS) |
+    | float      | float32 | FLOAT | |
+    | double     | float64 | DOUBLE | |
+    | char / varchar / string| utf8 | BYTE_ARRAY | UTF8 |
+    | decimal    | decimal128 | FIXED_LEN_BYTE_ARRAY | DECIMAL(precision, scale) |
+    | struct     | struct |  | Parquet Group |
+    | map        | map | | Parquet Map |
+    | array      | list | | Parquet List |
+    | json       | utf8 | BYTE_ARRAY | UTF8 |
+    | variant    | utf8 | BYTE_ARRAY | UTF8 |
+    | bitmap     | binary | BYTE_ARRAY | |
+    | quantile_state| binary | BYTE_ARRAY | |
+    | hll        | binary | BYTE_ARRAY | |
+
+    > Note: In versions 2.1.11 and 3.0.7, you can set the `parquet.enable_int96_timestamps` property to choose whether Doris's datetimev2 type is stored as Parquet INT96 or INT64. INT96 is the default. However, INT96 is deprecated in the Parquet standard and is kept only for compatibility with some older systems (such as Hive versions before 4.0).
diff --git a/docs/lakehouse/catalogs/hive-catalog.md b/docs/lakehouse/catalogs/hive-catalog.md
index aaa6ae10cf7..327cc5415c2 100644
--- a/docs/lakehouse/catalogs/hive-catalog.md
+++ b/docs/lakehouse/catalogs/hive-catalog.md
@@ -510,6 +510,9 @@ For a Hive Database, you must first delete all tables under that Database before
 
   - ORC (default)
   - Parquet
+
+    Note that when the DATETIME type is written to a Parquet file, the 
physical type used is INT96 instead of INT64. This is to be compatible with the 
logic of Hive versions prior to 4.0.
+
   - Text (supported from versions 2.1.7 and 3.0.3)
 
       Text format supports the following table properties:
diff --git a/docs/lakehouse/catalogs/iceberg-catalog.md b/docs/lakehouse/catalogs/iceberg-catalog.md
index e22c229a74e..9a9a215a8fb 100644
--- a/docs/lakehouse/catalogs/iceberg-catalog.md
+++ b/docs/lakehouse/catalogs/iceberg-catalog.md
@@ -539,6 +539,12 @@ For an Iceberg Database, you must first drop all tables under the database befor
 
   * Parquet (default)
 
+    Note that for Iceberg tables created by Doris, the Datetime type corresponds to the `timestamp_ntz` type.
+
+    In versions after 2.1.11 and 3.0.7, when the Datetime type is written to 
the Parquet file, the physical type used is INT64 instead of INT96.
+    
+    If the Iceberg table was created by another system, both the `timestamp` and `timestamp_ntz` types are mapped to the Doris Datetime type. When writing, however, Doris decides whether the time zone needs to be handled based on the column's actual type.
+
   * ORC
 
 * **Compression Formats**
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/export/export-overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/export/export-overview.md
index d2eebc89845..364b3636e53 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/export/export-overview.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/export/export-overview.md
@@ -84,29 +84,61 @@ Apache Doris provides the following three data export methods:
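 Parquet and ORC file formats have their own data types. Apache Doris's export function automatically maps Apache Doris data types to the corresponding data types in the Parquet and ORC file formats. The CSV format has no types; all data is output as text.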
 
 The following table shows the mapping between Apache Doris data types and the data types of the Parquet and ORC file formats:
-| Doris Type | Arrow Type | Orc Type |
-| ---------- | ---------- | -------- |
-| boolean    | boolean | boolean |
-| tinyint    | int8 | tinyint |
-| smallint   | int16 | smallint |
-| int        | int32 | int |
-| bigint     | int64 | bigint |
-| largeInt   | utf8 | string |
-| date       | utf8 | string |
-| datev2     | Date32Type | string |
-| datetime   | utf8 | string |
-| datetimev2 | TimestampType | timestamp |
-| float      | float32 | float |
-| double     | float64 | double |
-| char / varchar / string| utf8 | string |
-| decimal    | decimal128 | decimal |
-| struct     | struct | struct |
-| map        | map | map |
-| array      | list | array |
-| json       | utf8 | string |
-| variant    | utf8 | string |
-| bitmap     | binary | binary |
-| quantile_state| binary | binary |
-| hll        | binary | binary |
-
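-> Note: When Doris exports data to the Parquet file format, it first converts Doris in-memory data to the Arrow in-memory data format, and Arrow then writes it out to the Parquet file format.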
\ No newline at end of file
+
+- ORC
+
+    | Doris Type | Orc Type |
+    | ---------- | -------- |
+    | boolean    | boolean |
+    | tinyint    | tinyint |
+    | smallint   | smallint |
+    | int        | int |
+    | bigint     | bigint |
+    | largeInt   | string |
+    | date       | string |
+    | datev2     | string |
+    | datetime   | string |
+    | datetimev2 | timestamp |
+    | float      | float |
+    | double     | double |
+    | char / varchar / string| string |
+    | decimal    | decimal |
+    | struct     | struct |
+    | map        | map |
+    | array      | array |
+    | json       | string |
+    | variant    | string |
+    | bitmap     | binary |
+    | quantile_state| binary |
+    | hll        | binary |
+
+- Parquet
+
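+    When Doris exports data to the Parquet file format, it first converts Doris in-memory data to the Arrow in-memory data format, and Arrow then writes it out to the Parquet file format.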
+
+    | Doris Type | Arrow Type | Parquet Physical Type | Parquet Logical Type |
+    | ---------- | ---------- | -------- | ------- |
+    | boolean    | boolean | BOOLEAN | |
+    | tinyint    | int8 | INT32 | INT_8 |
+    | smallint   | int16 | INT32 | INT_16 |
+    | int        | int32 | INT32 | INT_32 |
+    | bigint     | int64 | INT64 | INT_64 |
+    | largeInt   | utf8 | BYTE_ARRAY | UTF8 |
+    | date       | utf8 | BYTE_ARRAY | UTF8 |
+    | datev2     | date32 | INT32 | DATE |
+    | datetime   | utf8 | BYTE_ARRAY | UTF8 |
+    | datetimev2 | timestamp | INT96/INT64 | TIMESTAMP(MICROS/MILLIS/SECONDS) |
+    | float      | float32 | FLOAT | |
+    | double     | float64 | DOUBLE | |
+    | char / varchar / string| utf8 | BYTE_ARRAY | UTF8 |
+    | decimal    | decimal128 | FIXED_LEN_BYTE_ARRAY | DECIMAL(precision, scale) |
+    | struct     | struct |  | Parquet Group |
+    | map        | map | | Parquet Map |
+    | array      | list | | Parquet List |
+    | json       | utf8 | BYTE_ARRAY | UTF8 |
+    | variant    | utf8 | BYTE_ARRAY | UTF8 |
+    | bitmap     | binary | BYTE_ARRAY | |
+    | quantile_state| binary | BYTE_ARRAY | |
+    | hll        | binary | BYTE_ARRAY | |
+
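+    > Note: In versions 2.1.11 and 3.0.7, you can set the `parquet.enable_int96_timestamps` property to choose whether Doris's datetimev2 type is stored as Parquet INT96 or INT64. INT96 is the default. However, INT96 is deprecated in the Parquet standard and is kept only for compatibility with some older systems (such as Hive versions before 4.0).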
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/hive-catalog.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/hive-catalog.md
index 723bbdea8c9..1327b27781c 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/hive-catalog.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/hive-catalog.md
@@ -520,6 +520,8 @@ DROP DATABASE [IF EXISTS] hive_ctl.hive_db;
 
   * Parquet
 
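+    Note that when the DATETIME type is written to a Parquet file, the physical type used is INT96 rather than INT64. This is for compatibility with the logic of Hive versions prior to 4.0.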
+
   * Text (supported since versions 2.1.7 and 3.0.3)
 
   * The Text format also supports the following table properties:
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.md
index 9d9304bd910..ec7ce6676a8 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/catalogs/iceberg-catalog.md
@@ -549,6 +549,12 @@ DROP DATABASE [IF EXISTS] iceberg.iceberg_db;
 
   * Parquet (default)
 
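+    Note that for Iceberg tables created by Doris, the Datetime type corresponds to the `timestamp_ntz` type.
+
+    In versions after 2.1.11 and 3.0.7, when the Datetime type is written to a Parquet file, the physical type used is INT64 rather than INT96.
+
+    In addition, if the Iceberg table was created by another system, both the `timestamp` and `timestamp_ntz` types are mapped to the Doris Datetime type. When writing, however, Doris decides whether the time zone needs to be handled based on the column's actual type.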
+
   * ORC
 
 * **Compression Formats**
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/export/export-overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/export/export-overview.md
index 76eacceaeeb..364b3636e53 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/export/export-overview.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-2.1/data-operate/export/export-overview.md
@@ -85,49 +85,60 @@ Parquet and ORC file formats have their own data types. Apache Doris's export
 
 The following table shows the mapping between Apache Doris data types and the data types of the Parquet and ORC file formats:
 
-1. Data type mapping table for Doris export to the Orc file format:
-
-    |Doris Type|Orc Type|
-    | -------- | ------- |
-    |boolean|boolean|
-    |tinyint|tinyint|
-    |smallint|smallint|
-    |int|int|
-    |bigint|bigint|
-    |largeInt|string|
-    |date|string|
-    |datev2|string|
-    |datetime|string|
-    |datetimev2|timestamp|
-    |float|float|
-    |double|double|
-    |char / varchar / string|string|
-    |decimal|decimal|
-    |struct|struct|
-    |map|map|
-    |array|array|
-    |json|Not supported|
-
-
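-2. When Apache Doris exports data to the Parquet file format, it first converts Apache Doris in-memory data to the Arrow in-memory data format, and Arrow then writes it out to the Parquet file format. The mapping between Apache Doris data types and Arrow data types is: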
-
-    |Doris Type|Arrow Type|
-    | ----- | ----- |
-    |boolean|boolean|
-    |tinyint|int8|
-    |smallint|int16|
-    |int|int32|
-    |bigint|int64|
-    |largeInt|utf8|
-    |date|utf8|
-    |datev2|Date32Type|
-    |datetime|utf8|
-    |datetimev2|TimestampType|
-    |float|float32|
-    |double|float64|
-    |char / varchar / string|utf8|
-    |decimal|decimal128|
-    |struct|struct|
-    |map|map|
-    |array|list|
-    |json|utf8|
+- ORC
+
+    | Doris Type | Orc Type |
+    | ---------- | -------- |
+    | boolean    | boolean |
+    | tinyint    | tinyint |
+    | smallint   | smallint |
+    | int        | int |
+    | bigint     | bigint |
+    | largeInt   | string |
+    | date       | string |
+    | datev2     | string |
+    | datetime   | string |
+    | datetimev2 | timestamp |
+    | float      | float |
+    | double     | double |
+    | char / varchar / string| string |
+    | decimal    | decimal |
+    | struct     | struct |
+    | map        | map |
+    | array      | array |
+    | json       | string |
+    | variant    | string |
+    | bitmap     | binary |
+    | quantile_state| binary |
+    | hll        | binary |
+
+- Parquet
+
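+    When Doris exports data to the Parquet file format, it first converts Doris in-memory data to the Arrow in-memory data format, and Arrow then writes it out to the Parquet file format.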
+
+    | Doris Type | Arrow Type | Parquet Physical Type | Parquet Logical Type |
+    | ---------- | ---------- | -------- | ------- |
+    | boolean    | boolean | BOOLEAN | |
+    | tinyint    | int8 | INT32 | INT_8 |
+    | smallint   | int16 | INT32 | INT_16 |
+    | int        | int32 | INT32 | INT_32 |
+    | bigint     | int64 | INT64 | INT_64 |
+    | largeInt   | utf8 | BYTE_ARRAY | UTF8 |
+    | date       | utf8 | BYTE_ARRAY | UTF8 |
+    | datev2     | date32 | INT32 | DATE |
+    | datetime   | utf8 | BYTE_ARRAY | UTF8 |
+    | datetimev2 | timestamp | INT96/INT64 | TIMESTAMP(MICROS/MILLIS/SECONDS) |
+    | float      | float32 | FLOAT | |
+    | double     | float64 | DOUBLE | |
+    | char / varchar / string| utf8 | BYTE_ARRAY | UTF8 |
+    | decimal    | decimal128 | FIXED_LEN_BYTE_ARRAY | DECIMAL(precision, scale) |
+    | struct     | struct |  | Parquet Group |
+    | map        | map | | Parquet Map |
+    | array      | list | | Parquet List |
+    | json       | utf8 | BYTE_ARRAY | UTF8 |
+    | variant    | utf8 | BYTE_ARRAY | UTF8 |
+    | bitmap     | binary | BYTE_ARRAY | |
+    | quantile_state| binary | BYTE_ARRAY | |
+    | hll        | binary | BYTE_ARRAY | |
+
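+    > Note: In versions 2.1.11 and 3.0.7, you can set the `parquet.enable_int96_timestamps` property to choose whether Doris's datetimev2 type is stored as Parquet INT96 or INT64. INT96 is the default. However, INT96 is deprecated in the Parquet standard and is kept only for compatibility with some older systems (such as Hive versions before 4.0).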
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/export/export-overview.md b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/export/export-overview.md
index bb3e11ec62a..364b3636e53 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/export/export-overview.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/version-3.0/data-operate/export/export-overview.md
@@ -84,29 +84,61 @@ Apache Doris provides the following three data export methods:
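 Parquet and ORC file formats have their own data types. Apache Doris's export function automatically maps Apache Doris data types to the corresponding data types in the Parquet and ORC file formats. The CSV format has no types; all data is output as text.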
 
 The following table shows the mapping between Apache Doris data types and the data types of the Parquet and ORC file formats:
-| Doris Type | Arrow Type | Orc Type |
-| ---------- | ---------- | -------- |
-| boolean    | boolean | boolean |
-| tinyint    | int8 | tinyint |
-| smallint   | int16 | smallint |
-| int        | int32 | int |
-| bigint     | int64 | bigint |
-| largeInt   | utf8 | string |
-| date       | utf8 | string |
-| datev2     | Date32Type | string |
-| datetime   | utf8 | string |
-| datetimev2 | TimestampType | timestamp |
-| float      | float32 | float |
-| double     | float64 | double |
-| char / varchar / string| utf8 | string |
-| decimal    | decimal128 | decimal |
-| struct     | struct | struct |
-| map        | map | map |
-| array      | list | array |
-| json       | utf8 | string |
-| variant    | utf8 | string |
-| bitmap     | binary | binary |
-| quantile_state| binary | binary |
-| hll        | binary | binary |
-
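-> Note: When Doris exports data to the Parquet file format, it first converts Doris in-memory data to the Arrow in-memory data format, and Arrow then writes it out to the Parquet file format.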
+
+- ORC
+
+    | Doris Type | Orc Type |
+    | ---------- | -------- |
+    | boolean    | boolean |
+    | tinyint    | tinyint |
+    | smallint   | smallint |
+    | int        | int |
+    | bigint     | bigint |
+    | largeInt   | string |
+    | date       | string |
+    | datev2     | string |
+    | datetime   | string |
+    | datetimev2 | timestamp |
+    | float      | float |
+    | double     | double |
+    | char / varchar / string| string |
+    | decimal    | decimal |
+    | struct     | struct |
+    | map        | map |
+    | array      | array |
+    | json       | string |
+    | variant    | string |
+    | bitmap     | binary |
+    | quantile_state| binary |
+    | hll        | binary |
+
+- Parquet
+
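+    When Doris exports data to the Parquet file format, it first converts Doris in-memory data to the Arrow in-memory data format, and Arrow then writes it out to the Parquet file format.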
+
+    | Doris Type | Arrow Type | Parquet Physical Type | Parquet Logical Type |
+    | ---------- | ---------- | -------- | ------- |
+    | boolean    | boolean | BOOLEAN | |
+    | tinyint    | int8 | INT32 | INT_8 |
+    | smallint   | int16 | INT32 | INT_16 |
+    | int        | int32 | INT32 | INT_32 |
+    | bigint     | int64 | INT64 | INT_64 |
+    | largeInt   | utf8 | BYTE_ARRAY | UTF8 |
+    | date       | utf8 | BYTE_ARRAY | UTF8 |
+    | datev2     | date32 | INT32 | DATE |
+    | datetime   | utf8 | BYTE_ARRAY | UTF8 |
+    | datetimev2 | timestamp | INT96/INT64 | TIMESTAMP(MICROS/MILLIS/SECONDS) |
+    | float      | float32 | FLOAT | |
+    | double     | float64 | DOUBLE | |
+    | char / varchar / string| utf8 | BYTE_ARRAY | UTF8 |
+    | decimal    | decimal128 | FIXED_LEN_BYTE_ARRAY | DECIMAL(precision, scale) |
+    | struct     | struct |  | Parquet Group |
+    | map        | map | | Parquet Map |
+    | array      | list | | Parquet List |
+    | json       | utf8 | BYTE_ARRAY | UTF8 |
+    | variant    | utf8 | BYTE_ARRAY | UTF8 |
+    | bitmap     | binary | BYTE_ARRAY | |
+    | quantile_state| binary | BYTE_ARRAY | |
+    | hll        | binary | BYTE_ARRAY | |
+
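+    > Note: In versions 2.1.11 and 3.0.7, you can set the `parquet.enable_int96_timestamps` property to choose whether Doris's datetimev2 type is stored as Parquet INT96 or INT64. INT96 is the default. However, INT96 is deprecated in the Parquet standard and is kept only for compatibility with some older systems (such as Hive versions before 4.0).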
diff --git a/versioned_docs/version-2.1/data-operate/export/export-overview.md b/versioned_docs/version-2.1/data-operate/export/export-overview.md
index 8f52ff8375c..8721a2737f0 100644
--- a/versioned_docs/version-2.1/data-operate/export/export-overview.md
+++ b/versioned_docs/version-2.1/data-operate/export/export-overview.md
@@ -85,48 +85,60 @@ Parquet and ORC file formats have their own data types. Doris's export function
 
 The following table shows the mapping between Doris data types and Parquet, 
ORC file format data types:
 
-1. Doris export to ORC file format data type mapping table:
-   
-    |Doris Type|Orc Type|
-    | -------- | ------- |
-    |boolean|boolean|
-    |tinyint|tinyint|
-    |smallint|smallint|
-    |int|int|
-    |bigint|bigint|
-    |largeInt|string|
-    |date|string|
-    |datev2|string|
-    |datetime|string|
-    |datetimev2|timestamp|
-    |float|float|
-    |double|double|
-    |char / varchar / string|string|
-    |decimal|decimal|
-    |struct|struct|
-    |map|map|
-    |array|array|
-    |json| Not supported|
-
-2. When Doris exports to Parquet file format, it first converts Doris 
in-memory data to Arrow in-memory data format, then writes out to Parquet file 
format. The mapping relationship between Doris data types and Arrow data types 
is:
-
-    | Doris Type | Arrow Type |
-    | ----- | ----- |
-    | boolean | boolean |
-    | tinyint | int8 |
-    | smallint | int16 |
-    | int | int32 |
-    | bigint | int64 |
-    | largeInt | utf8 |
-    | date | utf8 |
-    | datev2 | Date32Type |
-    | datetime | utf8 |
-    | datetimev2 | TimestampType |
-    | float | float32 |
-    | double | float64 |
-    | char / varchar / string | utf8 |
-    | decimal | decimal128 |
-    | struct | struct |
-    | map | map |
-    | array | list |
-    |json| utf8 |
+- ORC
+
+    | Doris Type | Orc Type |
+    | ---------- | -------- |
+    | boolean    | boolean |
+    | tinyint    | tinyint |
+    | smallint   | smallint |
+    | int        | int |
+    | bigint     | bigint |
+    | largeInt   | string |
+    | date       | string |
+    | datev2     | string |
+    | datetime   | string |
+    | datetimev2 | timestamp |
+    | float      | float |
+    | double     | double |
+    | char / varchar / string| string |
+    | decimal    | decimal |
+    | struct     | struct |
+    | map        | map |
+    | array      | array |
+    | json       | string |
+    | variant    | string |
+    | bitmap     | binary |
+    | quantile_state| binary |
+    | hll        | binary |
+
+- Parquet
+
+    When Doris exports data to the Parquet file format, it first converts its in-memory data to the Arrow in-memory format, and Arrow then writes the data out as a Parquet file.
+
+    | Doris Type | Arrow Type | Parquet Physical Type | Parquet Logical Type |
+    | ---------- | ---------- | -------- | ------- |
+    | boolean    | boolean | BOOLEAN | |
+    | tinyint    | int8 | INT32 | INT_8 |
+    | smallint   | int16 | INT32 | INT_16 |
+    | int        | int32 | INT32 | INT_32 |
+    | bigint     | int64 | INT64 | INT_64 |
+    | largeInt   | utf8 | BYTE_ARRAY | UTF8 |
+    | date       | utf8 | BYTE_ARRAY | UTF8 |
+    | datev2     | date32 | INT32 | DATE |
+    | datetime   | utf8 | BYTE_ARRAY | UTF8 |
+    | datetimev2 | timestamp | INT96/INT64 | TIMESTAMP(MICROS/MILLIS/SECONDS) |
+    | float      | float32 | FLOAT | |
+    | double     | float64 | DOUBLE | |
+    | char / varchar / string| utf8 | BYTE_ARRAY | UTF8 |
+    | decimal    | decimal128 | FIXED_LEN_BYTE_ARRAY | DECIMAL(precision, scale) |
+    | struct     | struct |  | Parquet Group |
+    | map        | map | | Parquet Map |
+    | array      | list | | Parquet List |
+    | json       | utf8 | BYTE_ARRAY | UTF8 |
+    | variant    | utf8 | BYTE_ARRAY | UTF8 |
+    | bitmap     | binary | BYTE_ARRAY | |
+    | quantile_state| binary | BYTE_ARRAY | |
+    | hll        | binary | BYTE_ARRAY | |
+
+    > Note: In versions 2.1.11 and 3.0.7, you can set the `parquet.enable_int96_timestamps` property to choose whether Doris's datetimev2 type is stored as Parquet INT96 or INT64. INT96 is the default. However, INT96 is deprecated in the Parquet standard and is kept only for compatibility with some older systems (such as Hive versions before 4.0).
diff --git a/versioned_docs/version-3.0/data-operate/export/export-overview.md b/versioned_docs/version-3.0/data-operate/export/export-overview.md
index fa64345a0c1..8721a2737f0 100644
--- a/versioned_docs/version-3.0/data-operate/export/export-overview.md
+++ b/versioned_docs/version-3.0/data-operate/export/export-overview.md
@@ -84,29 +84,61 @@ Suitable for the following scenarios:
 Parquet and ORC file formats have their own data types. Doris's export 
function can automatically map Doris's data types to the corresponding data 
types in Parquet and ORC file formats. The CSV format does not have types, all 
data is output as text.
 
 The following table shows the mapping between Doris data types and Parquet, 
ORC file format data types:
-| Doris Type | Arrow Type | Orc Type |
-| ---------- | ---------- | -------- |
-| boolean    | boolean | boolean |
-| tinyint    | int8 | tinyint |
-| smallint   | int16 | smallint |
-| int        | int32 | int |
-| bigint     | int64 | bigint |
-| largeInt   | utf8 | string |
-| date       | utf8 | string |
-| datev2     | Date32Type | string |
-| datetime   | utf8 | string |
-| datetimev2 | TimestampType | timestamp |
-| float      | float32 | float |
-| double     | float64 | double |
-| char / varchar / string| utf8 | string |
-| decimal    | decimal128 | decimal |
-| struct     | struct | struct |
-| map        | map | map |
-| array      | list | array |
-| json       | utf8 | string |
-| variant    | utf8 | string |
-| bitmap     | binary | binary |
-| quantile_state| binary | binary |
-| hll        | binary | binary |
-
-> Note: When Doris exports data to the Parquet file format, it first converts 
the in-memory data of Doris into the Arrow in-memory data format, and then 
writes it out to the Parquet file format via Arrow. 
+
+- ORC
+
+    | Doris Type | Orc Type |
+    | ---------- | -------- |
+    | boolean    | boolean |
+    | tinyint    | tinyint |
+    | smallint   | smallint |
+    | int        | int |
+    | bigint     | bigint |
+    | largeInt   | string |
+    | date       | string |
+    | datev2     | string |
+    | datetime   | string |
+    | datetimev2 | timestamp |
+    | float      | float |
+    | double     | double |
+    | char / varchar / string| string |
+    | decimal    | decimal |
+    | struct     | struct |
+    | map        | map |
+    | array      | array |
+    | json       | string |
+    | variant    | string |
+    | bitmap     | binary |
+    | quantile_state| binary |
+    | hll        | binary |
+
+- Parquet
+
+    When Doris exports data to the Parquet file format, it first converts its in-memory data to the Arrow in-memory format, and Arrow then writes the data out as a Parquet file.
+
+    | Doris Type | Arrow Type | Parquet Physical Type | Parquet Logical Type |
+    | ---------- | ---------- | -------- | ------- |
+    | boolean    | boolean | BOOLEAN | |
+    | tinyint    | int8 | INT32 | INT_8 |
+    | smallint   | int16 | INT32 | INT_16 |
+    | int        | int32 | INT32 | INT_32 |
+    | bigint     | int64 | INT64 | INT_64 |
+    | largeInt   | utf8 | BYTE_ARRAY | UTF8 |
+    | date       | utf8 | BYTE_ARRAY | UTF8 |
+    | datev2     | date32 | INT32 | DATE |
+    | datetime   | utf8 | BYTE_ARRAY | UTF8 |
+    | datetimev2 | timestamp | INT96/INT64 | TIMESTAMP(MICROS/MILLIS/SECONDS) |
+    | float      | float32 | FLOAT | |
+    | double     | float64 | DOUBLE | |
+    | char / varchar / string| utf8 | BYTE_ARRAY | UTF8 |
+    | decimal    | decimal128 | FIXED_LEN_BYTE_ARRAY | DECIMAL(precision, scale) |
+    | struct     | struct |  | Parquet Group |
+    | map        | map | | Parquet Map |
+    | array      | list | | Parquet List |
+    | json       | utf8 | BYTE_ARRAY | UTF8 |
+    | variant    | utf8 | BYTE_ARRAY | UTF8 |
+    | bitmap     | binary | BYTE_ARRAY | |
+    | quantile_state| binary | BYTE_ARRAY | |
+    | hll        | binary | BYTE_ARRAY | |
+
+    > Note: In versions 2.1.11 and 3.0.7, you can set the `parquet.enable_int96_timestamps` property to choose whether Doris's datetimev2 type is stored as Parquet INT96 or INT64. INT96 is the default. However, INT96 is deprecated in the Parquet standard and is kept only for compatibility with some older systems (such as Hive versions before 4.0).
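
For readers unfamiliar with the two physical layouts named in the note above, the following is a minimal sketch of how one timestamp value is laid out in each. It is not Doris or Arrow code. It assumes the commonly used legacy INT96 convention (8 bytes of nanoseconds within the day followed by a 4-byte Julian day number, little-endian) and treats the value as UTC. The helper names `to_int64_micros`, `to_int96`, and `from_int96` are hypothetical and exist only for this illustration.

```python
import struct
from datetime import datetime, timedelta, timezone

# Julian day number of 1970-01-01, used by the legacy INT96 layout.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_int64_micros(dt: datetime) -> int:
    """INT64 with TIMESTAMP(MICROS): microseconds since the Unix epoch."""
    delta = dt - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def to_int96(dt: datetime) -> bytes:
    """Legacy INT96: nanoseconds-of-day (8 bytes) then Julian day (4 bytes)."""
    delta = dt - EPOCH
    nanos_of_day = (delta.seconds * 1_000_000 + delta.microseconds) * 1_000
    julian_day = delta.days + JULIAN_DAY_OF_UNIX_EPOCH
    # "<" means little-endian with no padding, so the result is exactly 12 bytes.
    return struct.pack("<qi", nanos_of_day, julian_day)


def from_int96(raw: bytes) -> datetime:
    """Decode the 12-byte INT96 layout back to a UTC datetime (microsecond precision)."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    return EPOCH + timedelta(days=days, microseconds=nanos_of_day // 1_000)


if __name__ == "__main__":
    ts = datetime(2025, 6, 17, 10, 37, 9, tzinfo=timezone.utc)
    print("INT64 micros:", to_int64_micros(ts))
    print("INT96 bytes :", to_int96(ts).hex())
    print("round trip  :", from_int96(to_int96(ts)))
```

The INT64 form is a single integer whose unit is carried by the logical type annotation, while the INT96 form has no logical type and depends on readers knowing the convention, which is one reason it is kept only for compatibility.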

