Hisoka-X commented on code in PR #8507:
URL: https://github.com/apache/seatunnel/pull/8507#discussion_r1954047394

##########
docs/en/connector-v2/source/Hive.md:
##########
@@ -102,6 +107,22 @@ The compress codec of files and the details that supported as the following show
 - orc/parquet: automatically recognizes the compression type, no additional settings required.

+### split_single_file_to_multiple_splits
+
+whether to split a file into many splits. true will split.
+
+### file_size_per_split
+
+split a file into many splits according to file size, if row_count_per_split not config. use row_count_per_split prefer. only valid for orc/parquet now.
+
+### row_count_per_split
+
+split a file into many splits according to row count. only valid for orc/parquet now.

Review Comment:
   ```suggestion
   Split a file into many splits according to row count. Only valid for orc/parquet now.
   ```

##########
docs/en/connector-v2/source/Hive.md:
##########
@@ -102,6 +107,22 @@ The compress codec of files and the details that supported as the following show
 - orc/parquet: automatically recognizes the compression type, no additional settings required.

+### split_single_file_to_multiple_splits
+
+whether to split a file into many splits. true will split.
+
+### file_size_per_split
+
+split a file into many splits according to file size, if row_count_per_split not config. use row_count_per_split prefer. only valid for orc/parquet now.
+
+### row_count_per_split
+
+split a file into many splits according to row count. only valid for orc/parquet now.
+
+### batch_read_rows
+
+max size in a batch. now only useful for orc file. default is 1024, if memory is enough, you can increase it to speed up reading.

Review Comment:
   ```suggestion
   The max size in a batch, now only useful for orc file. The default value is 1024, if memory is enough, you can increase it to speed up reading. Only worked when enable split_single_file_to_multiple_splits.
   ```

##########
docs/en/connector-v2/source/Hive.md:
##########
@@ -33,21 +33,26 @@ Read all the data in a split in a pollNext call. What splits are read will be sa
 ## Options

-| name                  | type   | required | default value  |
-|-----------------------|--------|----------|----------------|
-| table_name            | string | yes      | -              |
-| metastore_uri         | string | yes      | -              |
-| krb5_path             | string | no       | /etc/krb5.conf |
-| kerberos_principal    | string | no       | -              |
-| kerberos_keytab_path  | string | no       | -              |
-| hdfs_site_path        | string | no       | -              |
-| hive_site_path        | string | no       | -              |
-| hive.hadoop.conf      | Map    | no       | -              |
-| hive.hadoop.conf-path | string | no       | -              |
-| read_partitions       | list   | no       | -              |
-| read_columns          | list   | no       | -              |
-| compress_codec        | string | no       | none           |
-| common-options        |        | no       | -              |
+| name                                 | type    | required | default value  |
+|--------------------------------------|---------|----------|----------------|
+| table_name                           | string  | yes      | -              |
+| metastore_uri                        | string  | yes      | -              |
+| krb5_path                            | string  | no       | /etc/krb5.conf |
+| kerberos_principal                   | string  | no       | -              |
+| kerberos_keytab_path                 | string  | no       | -              |
+| hdfs_site_path                       | string  | no       | -              |
+| hive_site_path                       | string  | no       | -              |
+| hive.hadoop.conf                     | Map     | no       | -              |
+| hive.hadoop.conf-path                | string  | no       | -              |
+| read_partitions                      | list    | no       | -              |
+| read_columns                         | list    | no       | -              |
+| compress_codec                       | string  | no       | -              |
+| compress_codec                       | string  | no       | -              |
+| split_single_file_to_multiple_splits | long    | no       | false          |

Review Comment:
   ```suggestion
   | split_single_file_to_multiple_splits | boolean | no       | false          |
   ```

##########
docs/en/connector-v2/source/Hive.md:
##########
@@ -102,6 +107,22 @@ The compress codec of files and the details that supported as the following show
 - orc/parquet: automatically recognizes the compression type, no additional settings required.

+### split_single_file_to_multiple_splits
+
+whether to split a file into many splits. true will split.

Review Comment:
   ```suggestion
   Whether to split a file into many splits. If true will split. Only valid for orc/parquet now.
   ```

##########
docs/en/connector-v2/source/Hive.md:
##########
@@ -102,6 +107,22 @@ The compress codec of files and the details that supported as the following show
 - orc/parquet: automatically recognizes the compression type, no additional settings required.

+### split_single_file_to_multiple_splits
+
+whether to split a file into many splits. true will split.
+
+### file_size_per_split
+
+split a file into many splits according to file size, if row_count_per_split not config. use row_count_per_split prefer. only valid for orc/parquet now.

Review Comment:
   ```suggestion
   Split a file into many splits according to file size, if row_count_per_split not config, use row_count_per_split prefer. Only valid for orc/parquet now.
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org