lirui-apache commented on a change in pull request #15060:
URL: https://github.com/apache/flink/pull/15060#discussion_r590032900
##########
File path: docs/content/docs/connectors/table/hive/hive_read_write.md
##########
@@ -409,6 +409,28 @@ This configuration is set in the `TableConfig` and will affect all sinks of the
   </tbody>
 </table>
+### Sink Parallelism
+
+The parallelism of writing data into Hive can be configured by the corresponding table option. By default, the parallelism is configured to being the same as the parallelism of its last upstream chained operator. When the parallelism which is different from the parallelism of the upstream parallelism is configured, the writer operator will apply the parallelism.
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+        <th class="text-left" style="width: 20%">Key</th>
+        <th class="text-left" style="width: 15%">Default</th>
+        <th class="text-left" style="width: 10%">Type</th>
+        <th class="text-left" style="width: 55%">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+        <td><h5>sink.parallelism</h5></td>

Review comment:
The Hive batch sink also shares code with the FileSystem connector; for example, it also writes data using a `FileSystemOutputFormat`. If `sink.parallelism` has the same meaning for both the FileSystem and Hive connectors, it's better to avoid duplicating the definition, which makes things easier if we need to update the configuration in the future. That's also what we did in the [streaming write](https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/hive/hive_read_write.html#writing) section of the Hive doc.