[GitHub] [flink] lirui-apache commented on a change in pull request #15060: [FLINK-19944][Connectors / Hive]Support sink parallelism configuratio…

GitBox Mon, 08 Mar 2021 23:43:49 -0800


lirui-apache commented on a change in pull request #15060:
URL: https://github.com/apache/flink/pull/15060#discussion_r590032900




##########
File path: docs/content/docs/connectors/table/hive/hive_read_write.md
##########
@@ -409,6 +409,28 @@ This configuration is set in the `TableConfig` and will affect all sinks of the
   </tbody>
 </table>
 
+### Sink Parallelism
+
+The parallelism of writing data into Hive can be configured by the corresponding table option. By default, the parallelism is configured to being the same as the parallelism of its last upstream chained operator. When the parallelism which is different from the parallelism of the upstream parallelism is configured, the writer operator will apply the parallelism.
+
+<table class="table table-bordered">
+  <thead>
+    <tr>
+        <th class="text-left" style="width: 20%">Key</th>
+        <th class="text-left" style="width: 15%">Default</th>
+        <th class="text-left" style="width: 10%">Type</th>
+        <th class="text-left" style="width: 55%">Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+        <td><h5>sink.parallelism</h5></td>

Review comment:
       The Hive batch sink also shares code with the FileSystem connector, e.g. it also writes data using a `FileSystemOutputFormat`.
   
   If `sink.parallelism` has the same meaning for both the FileSystem and Hive connectors, it's better to avoid duplicating the definition, so that there is only one place to change if we need to update the configuration in the future. That's also what we did in the [streaming write](https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/hive/hive_read_write.html#writing) section of the Hive doc.




