pnowojski commented on code in PR #25235:
URL: https://github.com/apache/flink/pull/25235#discussion_r1732443462


##########
docs/content/docs/deployment/filesystems/s3.md:
##########
@@ -164,4 +164,38 @@ The `s3.entropy.key` defines the string in paths that is replaced by the random
 If a file system operation does not pass the *"inject entropy"* write option, the entropy key substring is simply removed.
 The `s3.entropy.length` defines the number of random alphanumeric characters used for entropy.
 
+## s5cmd
+
+Both `flink-s3-fs-hadoop` and `flink-s3-fs-presto` can be configured to use the [s5cmd tool](https://github.com/peak/s5cmd) for faster file upload and download.
+[Benchmark results](https://cwiki.apache.org/confluence/display/FLINK/FLIP-444%3A+Native+file+copy+support) are showing that `s5cmd` can be over 2 times more CPU efficient. 
+Which means either using half the CPU to upload or download the same set of files, or doing that twice as fast with the same amount of available CPU.
+
+In order to use this feature, the `s5cmd` binary has to be present and accessible to the Flink's task managers, for example via embedding it in the used docker image.
+Secondly the path to the `s5cmd` has to be configured via:
+```yaml
+s3.s5cmd.path: /path/to/the/s5cmd
+```
+
+The remaining configuration options (with their default value listed below) are:

Review Comment:
   I'm doing that in a section below. Or do you mean something else?


