This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/master by this push:
     new 6f3fe880d2 [HUDI-3905] Add S3 related setup in Kafka Connect quick start (#5356)
6f3fe880d2 is described below

commit 6f3fe880d2008b760bf96e30612cc5ea9a8c2d95
Author: Y Ethan Guo <ethan.guoyi...@gmail.com>
AuthorDate: Tue Apr 19 15:08:28 2022 -0700

    [HUDI-3905] Add S3 related setup in Kafka Connect quick start (#5356)
---
 hudi-kafka-connect/README.md | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/hudi-kafka-connect/README.md b/hudi-kafka-connect/README.md
index a5784139bc..f80e2c9fe1 100644
--- a/hudi-kafka-connect/README.md
+++ b/hudi-kafka-connect/README.md
@@ -48,8 +48,8 @@ $CONFLUENT_DIR/bin/confluent-hub install confluentinc/kafka-connect-hdfs:10.1.0
 cp -r $CONFLUENT_DIR/share/confluent-hub-components/confluentinc-kafka-connect-hdfs/* /usr/local/share/kafka/plugins/
 ```
 
-Now, build the packaged jar that contains all the hudi classes, including the Hudi Kafka Connector. And copy it
-to the plugin path that contains all the other jars (`/usr/local/share/kafka/plugins/lib`)
+Now, build the packaged jar that contains all the hudi classes, including the Hudi Kafka Connector. And copy it to the
+plugin path that contains all the other jars (`/usr/local/share/kafka/plugins/lib`)
 
 ```bash
 cd $HUDI_DIR
@@ -58,8 +58,20 @@ mkdir -p /usr/local/share/kafka/plugins/lib
 cp $HUDI_DIR/packaging/hudi-kafka-connect-bundle/target/hudi-kafka-connect-bundle-0.11.0-SNAPSHOT.jar /usr/local/share/kafka/plugins/lib
 ```
 
-Set up a Kafka broker locally. Download the latest apache kafka from [here](https://kafka.apache.org/downloads).
-Once downloaded and built, run the Zookeeper server and Kafka server using the command line tools.
+If the Hudi Sink Connector writes to a target Hudi table on [Amazon S3](https://aws.amazon.com/s3/), you need two
+additional jars, `hadoop-aws-2.10.1.jar` and `aws-java-sdk-bundle-1.11.271.jar`, in the `plugins/lib` folder. You may
+download them using the following commands. Note that when you specify the target table path on S3, you need to use the
+`s3a://` prefix.
+
+```bash
+cd /usr/local/share/kafka/plugins/lib
+wget https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk-bundle/1.11.271/aws-java-sdk-bundle-1.11.271.jar
+wget https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.10.1/hadoop-aws-2.10.1.jar
+```
+
+Set up a Kafka broker locally. Download the latest Apache Kafka from [here](https://kafka.apache.org/downloads). Once
+downloaded and built, run the ZooKeeper server and Kafka server using the command line tools.
+
 ```bash
 export KAFKA_HOME=/path/to/kafka_install_dir
 cd $KAFKA_HOME
@@ -67,6 +79,7 @@ cd $KAFKA_HOME
 ./bin/zookeeper-server-start.sh ./config/zookeeper.properties
 ./bin/kafka-server-start.sh ./config/server.properties
 ```
+
 Wait until the kafka cluster is up and running.
 
 ### 2 - Set up the schema registry
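To make the `s3a://` note in the diff above concrete, here is a minimal sketch of registering the sink against an S3 target path through the Kafka Connect REST API (assuming a Connect worker on the default port 8083). The connector class and `hoodie.*` property names are assumed to follow the demo sink config shipped under `hudi-kafka-connect/demo/`; the bucket, topic, and table name are placeholders, not values from this commit.

```bash
# Hypothetical sink registration: the s3a:// base path is the point of interest.
# Bucket, topic, and table name below are placeholders -- adjust to your setup.
curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "hudi-s3-sink",
  "config": {
    "connector.class": "org.apache.hudi.connect.HoodieSinkConnector",
    "tasks.max": "1",
    "topics": "hudi-test-topic",
    "hoodie.table.name": "hudi-test-topic",
    "hoodie.base.path": "s3a://my-bucket/hudi/hudi-test-topic",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "org.apache.kafka.connect.storage.StringConverter"
  }
}'
```

With the two jars above on the plugin path, the S3A filesystem resolves AWS credentials through its standard provider chain, so exporting `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` in the Connect worker's environment is one straightforward way to authenticate.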