umehrot2 commented on a change in pull request #4087:
URL: https://github.com/apache/hudi/pull/4087#discussion_r755589562



##########
File path: website/docs/concurrency_control.md
##########
@@ -69,6 +69,17 @@ hoodie.write.lock.hivemetastore.table
 
 `The HiveMetastore URI's are picked up from the hadoop configuration file 
loaded during runtime.`
 
+**`AWS DynamoDB`** based lock provider

Review comment:
       `AWS DynamoDB` => `Amazon DynamoDB`

##########
File path: website/docs/configurations.md
##########
@@ -15,6 +15,20 @@ This page covers the different ways of configuring your job 
to write/read Hudi t
 - [**Metrics Configs**](#METRICS): These set of configs are used to enable 
monitoring and reporting of keyHudi stats and metrics.
 - [**Record Payload Config**](#RECORD_PAYLOAD): This is the lowest level of 
customization offered by Hudi. Record payloads define how to produce new values 
to upsert based on incoming new record and stored old record. Hudi provides 
default implementations such as OverwriteWithLatestAvroPayload which simply 
update table with the latest/last-written record. This can be overridden to a 
custom class extending HoodieRecordPayload class, on both datasource and 
WriteClient levels.
 
+---

Review comment:
       Why not add this as a separate bullet point? We could call it `Environment Configs`.
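
       For illustration, such a bullet could follow the pattern of the existing entries on that page. This is only a sketch of possible wording, and the `#ENVIRONMENT` anchor is a placeholder, not an existing anchor:

```
- [**Environment Configs**](#ENVIRONMENT): These configs are supplied once per cluster through an external `hudi-default.conf` file and shared by all Hudi jobs running there, rather than being passed to each job individually.
```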

##########
File path: website/docs/concurrency_control.md
##########
@@ -69,6 +69,17 @@ hoodie.write.lock.hivemetastore.table
 
 `The HiveMetastore URI's are picked up from the hadoop configuration file 
loaded during runtime.`
 
+**`AWS DynamoDB`** based lock provider
+
+AWS DynamoDB based lock provides a simple way to support multi writing across 
different clusters
+
+```
+hoodie.write.lock.provider=org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider
+hoodie.write.lock.dynamodb.table
+hoodie.write.lock.dynamodb.partition_key
+hoodie.write.lock.dynamodb.region
+```

Review comment:
       In addition to these, we should mention how AWS credentials can be configured to talk to DynamoDB. Mention the specific configurations here, and note that if none are configured, it falls back to the DefaultAWSCredentialsProviderChain: 
https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html
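
       For concreteness, a minimal sketch of what that could look like. The `hoodie.aws.*` keys below are what I believe the `hudi-aws` module exposes, so please verify them against the code before documenting:

```
# Sketch only -- verify these property names against the hudi-aws module.
hoodie.write.lock.provider=org.apache.hudi.aws.transaction.lock.DynamoDBBasedLockProvider
hoodie.aws.access.key=<access-key>
hoodie.aws.secret.key=<secret-key>
hoodie.aws.session.token=<session-token>
# If no credentials are configured, the provider falls back to the
# DefaultAWSCredentialsProviderChain (environment variables, system
# properties, instance profile, and so on).
```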

##########
File path: website/docs/configurations.md
##########
@@ -15,6 +15,20 @@ This page covers the different ways of configuring your job 
to write/read Hudi t
 - [**Metrics Configs**](#METRICS): These set of configs are used to enable 
monitoring and reporting of keyHudi stats and metrics.
 - [**Record Payload Config**](#RECORD_PAYLOAD): This is the lowest level of 
customization offered by Hudi. Record payloads define how to produce new values 
to upsert based on incoming new record and stored old record. Hudi provides 
default implementations such as OverwriteWithLatestAvroPayload which simply 
update table with the latest/last-written record. This can be overridden to a 
custom class extending HoodieRecordPayload class, on both datasource and 
WriteClient levels.
 
+---
+Except directly passing configurations to Hudi jobs, since 0.10.0, Hudi also 
supports passing configurations through an external configuration file 
`hudi-default.conf` in which each line consists of a key and a value separated 
by whitespace/equal sign. For example:
+```
+hoodie.datasource.hive_sync.mode               jdbc
+hoodie.datasource.hive_sync.jdbcurl            jdbc:hive2://localhost:10000
+hoodie.datasource.hive_sync.support_timestamp  false
+```
+This is a cluster level configuration, all the Hudi jobs running in this 
cluster would share the same configuration.
+The configuration is parsed and evaluated when the Hudi engine processes are 
started. Changes to the configuration file require restarting the relevant 
processes.
+

Review comment:
       This statement is not true. As discussed, it is not engines like Spark or Hive that load the config; instead, we are relying on Hudi code paths to ultimately load this.
   
       Maybe you should mention that this also kicks in via Spark SQL DML, and that it helps reduce the configs that one would otherwise have to keep passing.
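
       As a hedged illustration of that point: with the `hudi-default.conf` from the diff in place, a plain Spark SQL DML statement against a Hudi table (hypothetical table name below) would pick up the hive_sync settings without any per-session `SET` commands:

```
-- hudi_tbl is a hypothetical Hudi table; no SET hoodie.datasource.hive_sync.*
-- statements are needed because Hudi's code paths load hudi-default.conf.
INSERT INTO hudi_tbl SELECT 1, 'a1', 20;
```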

##########
File path: website/docs/configurations.md
##########
@@ -15,6 +15,20 @@ This page covers the different ways of configuring your job 
to write/read Hudi t
 - [**Metrics Configs**](#METRICS): These set of configs are used to enable 
monitoring and reporting of keyHudi stats and metrics.
 - [**Record Payload Config**](#RECORD_PAYLOAD): This is the lowest level of 
customization offered by Hudi. Record payloads define how to produce new values 
to upsert based on incoming new record and stored old record. Hudi provides 
default implementations such as OverwriteWithLatestAvroPayload which simply 
update table with the latest/last-written record. This can be overridden to a 
custom class extending HoodieRecordPayload class, on both datasource and 
WriteClient levels.
 
+---
+Except directly passing configurations to Hudi jobs, since 0.10.0, Hudi also 
supports passing configurations through an external configuration file 
`hudi-default.conf` in which each line consists of a key and a value separated 
by whitespace/equal sign. For example:

Review comment:
       `Except` => `Instead of`




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

