pengzhiwei2018 commented on a change in pull request #3140:
URL: https://github.com/apache/hudi/pull/3140#discussion_r659625369
##########
File path: docs/_docs/1_1_spark_quick_start_guide.md
##########
@@ -300,6 +300,221 @@ spark.
show(100, false)
```
+# Spark-Sql example
+## Setup
+Hudi supports using Spark SQL to write and read data through the **HoodieSparkSessionExtension** SQL extension.
+```shell
+# spark sql for spark 3
+spark-sql --packages org.apache.hudi:hudi-spark3-bundle_2.12:0.9.0,org.apache.spark:spark-avro_2.12:3.0.1 \
+--conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
+--conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
+
+# spark-sql for spark 2 with scala 2.11
+spark-sql --packages org.apache.hudi:hudi-spark-bundle_2.11:0.9.0,org.apache.spark:spark-avro_2.11:2.4.4 \
+--conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
+--conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
+
+# spark-sql for spark 2 with scala 2.12
+spark-sql \
+ --packages org.apache.hudi:hudi-spark-bundle_2.12:0.9.0,org.apache.spark:spark-avro_2.12:2.4.4 \
+ --conf 'spark.serializer=org.apache.spark.serializer.KryoSerializer' \
+ --conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'
+```
+
+## Sql syntax
+### DDL
+Hudi supports creating tables with spark-sql.
+**Create Non-Partitioned Table**
+```sql
+-- create a managed cow table
+create table if not exists h0(
+ id int,
+ name string,
+ price double
+) using hudi
+options (
+ type = 'cow',
+ primaryKey = 'id'
+);
+
+-- create an external mor table
+create table if not exists h1(
+ id int,
+ name string,
+ price double,
+ ts bigint
+) using hudi
+location '/tmp/hudi/h0'
+options (
+ type = 'mor',
+ primaryKey = 'id,name',
+ preCombineField = 'ts'
+)
+;
+
+-- create a non-primary key table
+create table if not exists h2(
+ id int,
+ name string,
+ price double
+) using hudi
+options (
+ type = 'cow'
+);
+```
+**Create Partitioned Table**
+```sql
+create table if not exists h_p0 (
+id bigint,
+name string,
+dt string,
+hh string
+) using hudi
+location '/tmp/hudi/h_p0'
+options (
+ type = 'cow',
+ primaryKey = 'id',
+ preCombineField = 'ts'
+ )
+partitioned by (dt, hh)
+;
+```
+**Create Table Options**
+
+| Parameter Name | Introduction |
+|------------|--------|
+| primaryKey | The primary key names of the table, multiple fields separated by commas. |
Review comment:
We already support date and timestamp partition data types through the SQL
internal key generator, `SqlKeyGenerator`, so users do not need to set a custom
key generator.
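
As a rough sketch of what this means for the docs (the table name `h_p1` and its columns are hypothetical, and I am assuming the same `options` syntax as the examples in this diff), a table partitioned by a `date` column could be declared without any key generator setting:

```sql
-- Hypothetical example: h_p1 and its columns are made up for illustration.
-- No key generator option is set; per the comment above, the internal
-- SqlKeyGenerator is expected to handle the date-typed partition column.
create table if not exists h_p1 (
  id bigint,
  name string,
  ts bigint,
  dt date
) using hudi
options (
  type = 'cow',
  primaryKey = 'id',
  preCombineField = 'ts'
)
partitioned by (dt);
```

If that holds, the options table would not need a row asking users to configure a custom key generator for date or timestamp partitions.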