[GitHub] [hudi] YannByron commented on a change in pull request #4269: [HUDI-2878] enhance hudi-quick-start guide for spark-sql

GitBox Thu, 09 Dec 2021 23:26:41 -0800


YannByron commented on a change in pull request #4269:
URL: https://github.com/apache/hudi/pull/4269#discussion_r766402385




##########
File path: website/docs/quick-start-guide.md
##########
@@ -175,18 +175,163 @@ values={[
 </TabItem>
 <TabItem value="sparksql">
 
+Spark-sql needs an explicit create table command.
+
+- Table types:
+  Both types of hudi tables (CopyOnWrite (COW) and MergeOnRead (MOR)) can be created using spark-sql.
+
+  While creating the table, table type can be specified using **type** option. **type = 'cow'** represents COW table, while **type = 'mor'** represents MOR table.
+
+- Partitioned & Non-Partitioned table:
+  Users can create a partitioned table or non-partitioned table in spark-sql.
+  To create a partitioned table, one needs to use **partitioned by** statement to specify the partition columns to create a partitioned table.
+  When there is no **partitioned by** statement with create table command, table is considered to be a non-partitioned table.
+
+- Managed & External table:
+  In general, spark-sql supports two kinds of tables, namely managed and external.
+  If one specifies a location using **location** statement or use `create external table` to create table explicitly, it is an external table, else its considered a managed table.
+  You can read more about external vs managed tables [here](https://sparkbyexamples.com/apache-hive/difference-between-hive-internal-tables-and-external-tables/).
+
+- Table with primary key:
+  Users can choose to create a table with primary key as required. Else table is considered a non-primary keyed table.

Review comment:
       I'll remove the `Table with primary key` item, which is redundant with the `notes` below, and move the notes here.




