YannByron commented on a change in pull request #4269: URL: https://github.com/apache/hudi/pull/4269#discussion_r766402385
##########
File path: website/docs/quick-start-guide.md
##########
@@ -175,18 +175,163 @@ values={[
 </TabItem>
 <TabItem value="sparksql">
+Spark-sql needs an explicit create table command.
+
+- Table types:
+  Both types of hudi tables (CopyOnWrite (COW) and MergeOnRead (MOR)) can be created using spark-sql.
+
+  While creating the table, table type can be specified using **type** option. **type = 'cow'** represents COW table, while **type = 'mor'** represents MOR table.
+
+- Partitioned & Non-Partitioned table:
+  Users can create a partitioned table or non-partitioned table in spark-sql.
+  To create a partitioned table, one needs to use **partitioned by** statement to specify the partition columns to create a partitioned table.
+  When there is no **partitioned by** statement with create table command, table is considered to be a non-partitioned table.
+
+- Managed & External table:
+  In general, spark-sql supports two kinds of tables, namely managed and external.
+  If one specifies a location using **location** statement or use `create external table` to create table explicitly, it is an external table, else its considered a managed table.
+  You can read more about external vs managed tables [here](https://sparkbyexamples.com/apache-hive/difference-between-hive-internal-tables-and-external-tables/).
+
+- Table with primary key:
+  Users can choose to create a table with primary key as required. Else table is considered a non-primary keyed table.

Review comment:
   I'll remove the `Table with primary key` item, which is redundant with the `notes` below, and move the notes here.
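
For reference, a minimal sketch of how the options described in the hunk above might combine in a single spark-sql statement. The table name, columns, and path are hypothetical placeholders, and the `tblproperties` keys `primaryKey` and `preCombineField` are assumed from Hudi's spark-sql conventions rather than taken from this diff.

```sql
-- Hypothetical partitioned, external COW table; names and path are placeholders.
create table hudi_cow_pt_tbl (
  id bigint,
  name string,
  ts bigint,
  dt string
) using hudi
tblproperties (
  type = 'cow',            -- 'mor' would request a MergeOnRead table instead
  primaryKey = 'id',       -- assumed property name for the primary key
  preCombineField = 'ts'   -- assumed property name for the ordering field
)
partitioned by (dt)        -- omit this clause for a non-partitioned table
location '/tmp/hudi/hudi_cow_pt_tbl';  -- omit this clause for a managed table
```

Dropping the `partitioned by` and `location` clauses would, per the text in the hunk, yield a non-partitioned managed table.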