This is an automated email from the ASF dual-hosted git repository.
yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 7fca40ff29 [DOCS] Update query pages for Redshift Spectrum support for Hudi (#6145)
7fca40ff29 is described below
commit 7fca40ff299c188e34d35e0eaec43af23d02eca3
Author: Bhavani Sudha Saktheeswaran <[email protected]>
AuthorDate: Wed Jul 20 14:10:55 2022 -0700
[DOCS] Update query pages for Redshift Spectrum support for Hudi (#6145)
---
website/docs/query_engine_setup.md | 11 +++++++++++
website/docs/querying_data.md | 24 +++++++++++++++---------
2 files changed, 26 insertions(+), 9 deletions(-)
diff --git a/website/docs/query_engine_setup.md b/website/docs/query_engine_setup.md
index 157e6daca6..4b8826c29b 100644
--- a/website/docs/query_engine_setup.md
+++ b/website/docs/query_engine_setup.md
@@ -71,3 +71,14 @@ In order for Hive to recognize Hudi tables and query correctly,
In addition to setup above, for beeline cli access, the `hive.input.format` variable needs to be set to the fully qualified path name of the
inputformat `org.apache.hudi.hadoop.HoodieParquetInputFormat`. For Tez, additionally the `hive.tez.input.format` needs to be set
to `org.apache.hadoop.hive.ql.io.HiveInputFormat`. Then proceed to query the table like any other Hive table.
+
+
+
+## Redshift Spectrum
+Copy on Write Tables in Apache Hudi versions 0.5.2, 0.6.0, 0.7.0, 0.8.0, 0.9.0, and 0.10.0 can be queried via Amazon Redshift Spectrum external tables.
+:::note
+Hudi tables are supported only when AWS Glue Data Catalog is used. It's not supported when you use an Apache Hive metastore as the external catalog.
+:::
+
+Please refer to [Redshift Spectrum Integration with Apache Hudi](https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-external-tables.html#c-spectrum-column-mapping-hudi)
+for more details.
diff --git a/website/docs/querying_data.md b/website/docs/querying_data.md
index 1b5cee0d5b..98d7e5d8d2 100644
--- a/website/docs/querying_data.md
+++ b/website/docs/querying_data.md
@@ -205,21 +205,27 @@ After Hudi made a new commit, refresh the Impala table to get the latest results
REFRESH database.table_name
```
+## Redshift Spectrum
+To set up Redshift Spectrum for querying Hudi, see the [Query Engine Setup](/docs/next/query_engine_setup#redshift-spectrum) page.
+
+
## Support Matrix
Following tables show whether a given query is supported on specific query engine.
### Copy-On-Write tables
-|Query Engine|Snapshot Queries|Incremental Queries|
-|------------|--------|-----------|
-|**Hive**|Y|Y|
-|**Spark SQL**|Y|Y|
-|**Spark Datasource**|Y|Y|
-|**Flink SQL**|Y|N|
-|**PrestoDB**|Y|N|
-|**Trino**|Y|N|
-|**Impala**|Y|N|
+| Query Engine |Snapshot Queries|Incremental Queries|
+|-----------------------|--------|-----------|
+| **Hive** |Y|Y|
+| **Spark SQL** |Y|Y|
+| **Spark Datasource** |Y|Y|
+| **Flink SQL** |Y|N|
+| **PrestoDB** |Y|N|
+| **Trino** |Y|N|
+| **Impala** |Y|N|
+| **Redshift Spectrum** |Y|N|
+
Note that `Read Optimized` queries are not applicable for COPY_ON_WRITE tables.
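
For the beeline access described in the Hive context lines of `query_engine_setup.md` above, the session settings can be sketched as follows. This sketch is not part of the commit; the property names and values are taken from that text, and the `SET` statement form is standard Hive session syntax.

```sql
-- Use Hudi's input format for the beeline session (value quoted from the doc text).
SET hive.input.format=org.apache.hudi.hadoop.HoodieParquetInputFormat;
-- On Tez only: additionally set the Tez input format (value quoted from the doc text).
SET hive.tez.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
```

After these are set, the table is queried like any other Hive table.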