Re: [PR] [#5557] improvement(CI): Add some docs and tests about how to use Azure Blob Storage(ADLS) in Hive [gravitino]

via GitHub Tue, 26 Nov 2024 22:30:39 -0800


jerryshao commented on code in PR #5558:
URL: https://github.com/apache/gravitino/pull/5558#discussion_r1860006793



##########
docs/hive-catalog-with-s3-and-adls.md:
##########
@@ -11,6 +11,8 @@ license: "This software is licensed under the Apache License version 2."
 
 Since Hive 2.x, Hive has supported S3 as a storage backend, enabling users to store and manage data in Amazon S3 directly through Hive. Gravitino enhances this capability by supporting the Hive catalog with S3, allowing users to efficiently manage the storage locations of files located in S3. This integration simplifies data operations and enables seamless access to S3 data from Hive queries.
 
+For ADLS (or Azure Blob storage(abs), or Azure Data lake storage(v2)), the integration is similar to S3. The only difference is the configuration properties for ADLS. The following sections will guide you through the necessary steps to configure the Hive catalog to utilize S3 as a storage backend, including configuration details and examples for creating databases and tables.

Review Comment:
   Suggested wording: "For ADLS (a.k.a. Azure Blob Storage (ABS) or Azure Data Lake Storage (v2)), ..."



##########
catalogs/catalog-hive/build.gradle.kts:
##########
@@ -129,6 +129,10 @@ dependencies {
   testImplementation(libs.testcontainers.mysql)
   testImplementation(libs.testcontainers.localstack)
   testImplementation(libs.hadoop2.aws)
+  testImplementation(libs.hadoop3.abs)
+
+  // You need this to run test CatalogHiveABSIT
+  // testImplementation(libs.hadoop3.common)

Review Comment:
   Is this comment still valid?



##########
docs/hive-catalog-with-s3-and-adls.md:
##########
@@ -99,6 +117,7 @@ SupportsSchemas supportsSchemas = catalog.asSchemas();
 
 Map<String, String> schemaProperties = ImmutableMap.<String, String>builder()
     .put("location", "s3a://bucket-name/path")
+    // .put("location", "abfss://container-n...@user-account-name.dfs.core.windows.net/path")

Review Comment:
   Please add a comment here noting that this commented-out location is for ADLS.



##########
docs/hive-catalog-with-s3-and-adls.md:
##########
@@ -11,6 +11,8 @@ license: "This software is licensed under the Apache License version 2."
 
 Since Hive 2.x, Hive has supported S3 as a storage backend, enabling users to store and manage data in Amazon S3 directly through Hive. Gravitino enhances this capability by supporting the Hive catalog with S3, allowing users to efficiently manage the storage locations of files located in S3. This integration simplifies data operations and enables seamless access to S3 data from Hive queries.
 
+For ADLS (or Azure Blob storage(abs), or Azure Data lake storage(v2)), the integration is similar to S3. The only difference is the configuration properties for ADLS. The following sections will guide you through the necessary steps to configure the Hive catalog to utilize S3 as a storage backend, including configuration details and examples for creating databases and tables.

Review Comment:
   > "S3 as a storage backend"
   
   Is it S3 or ADLS?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@gravitino.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
