jerryshao commented on code in PR #5806:
URL: https://github.com/apache/gravitino/pull/5806#discussion_r1897363672


##########
docs/cloud-storage-fileset-example.md:
##########
@@ -0,0 +1,676 @@
+---
+title: "How to use cloud storage fileset"
+slug: /how-to-use-cloud-storage-fileset
+keyword: fileset S3 GCS ADLS OSS
+license: "This software is licensed under the Apache License version 2."
+---
+
+This document provides a comprehensive guide on how to use cloud storage filesets created by Gravitino. It covers the following sections:
+
+
+## Start up Gravitino server
+
+Before running the Gravitino server, you need to put the following jars into the fileset classpath located in `${GRAVITINO_HOME}/catalogs/hadoop/libs`. For example, if you are using S3, you need to put `gravitino-aws-hadoop-bundle-{version}.jar` into the fileset classpath.
+
+
+| Storage type | Description                                                     | Jar file                                                                                                                 | Since Version    |
+|--------------|-----------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------|------------------|
+| Local file   | The local file system.                                          | (none)                                                                                                                   | 0.5.0            |
+| HDFS         | HDFS file system.                                               | (none)                                                                                                                   | 0.5.0            |
+| S3           | AWS S3 storage.                                                 | [gravitino-aws-hadoop-bundle](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-hadoop-aws-bundle)      | 0.8.0-incubating |
+| GCS          | Google Cloud Storage.                                           | [gravitino-gcp-hadoop-bundle](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-hadoop-gcp-bundle)      | 0.8.0-incubating |
+| OSS          | Aliyun OSS storage.                                             | [gravitino-aliyun-hadoop-bundle](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-hadoop-aliyun-bundle) | 0.8.0-incubating |
+| ABS          | Azure Blob Storage (aka. ABS, or Azure Data Lake Storage (v2)). | [gravitino-azure-hadoop-bundle](https://mvnrepository.com/artifact/org.apache.gravitino/gravitino-hadoop-azure-bundle)   | 0.8.0-incubating |
+
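To make the step above concrete, here is a minimal sketch of placing the S3 bundle jar into the fileset classpath; the jar version shown is illustrative and should match the Gravitino release you are using:

```shell
# Copy the AWS bundle jar into the Hadoop catalog classpath
# (the version below is illustrative).
cp gravitino-aws-hadoop-bundle-0.8.0-incubating.jar \
   ${GRAVITINO_HOME}/catalogs/hadoop/libs/
```
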
+After putting the jars into the fileset classpath, you can start up the Gravitino server by running the following command:
+
+```shell
+cd ${GRAVITINO_HOME}
+bin/gravitino.sh start
+```
+
+### Bundle jars
+
+The `gravitino-{aws,gcp,aliyun,azure}-hadoop-bundle` jars contain all the necessary classes to access the corresponding cloud storages. For instance, `gravitino-aws-hadoop-bundle.jar` contains all the necessary classes, including `hadoop-common` (hadoop-3.3.1) and `hadoop-aws`, to access S3 storage.
+**They are used in the scenario where there is no Hadoop environment in the runtime.**
+
+**If there is already a Hadoop environment in the runtime, you can use `gravitino-{aws,gcp,aliyun,azure}-bundle.jar`, which does not contain the cloud storage classes (like hadoop-aws) or the Hadoop environment; in that case, you need to manually add the necessary jars to the classpath.**
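As a rough sketch of this second scenario, assuming S3 and an existing Hadoop 3.3.1 installation (the paths and versions below are assumptions, not prescribed locations):

```shell
# Use the lightweight bundle plus the cloud storage jars shipped with Hadoop
# (paths and versions are illustrative).
cp gravitino-aws-bundle-0.8.0-incubating.jar ${GRAVITINO_HOME}/catalogs/hadoop/libs/
cp ${HADOOP_HOME}/share/hadoop/tools/lib/hadoop-aws-3.3.1.jar \
   ${HADOOP_HOME}/share/hadoop/tools/lib/aws-java-sdk-bundle-*.jar \
   ${GRAVITINO_HOME}/catalogs/hadoop/libs/
```
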
+
+The following table demonstrates what jars are necessary for different cloud storage filesets:

Review Comment:
   "which jars"


