jerryshao commented on code in PR #3931:
URL: https://github.com/apache/gravitino/pull/3931#discussion_r1670087646
##########
docs/how-to-use-gvfs.md:
##########
@@ -307,3 +315,210 @@ conf.set("fs.gravitino.client.kerberos.keytabFilePath",
"${your_kerberos_keytab}
Path filesetPath = new
Path("gvfs://fileset/test_catalog/test_schema/test_fileset_1");
FileSystem fs = filesetPath.getFileSystem(conf);
```
+
+## 2. Managing files of Fileset with Python GVFS
+
+### Prerequisites
+
++ A Hadoop environment with HDFS running. Currently, Fileset is supported only on HDFS.
+  GVFS in Python has been tested against Hadoop 2.7.3. Hadoop 2.7.3 or later is recommended,
+  and it should work with Hadoop 3.x. Please create an [issue](https://www.github.com/datastrato/gravitino/issues)
+  if you find any compatibility issues.
++ Python 3.8 or later. GVFS has been tested with Python 3.8 and Python 3.9.
+
+Attention: If you are using macOS or Windows, you need to follow the steps in the
+[Hadoop official building documentation](https://github.com/apache/hadoop/blob/trunk/BUILDING.txt) (use the version that matches your Hadoop version)
+to recompile the native libraries such as `libhdfs`, and completely replace the files in `${HADOOP_HOME}/lib/native`.
+
+### Configuration
+
+| Configuration item   | Description                                                                                                               | Default value | Required | Since version |
+|----------------------|---------------------------------------------------------------------------------------------------------------------------|---------------|----------|---------------|
+| `server_uri`         | The Gravitino server uri, e.g. `http://localhost:8090`.                                                                   | (none)        | Yes      | 0.6.0         |
+| `metalake_name`      | The metalake name which the fileset belongs to.                                                                           | (none)        | Yes      | 0.6.0         |
+| `cache_size`         | The cache capacity of the Gravitino Virtual File System.                                                                  | `20`          | No       | 0.6.0         |
+| `cache_expired_time` | The value of time that the cache expires after accessing in the Gravitino Virtual File System. The value is in `seconds`. | `3600`        | No       | 0.6.0         |
+
+
+You can configure these properties when obtaining the `Gravitino Virtual FileSystem` in Python like this:
+
+```python
+from gravitino import gvfs
+
+fs = gvfs.GravitinoVirtualFileSystem(server_uri="http://localhost:8090", metalake_name="test_metalake")
+```
+
+### Usage examples
+
+1. Install the Gravitino library.
+   You can get it with [pip](https://pip.pypa.io/en/stable/installation/):
+
+```shell
+pip install gravitino
+```
+
+2. Configure the Hadoop environment.
+   Ensure that the Python client has Kerberos authentication information, and
+   set up the Hadoop environment in the system environment:
+```shell
Review Comment:
Add a blank line above and indent this code block.
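
Separately, the quoted example only passes the two required items from the configuration table. A minimal sketch of how the optional cache settings might be supplied as well is below. It assumes the constructor accepts `cache_size` and `cache_expired_time` as keyword arguments under the same names as the table, which the quoted docs imply but do not show. `build_gvfs_options` and `open_gvfs` are hypothetical helper names used only for illustration.

```python
# Defaults copied from the configuration table in the quoted docs.
DEFAULT_CACHE_SIZE = 20
DEFAULT_CACHE_EXPIRED_TIME = 3600  # seconds


def build_gvfs_options(server_uri, metalake_name,
                       cache_size=DEFAULT_CACHE_SIZE,
                       cache_expired_time=DEFAULT_CACHE_EXPIRED_TIME):
    """Hypothetical helper: collect the four documented items into a kwargs dict."""
    # The table marks server_uri and metalake_name as required.
    if not server_uri or not metalake_name:
        raise ValueError("server_uri and metalake_name are required")
    return {
        "server_uri": server_uri,
        "metalake_name": metalake_name,
        "cache_size": cache_size,
        "cache_expired_time": cache_expired_time,
    }


def open_gvfs(options):
    """Hypothetical helper: create the file system from the collected options."""
    # Imported lazily so build_gvfs_options works without the package installed.
    from gravitino import gvfs
    # Assumption: the constructor takes the table's item names as keyword
    # arguments, as it does for server_uri and metalake_name in the docs.
    return gvfs.GravitinoVirtualFileSystem(**options)


# Override only the cache capacity; the expiry keeps its documented default.
options = build_gvfs_options("http://localhost:8090", "test_metalake", cache_size=50)
print(options)
```

Calling `open_gvfs(options)` would then need the `gravitino` package installed and a reachable Gravitino server, so it is left out of the sketch.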