jerqi commented on code in PR #9716: URL: https://github.com/apache/gravitino/pull/9716#discussion_r2694001283
##########
docs/iceberg-rest-service.md:
##########
@@ -649,6 +649,58 @@ table = catalog.load_table(table_identifier)
 print(table.scan().to_arrow())
 ```
 
+### Exploring Apache Iceberg with Ray
+
+[Ray](https://www.ray.io/) is a unified framework for scaling AI and Python applications. Ray Data provides native support for reading and writing Iceberg tables through the REST catalog.
+
+:::note
+Ray Data only supports reading from and writing to existing Iceberg tables. It does not support DDL operations such as creating, dropping, or altering tables, schemas, or catalogs. You need to use other tools like Spark or PyIceberg to manage table metadata.
+:::
+
+#### Reading Iceberg tables with Ray
+
+```python
+import ray
+
+ds = ray.data.read_iceberg(
+    table_identifier="default.sample",
+    catalog_kwargs={
+        "name": "default",

Review Comment:
   This is related to PyIceberg. `~/.pyiceberg.yaml` holds the default configuration for each catalog, and each catalog has a name. Passing the name here tells PyIceberg which of those configurations to use.
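
To illustrate the point in the review comment above, here is a minimal sketch of name-based configuration lookup. It is not PyIceberg's actual implementation: the `PYICEBERG_YAML` dict stands in for a parsed `~/.pyiceberg.yaml`, the `resolve_catalog_config` helper is hypothetical, and the catalog names and URIs are made up. It assumes that explicitly passed properties take precedence over the file defaults.

```python
# Hypothetical stand-in for the parsed contents of ~/.pyiceberg.yaml.
# Catalog names and URIs are illustrative only.
PYICEBERG_YAML = {
    "catalog": {
        "default": {"type": "rest", "uri": "http://localhost:9001/iceberg/"},
        "staging": {"type": "rest", "uri": "http://staging.example.com/iceberg/"},
    }
}


def resolve_catalog_config(name, **overrides):
    """Illustrative helper (not a PyIceberg API).

    Looks up the defaults stored under the given catalog name, then
    applies explicitly passed properties on top of them.
    """
    file_defaults = PYICEBERG_YAML.get("catalog", {}).get(name, {})
    return {**file_defaults, **overrides}


# The name picks the "default" block; the explicit uri replaces the file value.
config = resolve_catalog_config("default", uri="http://127.0.0.1:9001/iceberg/")
print(config)
```

Under this assumption, the `"name"` entry in `catalog_kwargs` acts as the lookup key, and any other keys passed alongside it only need to cover what the file does not already provide.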
