This is an automated email from the ASF dual-hosted git repository.

xushiyan pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 072d3e5b713 [DOCS][site] add hudi-rs page and ecosystem links (#12367)
072d3e5b713 is described below

commit 072d3e5b7134b0ed53f9d4ad229cd70f26f6875a
Author: Shiyan Xu <[email protected]>
AuthorDate: Thu Nov 28 10:51:58 2024 -1000

    [DOCS][site] add hudi-rs page and ecosystem links (#12367)
---
 website/docs/overview.mdx                          |   8 ++
 website/docs/python-rust-quick-start-guide.md      | 119 +++++++++++++++++++++
 website/sidebars.js                                |   1 +
 website/src/pages/ecosystem.md                     |   2 +
 website/versioned_docs/version-0.15.0/overview.mdx |  14 ++-
 .../python-rust-quick-start-guide.md               | 119 +++++++++++++++++++++
 .../version-0.15.0-sidebars.json                   |   1 +
 7 files changed, 261 insertions(+), 3 deletions(-)

diff --git a/website/docs/overview.mdx b/website/docs/overview.mdx
index 013ecc6dc4b..009de571fb1 100644
--- a/website/docs/overview.mdx
+++ b/website/docs/overview.mdx
@@ -28,7 +28,11 @@ Apache Hudi can easily be used on any [cloud storage platform](/docs/cloud).
 Hudi’s advanced performance optimizations, make analytical workloads faster with any of
 the popular query engines including, Apache Spark, Flink, Presto, Trino, Hive, etc.
 
+[Hudi-rs](https://github.com/apache/hudi-rs) is the native Rust implementation of Apache Hudi, which also provides Python bindings. It
+expands the use of Apache Hudi to a diverse range of use cases in non-JVM ecosystems.
+
 ## Core Concepts to Learn
+
 If you are relatively new to Apache Hudi, it is important to be familiar with a few core concepts:
 - [Hudi Timeline](/docs/next/timeline) – How Hudi manages transactions and other table services
 - [Hudi File Layout](/docs/next/file_layouts) - How the files are laid out on storage
@@ -40,11 +44,15 @@ See more in the "Concepts" section of the docs.
 Take a look at recent [blog posts](/blog) that go in depth on certain topics or use cases.
 
 ## Getting Started
+
 Sometimes the fastest way to learn is by doing. Try out these Quick Start resources to get up and running in minutes:
+
 - [Spark Quick Start Guide](/docs/quick-start-guide) – if you primarily use Apache Spark
 - [Flink Quick Start Guide](/docs/flink-quick-start-guide) – if you primarily use Apache Flink
+- [Python/Rust Quick Start Guide (Hudi-rs)](/docs/python-rust-quick-start-guide) - if you primarily use Python or Rust
 
 If you want to experience Apache Hudi integrated into an end to end demo with Kafka, Spark, Hive, Presto, etc, try out the Docker Demo:
+
 - [Docker Demo](/docs/docker_demo)
 
 ## Connect With The Community
diff --git a/website/docs/python-rust-quick-start-guide.md b/website/docs/python-rust-quick-start-guide.md
new file mode 100644
index 00000000000..73f22a1c673
--- /dev/null
+++ b/website/docs/python-rust-quick-start-guide.md
@@ -0,0 +1,119 @@
+---
+title: "Python/Rust Quick Start (Hudi-rs)"
+toc: true
+last_modified_at: 2024-11-28T12:53:57+08:00
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+This guide will help you get started with [hudi-rs](https://github.com/apache/hudi-rs), a native Rust library for Apache Hudi with Python bindings. Learn how to install, set up, and perform basic operations using both Python and Rust interfaces.
+
+## Installation
+
+```bash
+# Python
+pip install hudi
+
+# Rust
+cargo add hudi
+```
+
+## Basic Usage
+
+:::note
+Currently, write capabilities and reading from MOR tables are not supported.
+
+The examples below assume a Hudi table exists at `/tmp/trips_table`, created using the [quick start guide](/docs/quick-start-guide).
+:::
+
+### Python Example
+
+```python
+from hudi import HudiTableBuilder
+import pyarrow as pa
+
+hudi_table = (
+    HudiTableBuilder
+    .from_base_uri("/tmp/trips_table")
+    .build()
+)
+
+# Read with partition filters
+records = hudi_table.read_snapshot(filters=[("city", "=", "san_francisco")])
+
+# Convert to PyArrow table
+arrow_table = pa.Table.from_batches(records)
+result = arrow_table.select(["rider", "city", "ts", "fare"])
+```
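The `(column, op, value)` filter tuples passed to `read_snapshot` above can be understood with a small plain-Python sketch. The helper below is illustrative only, not part of the hudi API; it mimics how each tuple acts as a conjunctive predicate over rows.

```python
import operator

# Map the comparison symbols used in filter tuples to Python operators.
OPS = {"=": operator.eq, "!=": operator.ne, ">": operator.gt,
       ">=": operator.ge, "<": operator.lt, "<=": operator.le}

def apply_filters(rows, filters):
    """Keep only rows (dicts) that satisfy every (column, op, value) predicate."""
    return [row for row in rows
            if all(OPS[op](row[col], val) for col, op, val in filters)]

rows = [
    {"rider": "rider-A", "city": "san_francisco", "fare": 19.1},
    {"rider": "rider-B", "city": "sao_paulo", "fare": 27.7},
]
matched = apply_filters(rows, [("city", "=", "san_francisco")])
```

With hudi-rs itself, the filtering happens at read time so non-matching partitions are pruned rather than scanned.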
+
+### Rust Example (with DataFusion)
+
+1. Set up your project:
+
+```bash
+cargo new my_project --bin && cd my_project
+cargo add tokio@1 datafusion@42
+cargo add hudi --features datafusion
+```
+
+2. Add code to `src/main.rs`:
+
+```rust
+use std::sync::Arc;
+use datafusion::error::Result;
+use datafusion::prelude::{DataFrame, SessionContext};
+use hudi::HudiDataSource;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+    let ctx = SessionContext::new();
+    let hudi = HudiDataSource::new_with_options("/tmp/trips_table", []).await?;
+    ctx.register_table("trips_table", Arc::new(hudi))?;
+    // Read with partition filters
+    let df: DataFrame = ctx.sql("SELECT * FROM trips_table WHERE city = 'san_francisco'").await?;
+    df.show().await?;
+    Ok(())
+}
+```
+
+## Cloud Storage Integration
+
+### Python
+
+```python
+from hudi import HudiTableBuilder
+
+hudi_table = (
+    HudiTableBuilder
+    .from_base_uri("s3://bucket/trips_table")
+    .with_option("aws_region", "us-west-2")
+    .build()
+)
+```
+
+### Rust
+
+```rust
+use hudi::HudiDataSource;
+
+let hudi = HudiDataSource::new_with_options(
+    "s3://bucket/trips_table",
+    [("aws_region", "us-west-2")]
+).await?;
+```
+
+### Supported Cloud Storage
+
+- AWS S3 (`s3://`)
+- Azure Storage (`az://`)
+- Google Cloud Storage (`gs://`)
+
+Set appropriate environment variables (`AWS_*`, `AZURE_*`, or `GOOGLE_*`) for authentication, or pass credentials as options through the `with_option()` / `new_with_options()` APIs shown above.
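As an illustration of the environment-variable route, a setup might look like the sketch below. The variable names are the standard AWS SDK ones; the values are placeholders, not real credentials.

```python
import os

# Supply object-store credentials via environment variables before building the
# table. Standard AWS SDK variable names; the values here are placeholders.
os.environ["AWS_ACCESS_KEY_ID"] = "my-access-key-id"
os.environ["AWS_SECRET_ACCESS_KEY"] = "my-secret-access-key"
os.environ["AWS_REGION"] = "us-west-2"

# With the variables set, the builder from the earlier example can pick them up:
# hudi_table = HudiTableBuilder.from_base_uri("s3://bucket/trips_table").build()
```

In production, prefer injecting these from a secrets manager or instance role rather than hardcoding them.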
+
+## Read with Timestamp
+
+Add the timestamp option to run time-travel queries:
+
+```python
+.with_option("hoodie.read.as.of.timestamp", "20241122010827898")
+```
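The timestamp value above is a Hudi instant time, a `yyyyMMddHHmmssSSS` string with millisecond precision. Assuming that layout, it can be decoded with the stdlib `datetime` module (note that `%f` accepts one to six digits and zero-pads on the right, so the trailing `898` parses as 898 ms):

```python
from datetime import datetime

# Hudi instant times use the yyyyMMddHHmmssSSS layout (millisecond precision).
instant = "20241122010827898"
ts = datetime.strptime(instant, "%Y%m%d%H%M%S%f")  # "898" -> 898000 microseconds
print(ts.isoformat())  # 2024-11-22T01:08:27.898000
```

This makes it easy to go the other way as well: format a `datetime` with `"%Y%m%d%H%M%S"` plus the millisecond part to build an as-of timestamp for a query.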
diff --git a/website/sidebars.js b/website/sidebars.js
index 56d3bbe05fd..75486d5e669 100644
--- a/website/sidebars.js
+++ b/website/sidebars.js
@@ -15,6 +15,7 @@ module.exports = {
                 'overview',
                 'quick-start-guide',
                 'flink-quick-start-guide',
+                'python-rust-quick-start-guide',
                 'docker_demo',
                 'use_cases',
             ],
diff --git a/website/src/pages/ecosystem.md b/website/src/pages/ecosystem.md
index dcda0de53ab..52857120b26 100644
--- a/website/src/pages/ecosystem.md
+++ b/website/src/pages/ecosystem.md
@@ -37,3 +37,5 @@ In such cases, you can leverage another tool like Apache Spark or Apache Flink t
 | Apache Doris      | [Read](https://doris.apache.org/docs/ecosystem/external-table/hudi-external-table/) |             |
 | Starrocks         | [Read](https://docs.starrocks.io/docs/data_source/catalog/hudi_catalog/) | [Demo with HMS + Min.IO](https://github.com/StarRocks/demo/tree/master/documentation-samples/hudi) |
 | Dremio            |             |             |
+| Daft              | [Read](https://www.getdaft.io/projects/docs/en/stable/user_guide/integrations/hudi.html) |             |
+| Ray Data          | [Read](https://docs.ray.io/en/master/data/api/input_output.html#hudi) |             |
diff --git a/website/versioned_docs/version-0.15.0/overview.mdx b/website/versioned_docs/version-0.15.0/overview.mdx
index 0abd4219987..009de571fb1 100644
--- a/website/versioned_docs/version-0.15.0/overview.mdx
+++ b/website/versioned_docs/version-0.15.0/overview.mdx
@@ -13,7 +13,7 @@ how to learn more to get started.
 
 ## What is Apache Hudi
 Apache Hudi (pronounced “hoodie”) is the next generation [streaming data lake platform](/blog/2021/07/21/streaming-data-lake-platform).
-Apache Hudi brings core warehouse and database functionality directly to a data lake. Hudi provides [tables](/docs/next/sql_ddl),
+Hudi brings core warehouse and database functionality directly to a data lake. Hudi provides [tables](/docs/next/sql_ddl),
 [transactions](/docs/next/timeline), [efficient upserts/deletes](/docs/next/write_operations), [advanced indexes](/docs/next/indexing),
 [ingestion services](/docs/hoodie_streaming_ingestion), data [clustering](/docs/next/clustering)/[compaction](/docs/next/compaction) optimizations,
 and [concurrency](/docs/next/concurrency_control) all while keeping your data in open source file formats.
@@ -28,7 +28,11 @@ Apache Hudi can easily be used on any [cloud storage platform](/docs/cloud).
 Hudi’s advanced performance optimizations, make analytical workloads faster with any of
 the popular query engines including, Apache Spark, Flink, Presto, Trino, Hive, etc.
 
+[Hudi-rs](https://github.com/apache/hudi-rs) is the native Rust implementation of Apache Hudi, which also provides Python bindings. It
+expands the use of Apache Hudi to a diverse range of use cases in non-JVM ecosystems.
+
 ## Core Concepts to Learn
+
 If you are relatively new to Apache Hudi, it is important to be familiar with a few core concepts:
 - [Hudi Timeline](/docs/next/timeline) – How Hudi manages transactions and other table services
 - [Hudi File Layout](/docs/next/file_layouts) - How the files are laid out on storage
@@ -40,11 +44,15 @@ See more in the "Concepts" section of the docs.
 Take a look at recent [blog posts](/blog) that go in depth on certain topics or use cases.
 
 ## Getting Started
+
 Sometimes the fastest way to learn is by doing. Try out these Quick Start resources to get up and running in minutes:
+
 - [Spark Quick Start Guide](/docs/quick-start-guide) – if you primarily use Apache Spark
 - [Flink Quick Start Guide](/docs/flink-quick-start-guide) – if you primarily use Apache Flink
+- [Python/Rust Quick Start Guide (Hudi-rs)](/docs/python-rust-quick-start-guide) - if you primarily use Python or Rust
 
 If you want to experience Apache Hudi integrated into an end to end demo with Kafka, Spark, Hive, Presto, etc, try out the Docker Demo:
+
 - [Docker Demo](/docs/docker_demo)
 
 ## Connect With The Community
@@ -53,7 +61,7 @@ resources to learn more, engage, and get help as you get started.
 
 ### Join in on discussions
 See all the ways to [engage with the community here](/community/get-involved). Two most popular methods include:
-- <SlackCommunity title="Hudi Slack Channel" />
+- <SlackCommunity title="Hudi Slack Channel"/>
 - [Hudi mailing list](mailto:[email protected]) - (send any msg to subscribe)
 
 ### Come to Office Hours for help
@@ -67,5 +75,5 @@ Apache Hudi welcomes you to join in on the fun and make a lasting impact on the
 [contributor guide](/contribute/how-to-contribute) to learn more, and don’t hesitate to directly reach out to any of the
 current committers to learn more.
 
-Have an idea, an ask, or feedback about a pain-point, but don’t have time to contribute? Join the <SlackCommunity title="Hudi Slack Channel" />
+Have an idea, an ask, or feedback about a pain-point, but don’t have time to contribute? Join the <SlackCommunity title="Hudi Slack Channel"/>
 and share!
diff --git a/website/versioned_docs/version-0.15.0/python-rust-quick-start-guide.md b/website/versioned_docs/version-0.15.0/python-rust-quick-start-guide.md
new file mode 100644
index 00000000000..73f22a1c673
--- /dev/null
+++ b/website/versioned_docs/version-0.15.0/python-rust-quick-start-guide.md
@@ -0,0 +1,119 @@
+---
+title: "Python/Rust Quick Start (Hudi-rs)"
+toc: true
+last_modified_at: 2024-11-28T12:53:57+08:00
+---
+import Tabs from '@theme/Tabs';
+import TabItem from '@theme/TabItem';
+
+This guide will help you get started with [hudi-rs](https://github.com/apache/hudi-rs), a native Rust library for Apache Hudi with Python bindings. Learn how to install, set up, and perform basic operations using both Python and Rust interfaces.
+
+## Installation
+
+```bash
+# Python
+pip install hudi
+
+# Rust
+cargo add hudi
+```
+
+## Basic Usage
+
+:::note
+Currently, write capabilities and reading from MOR tables are not supported.
+
+The examples below assume a Hudi table exists at `/tmp/trips_table`, created using the [quick start guide](/docs/quick-start-guide).
+:::
+
+### Python Example
+
+```python
+from hudi import HudiTableBuilder
+import pyarrow as pa
+
+hudi_table = (
+    HudiTableBuilder
+    .from_base_uri("/tmp/trips_table")
+    .build()
+)
+
+# Read with partition filters
+records = hudi_table.read_snapshot(filters=[("city", "=", "san_francisco")])
+
+# Convert to PyArrow table
+arrow_table = pa.Table.from_batches(records)
+result = arrow_table.select(["rider", "city", "ts", "fare"])
+```
+
+### Rust Example (with DataFusion)
+
+1. Set up your project:
+
+```bash
+cargo new my_project --bin && cd my_project
+cargo add tokio@1 datafusion@42
+cargo add hudi --features datafusion
+```
+
+2. Add code to `src/main.rs`:
+
+```rust
+use std::sync::Arc;
+use datafusion::error::Result;
+use datafusion::prelude::{DataFrame, SessionContext};
+use hudi::HudiDataSource;
+
+#[tokio::main]
+async fn main() -> Result<()> {
+    let ctx = SessionContext::new();
+    let hudi = HudiDataSource::new_with_options("/tmp/trips_table", []).await?;
+    ctx.register_table("trips_table", Arc::new(hudi))?;
+    // Read with partition filters
+    let df: DataFrame = ctx.sql("SELECT * FROM trips_table WHERE city = 'san_francisco'").await?;
+    df.show().await?;
+    Ok(())
+}
+```
+
+## Cloud Storage Integration
+
+### Python
+
+```python
+from hudi import HudiTableBuilder
+
+hudi_table = (
+    HudiTableBuilder
+    .from_base_uri("s3://bucket/trips_table")
+    .with_option("aws_region", "us-west-2")
+    .build()
+)
+```
+
+### Rust
+
+```rust
+use hudi::HudiDataSource;
+
+let hudi = HudiDataSource::new_with_options(
+    "s3://bucket/trips_table",
+    [("aws_region", "us-west-2")]
+).await?;
+```
+
+### Supported Cloud Storage
+
+- AWS S3 (`s3://`)
+- Azure Storage (`az://`)
+- Google Cloud Storage (`gs://`)
+
+Set appropriate environment variables (`AWS_*`, `AZURE_*`, or `GOOGLE_*`) for authentication, or pass credentials as options through the `with_option()` / `new_with_options()` APIs shown above.
+
+## Read with Timestamp
+
+Add the timestamp option to run time-travel queries:
+
+```python
+.with_option("hoodie.read.as.of.timestamp", "20241122010827898")
+```
diff --git a/website/versioned_sidebars/version-0.15.0-sidebars.json b/website/versioned_sidebars/version-0.15.0-sidebars.json
index d69c2f62e40..b61a09c74ef 100644
--- a/website/versioned_sidebars/version-0.15.0-sidebars.json
+++ b/website/versioned_sidebars/version-0.15.0-sidebars.json
@@ -8,6 +8,7 @@
         "overview",
         "quick-start-guide",
         "flink-quick-start-guide",
+        "python-rust-quick-start-guide",
         "docker_demo",
         "use_cases"
       ]
