(sedona-db) branch main updated: [DOCS] Fit and finish fixes (#110)

jiayu Fri, 19 Sep 2025 00:13:52 -0700

This is an automated email from the ASF dual-hosted git repository.

jiayu pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/sedona-db.git



The following commit(s) were added to refs/heads/main by this push:
     new ab9fcbb  [DOCS] Fit and finish fixes (#110)
ab9fcbb is described below

commit ab9fcbba50370bf829e7ef47af50e1a19745bbba
Author: Kelly-Ann Dolor <[email protected]>
AuthorDate: Fri Sep 19 00:13:30 2025 -0700

    [DOCS] Fit and finish fixes (#110)
    
    Co-authored-by: Jia Yu <[email protected]>
---
 README.md                                      |   6 +-
 docs/{development.md => contributors-guide.md} | 138 ++++++++++++++++----
 docs/index.md                                  |  47 ++++---
 docs/programming-guide.ipynb                   |  44 +++----
 docs/programming-guide.md                      |  26 ++--
 docs/quickstart-python.ipynb                   |   2 +-
 docs/reference/read-parquet-files.md           |  71 -----------
 docs/stylesheets/extra.css                     |  11 ++
 docs/working-with-parquet-files.ipynb          | 166 +++++++++++++++++++++++++
 docs/working-with-parquet-files.md             | 116 +++++++++++++++++
 mkdocs.yml                                     |  11 +-
 11 files changed, 463 insertions(+), 175 deletions(-)
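
The per-file counts in the summary above can be cross-checked against the totals line. The following is a minimal sketch, assuming the numbers are copied by hand from the diffstat; the names `per_file_changes` and `totals_consistent` are illustrative and not part of the commit.

```python
# Per-file change counts (insertions + deletions), copied from the diffstat above.
per_file_changes = {
    "README.md": 6,
    "docs/{development.md => contributors-guide.md}": 138,
    "docs/index.md": 47,
    "docs/programming-guide.ipynb": 44,
    "docs/programming-guide.md": 26,
    "docs/quickstart-python.ipynb": 2,
    "docs/reference/read-parquet-files.md": 71,
    "docs/stylesheets/extra.css": 11,
    "docs/working-with-parquet-files.ipynb": 166,
    "docs/working-with-parquet-files.md": 116,
    "mkdocs.yml": 11,
}

# Totals copied from the summary line.
reported_files = 11
reported_insertions = 463
reported_deletions = 175

files_changed = len(per_file_changes)
total_changes = sum(per_file_changes.values())

# The summary is consistent when both the file count and the change count agree.
totals_consistent = (
    files_changed == reported_files
    and total_changes == reported_insertions + reported_deletions
)
print(files_changed, total_changes, totals_consistent)
```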

diff --git a/README.md b/README.md
index dfb7e78..dde5ab0 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,11 @@ SedonaDB only runs on a single machine, so it’s perfect for processing smaller
 
 ## Install
 
-You can install Python SedonaDB with `pip install apache-sedona[db]`.
+You can install Python SedonaDB with PyPI:
+
+```sh
+pip install "apache-sedona[db]"
+```
 
 ## Overture buildings example
 
diff --git a/docs/development.md b/docs/contributors-guide.md
similarity index 51%
rename from docs/development.md
rename to docs/contributors-guide.md
index 58f7178..2183c65 100644
--- a/docs/development.md
+++ b/docs/contributors-guide.md
@@ -17,14 +17,66 @@
   under the License.
 -->
 
-# Development
+# Contributors Guide
+
+This guide details how to set up your development environment as a SedonaDB Contributor.
+
+## Fork and clone the repository
+
+Your first step is to create a personal copy of the repository and connect it to the main project.
+
+1. Fork the repository
+
+      * Navigate to the official [Apache SedonaDB GitHub repository](https://github.com/apache/sedona-db).
+      * Click the **Fork** button in the top-right corner. This creates a complete copy of the project in your own GitHub account.
+
+1. Clone your fork
+
+      * Next, clone your newly created fork to your local machine. This command downloads the repository into a new folder named `sedona-db`.
+      * Replace `YourUsername` with your actual GitHub username.
+
+        ```shell
+        git clone https://github.com/YourUsername/sedona-db.git
+        cd sedona-db
+        ```
+
+1. Configure the remotes
+
+      * Your local repository needs to know where the original project is so you can pull in updates. You'll add a remote link, traditionally named **`upstream`**, to the main Apache SedonaDB repository.
+      * Your fork is automatically configured as the **`origin`** remote.
+
+        ```shell
+        # Add the main repository as the "upstream" remote
+        git remote add upstream https://github.com/apache/sedona-db.git
+        ```
+
+1. Verify the configuration
+
+      * Run the following command to verify that you have two remotes configured correctly: `origin` (your fork) and `upstream` (the main repository).
+
+        ```shell
+        git remote -v
+        ```
+
+      * The output should look like this:
+
+        ```shell
+        origin    https://github.com/YourUsername/sedona-db.git (fetch)
+        origin    https://github.com/YourUsername/sedona-db.git (push)
+        upstream  https://github.com/apache/sedona-db.git (fetch)
+        upstream  https://github.com/apache/sedona-db.git (push)
+        ```
 
 ## Rust
 
-SedonaDB is written and Rust and is a standard `cargo` workspace. You can
-install a recent version of the Rust compiler and cargo from
-[rustup.rs](https://rustup.rs/) and run tests using `cargo test`. A local
-development version of the CLI can be run with `cargo run --bin sedona-cli`.
+SedonaDB is written in Rust and is a standard `cargo` workspace.
+
+You can install a recent version of the Rust compiler and cargo from
+[rustup.rs](https://rustup.rs/) and run tests using `cargo test`.
+
+A local development version of the CLI can be run with `cargo run --bin sedona-cli`.
+
+### Test data setup
 
 Some tests require submodules that contain test data or pinned versions of
 external dependencies. These submodules can be initialized with:
@@ -40,16 +92,26 @@ Additionally, some of the data required in the tests can be downloaded by runnin
 python submodules/download-assets.py
 ```
 
+### System dependencies
+
 Some crates wrap external native libraries and require system dependencies
-to build. At this time the only crate that requires this is the sedona-s2geography
-crate, which requires [CMake](https://cmake.org),
-[Abseil](https://github.com/abseil/abseil-cpp) and OpenSSL. These can be installed
-on MacOS with [Homebrew](https://brew.sh):
+to build.
+
+!!!note "`sedona-s2geography`"
+    At this time, the only crate that requires this is the `sedona-s2geography`
+    crate, which requires [CMake](https://cmake.org),
+    [Abseil](https://github.com/abseil/abseil-cpp) and OpenSSL.
+
+#### macOS
+
+These can be installed on macOS with [Homebrew](https://brew.sh):
 
 ```shell
 brew install abseil openssl cmake geos
 ```
 
+#### Linux and Windows
+
 On Linux and Windows, it is recommended to use [vcpkg](https://github.com/microsoft/vcpkg)
 to provide external dependencies. This can be done by setting the `CMAKE_TOOLCHAIN_FILE`
 environment variable:
@@ -58,7 +120,9 @@ environment variable:
 export CMAKE_TOOLCHAIN_FILE=/path/to/vcpkg/scripts/buildsystems/vcpkg.cmake
 ```
 
-When using VSCode, it may be necessary to set this environment variable in settings.json
+#### Visual Studio Code (VSCode) Configuration
+
+When using VSCode, it may be necessary to set this environment variable in `settings.json`
 such that it can be found by rust-analyzer when running build/run tasks:
 
 ```json
@@ -75,8 +139,9 @@ such that it can be found by rust-analyzer when running build/run tasks:
 ## Python
 
 Python bindings to SedonaDB are built with the [Maturin](https://www.maturin.rs) build
-backend. Installing a development version of the main Python bindings the first time
-can be done with:
+backend.
+
+To install a development version of the main Python bindings for the first time, run the following commands:
 
 ```shell
 cd python/sedonadb
@@ -92,12 +157,16 @@ maturin develop
 
 ## Debugging
 
+### Rust
+
 Debugging Rust code is most easily done by writing or finding a test that triggers
 the desired behavior and running it using the *Debug* selection in
 [VSCode](https://code.visualstudio.com/) with the
 [rust-analyzer](https://marketplace.visualstudio.com/items?itemName=rust-lang.rust-analyzer)
-extension. Rust code can also debugged using the CLI by finding the `main()` function in
-sedona-cli and choosing the *Debug* run option.
+extension. Rust code can also be debugged using the CLI by finding the `main()` function in
+`sedona-cli` and choosing the *Debug* run option.
+
+### Python, C, and C++
 
 Installation of Python bindings with `maturin develop` ensures a debug-friendly build for
 debugging Rust, Python, or C/C++ code. Python code can be debugged using breakpoints in
@@ -114,7 +183,9 @@ In general, there is at least one benchmark for every implementation of a functi
 and a few other benchmarks for low-level iteration where work was done to optimize
 specific cases.
 
-Briefly, benchmarks for a specific crate can be run with `cargo bench`:
+### Running benchmarks
+
+Benchmarks for a specific crate can be run with `cargo bench`:
 
 ```shell
 cd rust/sedona-geo
@@ -129,17 +200,22 @@ to read for a specific crate).
 cargo bench -- st_area
 ```
 
+### Managing results
+
 By default, criterion saves the last run and will report the difference between the
 current benchmark and the last time it was run (although there are options to
-save and load various baselines). A report containing the last run for any
-benchmark that was ever run can be opened with:
+save and load various baselines).
 
-```shell
-# MacOS
-open target/criterion/report/index.html
-# Ubuntu
-xdg-open target/criterion/report/index.html
-```
+A report of the latest results for all benchmarks can be opened with the following command:
+
+=== "macOS"
+    ```shell
+    open target/criterion/report/index.html
+    ```
+=== "Ubuntu"
+    ```shell
+    xdg-open target/criterion/report/index.html
+    ```
 
 All previous saved benchmark runs can be cleared with:
 
@@ -149,6 +225,16 @@ rm -rf target/criterion
 
 ## Documentation
 
-* `mkdocs serve` - Start the live-reloading docs server.
-* `mkdocs build` - Build the documentation site.
-* `mkdocs -h` - Print help message and exit.
+To contribute to the SedonaDB documentation:
+
+1. Clone the repository and create a fork.
+1. Install the Documentation dependencies:
+    ```sh
+    pip install -r docs/requirements.txt
+    ```
+1. Make your changes to the documentation files.
+1. Preview your changes locally using these commands:
+    * `mkdocs serve` - Start the live-reloading docs server.
+    * `mkdocs build` - Build the documentation site.
+    * `mkdocs -h` - Print help message and exit.
+1. Push your changes and open a pull request.
diff --git a/docs/index.md b/docs/index.md
index 45b2119..62a8bfc 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,6 +1,5 @@
 ---
 hide:
-  - navigation
 
 title: Introducing SedonaDB
 ---
@@ -24,30 +23,45 @@ title: Introducing SedonaDB
   under the License.
 -->
 
-SedonaDB is a high-performance, dependency-free geospatial compute engine designed for single-node processing, making it ideal for smaller datasets on local machines or cloud instances.
+SedonaDB is a single-node analytical database engine with geospatial as the first-class citizen.
+
+Fast and dependency-free, SedonaDB is ideal for working with smaller datasets located on local machines or cloud instances.
 
 The initial `0.1` release supports a core set of vector operations, with comprehensive vector and raster computation capabilities planned for the near future.
 
+For distributed workloads, you can still leverage the power of SedonaSpark, SedonaFlink, or SedonaSnow.
+
 ## Key features
 
 SedonaDB has several advantages:
 
 * **Exceptional Performance:** Built in Rust to process massive geospatial datasets with exceptional speed.
 * **Unified Geospatial Toolkit:** Access a comprehensive suite of functions for both vector and raster data in a single, powerful library.
-* **Seamless Ecosystem Integration:** Built on Apache Arrow for smooth interoperability with popular data science libraries like GeoPandas, DuckDB, and Polars.
+* **Extensive Ecosystem Integration:** Built on Apache Arrow for smooth interoperability with popular data science libraries like GeoPandas, DuckDB, and Polars.
 * **Flexible APIs:** Effortlessly switch between Python and SQL interfaces to match your preferred workflow and skill set.
 * **Guaranteed CRS Propagation:** Automatically manages coordinate reference systems (CRS) to ensure spatial accuracy and prevent common errors.
 * **Broad File Format Support:** Work with a wide range of both modern and legacy geospatial file formats like geoparquet.
 * **Highly Extensible:** Easily customize and extend the library's functionality to meet your project's unique requirements.
 
-## Run a query in SQL, Python, or Rust
+## Install SedonaDB
+
+Here's how to install SedonaDB with various build tools:
+
+=== "pip"
+
+       ```bash
+       pip install "apache-sedona[db]"
+       ```
+
+=== "R"
 
-SedonaDB offers a flexible query interface in SQL, Python, or Rust.
+       ```bash
+       install.packages("sedonadb", repos = "https://community.r-multiverse.org")
+       ```
 
-Engineered for speed, SedonaDB provides performant geospatial processing on a single machine. This makes it perfect for the rapid analysis of smaller datasets, whether you're working locally or on a cloud server. While the initial release focuses on core vector operations, a full suite of vector and raster computations is on the roadmap.
+## Run a query in SQL, Python, Rust, or R
 
-For massive, distributed workloads, you can leverage the power of SedonaSpark,
-SedonaFlink, or SedonaSnow.
+SedonaDB offers a flexible query interface.
 
 === "SQL"
 
@@ -58,7 +72,7 @@ SedonaFlink, or SedonaSnow.
 === "Python"
 
        ```python
-       import seonda.db
+       import sedona.db
 
        sd = sedona.db.connect()
        sd.sql("SELECT ST_Point(0, 1) as geom")
@@ -86,21 +100,6 @@ SedonaFlink, or SedonaSnow.
         sd_sql("SELECT ST_Point(0, 1) as geom")
        ```
 
-## Install SedonaDB
-
-Here's how to install SedonaDB with various build tools:
-
-=== "pip"
-
-       ```bash
-       pip install "apache-sedona[db]"
-       ```
-
-=== "R"
-
-       ```bash
-       install.packages("sedonadb", repos = "https://community.r-multiverse.org")
-       ```
 
 ## Have questions?
 
diff --git a/docs/programming-guide.ipynb b/docs/programming-guide.ipynb
index 0c3867d..13e36d1 100644
--- a/docs/programming-guide.ipynb
+++ b/docs/programming-guide.ipynb
@@ -24,14 +24,18 @@
     "  under the License.\n",
     "-->\n",
     "\n",
-    "# SedonaDB Guide\n",
+    "# Working with Vector Data\n",
     "\n",
-    "This page explains how to process vector data with SedonaDB.\n",
-    "\n",
-    "You will learn how to create SedonaDB DataFrames, run spatial queries, and perform I/O operations with various types of files.\n",
-    "\n",
-    "Let's start by establishing a SedonaDB connection.\n",
+    "Process vector data using SedonaDB. You will learn to create DataFrames, run spatial queries, and manage file I/O. Let's begin by connecting to SedonaDB.\n",
     "\n",
+    "Let's start by establishing a SedonaDB connection."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "119fcbae",
+   "metadata": {},
+   "source": [
     "## Establish SedonaDB connection\n",
     "\n",
     "Here's how to create the SedonaDB connection:"
@@ -137,7 +141,7 @@
    "source": [
     "Now, let's run some spatial queries.\n",
     "\n",
-    "**Read from GeoPandas DataFrame**\n",
+    "### Read from GeoPandas DataFrame\n",
     "\n",
     "This section shows how to convert a GeoPandas DataFrame into a SedonaDB DataFrame.\n",
     "\n",
@@ -146,7 +150,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": null,
    "id": "b81549f2-0f58-49e4-9011-8de6578c2b0e",
    "metadata": {},
    "outputs": [],
@@ -202,7 +206,7 @@
     "\n",
     "Let's see how to run spatial operations like filtering, joins, and clustering algorithms.\n",
     "\n",
-    "**Spatial filtering**\n",
+    "### Spatial filtering\n",
     "\n",
     "Let's run a spatial filtering operation to fetch all the objects in the following polygon:"
    ]
@@ -249,11 +253,11 @@
    "id": "32076e01-d807-40ed-8457-9d8c4244e89f",
    "metadata": {},
    "source": [
-    "You can see it only includes the divisions in the Nova Scotia area.  Skip to the visualization section to see how this data can be graphed on a map.\n",
+    "You can see it only includes the divisions in the Nova Scotia area.\n",
     "\n",
-    "**K-nearest neighbors (KNN) joins**\n",
+    "### K-nearest neighbors (KNN) joins\n",
     "\n",
-    "Create `restaurants` and `customers` tables so we can demonstrate the KNN join functionality."
+    "Create `restaurants` and `customers` views so we can demonstrate the KNN join functionality."
    ]
   },
   {
@@ -370,22 +374,6 @@
    "source": [
     "Notice how each customer has two rows - one for each of the two closest restaurants."
    ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "3cb1e53b",
-   "metadata": {},
-   "source": [
-    "## GeoParquet support\n",
-    "\n",
-    "You can also read GeoParquet files with SedonaDB with `read_parquet()`\n",
-    "\n",
-    "```python\n",
-    "df = sd.read_parquet(\"DATA_FILE.parquet\")\n",
-    "```\n",
-    "\n",
-    "Once you read the file, you can easily expose it as a view and query it with spatial SQL, as we demonstrated in the example above.\n"
-   ]
   }
  ],
  "metadata": {
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 493603a..7da3c5f 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -17,11 +17,9 @@
   under the License.
 -->
 
-# SedonaDB Guide
+# Process Vector Data with SedonaDB
 
-This page explains how to process vector data with SedonaDB.
-
-You will learn how to create SedonaDB DataFrames, run spatial queries, and perform I/O operations with various types of files.
+Process vector data using SedonaDB. You will learn to create DataFrames, run spatial queries, and manage file I/O. Let's begin by connecting to SedonaDB.
 
 Let's start by establishing a SedonaDB connection.
 
@@ -82,7 +80,7 @@ sd.read_parquet(
 
 Now, let's run some spatial queries.
 
-**Read from GeoPandas DataFrame**
+### Read from GeoPandas DataFrame
 
 This section shows how to convert a GeoPandas DataFrame into a SedonaDB DataFrame.
 
@@ -120,7 +118,7 @@ df.show(3)
 
 Let's see how to run spatial operations like filtering, joins, and clustering algorithms.
 
-**Spatial filtering**
+### Spatial filtering
 
 Let's run a spatial filtering operation to fetch all the objects in the following polygon:
 
@@ -151,11 +149,11 @@ ns.show(3)
     └──────────┴──────────┴────────────────────────────────────────────────────────────────────────────┘
 
 
-You can see it only includes the divisions in the Nova Scotia area.  Skip to the visualization section to see how this data can be graphed on a map.
+You can see it only includes the divisions in the Nova Scotia area.
 
-**K-nearest neighbors (KNN) joins**
+### K-nearest neighbors (KNN) joins
 
-Create `restaurants` and `customers` tables so we can demonstrate the KNN join functionality.
+Create `restaurants` and `customers` views so we can demonstrate the KNN join functionality.
 
 
 ```python
@@ -234,13 +232,3 @@ ORDER BY c.name, r.name;
 
 
 Notice how each customer has two rows - one for each of the two closest restaurants.
-
-## GeoParquet support
-
-You can also read GeoParquet files with SedonaDB with `read_parquet()`
-
-```python
-df = sd.read_parquet("DATA_FILE.parquet")
-```
-
-Once you read the file, you can easily expose it as a view and query it with spatial SQL, as we demonstrated in the example above.
diff --git a/docs/quickstart-python.ipynb b/docs/quickstart-python.ipynb
index 56dcc17..3558e22 100644
--- a/docs/quickstart-python.ipynb
+++ b/docs/quickstart-python.ipynb
@@ -250,7 +250,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
+   "execution_count": null,
    "id": "6dd816c7-fd3f-4358-b628-ef5e6940c95c",
    "metadata": {},
    "outputs": [],
diff --git a/docs/reference/read-parquet-files.md b/docs/reference/read-parquet-files.md
deleted file mode 100644
index 6dc4836..0000000
--- a/docs/reference/read-parquet-files.md
+++ /dev/null
@@ -1,71 +0,0 @@
-
-<!---
-  Licensed to the Apache Software Foundation (ASF) under one
-  or more contributor license agreements.  See the NOTICE file
-  distributed with this work for additional information
-  regarding copyright ownership.  The ASF licenses this file
-  to you under the Apache License, Version 2.0 (the
-  "License"); you may not use this file except in compliance
-  with the License.  You may obtain a copy of the License at
-
-    http://www.apache.org/licenses/LICENSE-2.0
-
-  Unless required by applicable law or agreed to in writing,
-  software distributed under the License is distributed on an
-  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-  KIND, either express or implied.  See the License for the
-  specific language governing permissions and limitations
-  under the License.
--->
-
-# Reading Parquet Files
-
-To read a Parquet file, you must use the dedicated `sd.read_parquet()` method. You cannot query a file path directly within the `sd.sql()` `FROM` clause.
-
-The `sd.sql()` function is designed to query tables that have already been registered in the session. When you pass a path like `'s3://...'` to `FROM`, the SQL engine searches for a registered table with that literal name and fails when it's not found, producing a `table not found` error.
-
-## Usage
-
-The correct process is a two-step approach:
-
-1. **Load** the Parquet file into a data frame using `sd.read_parquet()`.
-1. **Register** the data frame view with `to_view()`.
-1. **Query** the view using `sd.sql()`.
-
-```python linenums="1" title="Read a parquet file with SedonaDB"
-
-import sedona.db
-sd = sedona.db.connect()
-
-df = sd.read_parquet(
-    's3://wherobots-benchmark-prod/SpatialBench_sf=1_format=parquet/'
-    'building/building.parquet'
-)
-
-# Load the Parquet file, which creates a Pandas data frame
-df = sd.read_parquet('s3://wherobots-benchmark-prod/SpatialBench_sf=1_format=parquet/building/building.parquet')
-
-# Convert the Pandas data frame to a Spark data frame AND
-#    register it as a temporary view in a single line.
-spark.createDataFrame(df).to_view("zone")
-
-# Now, query the view using SQL
-sd.sql("SELECT * FROM zone LIMIT 10").show()
-```
-
-### Common Errors
-
-Directly using a file path within `sd.sql()` is a common mistake that will result in an error.
-
-**Incorrect Code:**
-
-```python
-# This will fail because the SQL engine looks for a table named 's3://...'
-sd.sql("SELECT * FROM 's3://wherobots-benchmark-prod/SpatialBench_sf=1_format=parquet/building/building.parquet'")
-```
-
-**Resulting Error:**
-
-```bash
-sedonadb._lib.SedonaError: Error during planning: table '...s3://...' not found
-```
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index b18d0b4..b4651e0 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -75,3 +75,14 @@
   padding: 0 0.9rem;
   font-size: 0.65rem; /* NEW: Adjust font size */
 }
+
+/* ==========================================================================
+   Mobile Navigation Styles
+   ========================================================================== */
+
+/* This targets the main container of the slide-out navigation on mobile */
+.md-nav--primary .md-nav__title,
+.md-nav__source {
+  background-color: var(--color-red); /* Use your red color */
+  box-shadow: none; /* Optional: removes the shadow */
+}
diff --git a/docs/working-with-parquet-files.ipynb b/docs/working-with-parquet-files.ipynb
new file mode 100644
index 0000000..40aedaf
--- /dev/null
+++ b/docs/working-with-parquet-files.ipynb
@@ -0,0 +1,166 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Working with Parquet Files\n",
+    "\n",
+    "The easiest way to read a GeoParquet or Parquet file is to use `sd.read_parquet()`. Alternatively, you can query these files directly by their path in SQL."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Install SedonaDB\n",
+    "\n",
+    "Use pip to install SedonaDB from the Python Package Index (PyPI)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "> **Note**: Before running this notebook on your local machine, you must have SedonaDB installed in your environment. You can install SedonaDB with the following command: `pip install \"apache-sedona[db]\"`"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Implementation\n",
+    "\n",
+    "A common workflow for working with GeoParquet and/or Parquet files is:\n",
+    "\n",
+    "1. **Load** the Parquet file into a data frame using `sd.read_parquet()`.\n",
+    "2. **Register** the data frame as a view with `to_view()`.\n",
+    "3. **Query** the view using `sd.sql()`.\n",
+    "4. **Write** your results to a Parquet file with `.to_parquet()` or use `.to_pandas()` to export your results to a DataFrame or GeoDataFrame."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Import the sedona.db module and connect to SedonaDB\n",
+    "import sedona.db\n",
+    "\n",
+    "sd = sedona.db.connect()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "┌──────────────┬───────────────────────────────┐\n",
+      "│     name     ┆            geometry           │\n",
+      "│     utf8     ┆            geometry           │\n",
+      "╞══════════════╪═══════════════════════════════╡\n",
+      "│ Vatican City ┆ POINT(12.4533865 41.9032822)  │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ San Marino   ┆ POINT(12.4417702 43.9360958)  │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Vaduz        ┆ POINT(9.5166695 47.1337238)   │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Lobamba      ┆ POINT(31.1999971 -26.4666675) │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Luxembourg   ┆ POINT(6.1300028 49.6116604)   │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Palikir      ┆ POINT(158.1499743 6.9166437)  │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Majuro       ┆ POINT(171.3800002 7.1030043)  │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Funafuti     ┆ POINT(179.2166471 -8.516652)  │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Melekeok     ┆ POINT(134.6265485 7.4873962)  │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Bir Lehlou   ┆ POINT(-9.6525222 26.1191667)  │\n",
+      "└──────────────┴───────────────────────────────┘\n"
+     ]
+    }
+   ],
+   "source": [
+    "# 1. Load the Parquet file\n",
+    "df = sd.read_parquet(\n",
+    "    \"https://raw.githubusercontent.com/geoarrow/geoarrow-data/v0.2.0/\"\n",
+    "    \"natural-earth/files/natural-earth_cities_geo.parquet\"\n",
+    ")\n",
+    "\n",
+    "# 2. Register the data frame as a view\n",
+    "df.to_view(\"zone\")\n",
+    "\n",
+    "# 3. Query the view and store the result in a new DataFrame\n",
+    "query_result_df = sd.sql(\"SELECT * FROM zone LIMIT 10\")\n",
+    "query_result_df.show()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "Verifying the written file at 'query_results.parquet'...\n",
+      "┌──────────────┬───────────────────────────────┐\n",
+      "│     name     ┆            geometry           │\n",
+      "│     utf8     ┆            geometry           │\n",
+      "╞══════════════╪═══════════════════════════════╡\n",
+      "│ Vatican City ┆ POINT(12.4533865 41.9032822)  │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ San Marino   ┆ POINT(12.4417702 43.9360958)  │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Vaduz        ┆ POINT(9.5166695 47.1337238)   │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Lobamba      ┆ POINT(31.1999971 -26.4666675) │\n",
+      "├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+      "│ Luxembourg   ┆ POINT(6.1300028 49.6116604)   │\n",
+      "└──────────────┴───────────────────────────────┘\n"
+     ]
+    }
+   ],
+   "source": [
+    "# 4. Write the result to a new Parquet file\n",
+    "output_path = \"query_results.parquet\"\n",
+    "query_result_df.to_parquet(output_path)\n",
+    "\n",
+    "# (Optional) Verify the written file\n",
+    "print(f\"\\nVerifying the written file at '{output_path}'...\")\n",
+    "verified_df = sd.read_parquet(output_path)\n",
+    "verified_df.show(5)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv (3.13.3)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.13.3"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/docs/working-with-parquet-files.md b/docs/working-with-parquet-files.md
new file mode 100644
index 0000000..ea28931
--- /dev/null
+++ b/docs/working-with-parquet-files.md
@@ -0,0 +1,116 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Working with Parquet Files
+
+The easiest way to read a GeoParquet or Parquet file is to use `sd.read_parquet()`. Alternatively, you can query these files directly by their path in SQL.
+
+## Install SedonaDB
+
+Use pip to install SedonaDB from the Python Package Index (PyPI).
+
+> **Note**: Before running this notebook on your local machine, you must have SedonaDB installed in your environment. You can install SedonaDB with the following command: `pip install "apache-sedona[db]"`
+
+## Implementation
+
+A common workflow for working with GeoParquet and/or Parquet files is:
+
+1. **Load** the Parquet file into a data frame using `sd.read_parquet()`.
+2. **Register** the data frame as a view with `to_view()`.
+3. **Query** the view using `sd.sql()`.
+4. **Write** your results to a Parquet file with `.to_parquet()` or use `.to_pandas()` to export your results to a DataFrame or GeoDataFrame.
+
+
+```python
+# Import the sedona.db module and connect to SedonaDB
+import sedona.db
+
+sd = sedona.db.connect()
+```
+
+
+```python
+# 1. Load the Parquet file
+df = sd.read_parquet(
+    "https://raw.githubusercontent.com/geoarrow/geoarrow-data/v0.2.0/"
+    "natural-earth/files/natural-earth_cities_geo.parquet"
+)
+
+# 2. Register the data frame as a view
+df.to_view("zone")
+
+# 3. Query the view and store the result in a new DataFrame
+query_result_df = sd.sql("SELECT * FROM zone LIMIT 10")
+query_result_df.show()
+```
+
+    ┌──────────────┬───────────────────────────────┐
+    │     name     ┆            geometry           │
+    │     utf8     ┆            geometry           │
+    ╞══════════════╪═══════════════════════════════╡
+    │ Vatican City ┆ POINT(12.4533865 41.9032822)  │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ San Marino   ┆ POINT(12.4417702 43.9360958)  │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Vaduz        ┆ POINT(9.5166695 47.1337238)   │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Lobamba      ┆ POINT(31.1999971 -26.4666675) │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Luxembourg   ┆ POINT(6.1300028 49.6116604)   │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Palikir      ┆ POINT(158.1499743 6.9166437)  │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Majuro       ┆ POINT(171.3800002 7.1030043)  │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Funafuti     ┆ POINT(179.2166471 -8.516652)  │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Melekeok     ┆ POINT(134.6265485 7.4873962)  │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Bir Lehlou   ┆ POINT(-9.6525222 26.1191667)  │
+    └──────────────┴───────────────────────────────┘
+
+
+
+```python
+# 4. Write the result to a new Parquet file
+output_path = "query_results.parquet"
+query_result_df.to_parquet(output_path)
+
+# (Optional) Verify the written file
+print(f"\nVerifying the written file at '{output_path}'...")
+verified_df = sd.read_parquet(output_path)
+verified_df.show(5)
+```
+
+
+    Verifying the written file at 'query_results.parquet'...
+    ┌──────────────┬───────────────────────────────┐
+    │     name     ┆            geometry           │
+    │     utf8     ┆            geometry           │
+    ╞══════════════╪═══════════════════════════════╡
+    │ Vatican City ┆ POINT(12.4533865 41.9032822)  │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ San Marino   ┆ POINT(12.4417702 43.9360958)  │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Vaduz        ┆ POINT(9.5166695 47.1337238)   │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Lobamba      ┆ POINT(31.1999971 -26.4666675) │
+    ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+    │ Luxembourg   ┆ POINT(6.1300028 49.6116604)   │
+    └──────────────┴───────────────────────────────┘
diff --git a/mkdocs.yml b/mkdocs.yml
index 621f1bc..233ce78 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -20,19 +20,20 @@ site_description: "Documentation for Apache SedonaDB"
 site_url: https://sedona.apache.org/sedonadb/
 nav:
    - SedonaDB: index.md
+   - Python Quickstart: quickstart-python.md
    - SedonaDB Guides:
-     - Python Quickstart: quickstart-python.md
-     - SedonaDB Guide: programming-guide.md
+     - Working with Vector Data: programming-guide.md
      - Working with GeoPandas: geopandas-interop.md
      - Working with Overture: overture-examples.md
-     - Development: development.md
+     - Working with Parquet Files: working-with-parquet-files.md
+     - Contributors Guide: contributors-guide.md
+
    - SedonaDB Reference:
       - Python:
           - Python Functions: reference/python.md
       - SQL:
           - SQL Functions: reference/sql.md
           - Spatial Joins: reference/sql-joins.md
-          - Read Parquet Files: reference/read-parquet-files.md
    - Blog: "https://sedona.apache.org/latest/blog/"
    - Community: "https://sedona.apache.org/latest/community/contact/"
    - Apache Software Foundation: "https://sedona.apache.org/latest/asf/asf/"
@@ -50,7 +51,7 @@ theme:
     primary: custom
     accent: 'green'
   favicon: image/sedona_logo_symbol.png
-  logo: image/sedona_logo_symbol_white.svg
+  logo: image/sedona_logo_symbol.png
   icon:
     logo: fontawesome/solid/earth-americas
     repo: fontawesome/brands/github
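
The "Verify the configuration" step added to the contributors guide in this commit can also be scripted. The following is a minimal sketch, assuming the `git remote -v` output format shown in that step; `parse_remotes` and `has_expected_remotes` are hypothetical helper names and are not part of SedonaDB or git.

```python
def parse_remotes(remote_output: str) -> dict:
    """Parse `git remote -v` text into {name: {"fetch": url, "push": url}}."""
    remotes = {}
    for line in remote_output.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, url, kind = parts
        remotes.setdefault(name, {})[kind.strip("()")] = url
    return remotes


def has_expected_remotes(
    remotes: dict,
    upstream_url: str = "https://github.com/apache/sedona-db.git",
) -> bool:
    """Return True when `origin` exists and `upstream` fetches from the main repository."""
    return (
        "origin" in remotes
        and remotes.get("upstream", {}).get("fetch") == upstream_url
    )


# Sample output taken from the contributors guide (placeholder username kept as-is).
sample_output = """\
origin    https://github.com/YourUsername/sedona-db.git (fetch)
origin    https://github.com/YourUsername/sedona-db.git (push)
upstream  https://github.com/apache/sedona-db.git (fetch)
upstream  https://github.com/apache/sedona-db.git (push)
"""

remotes = parse_remotes(sample_output)
print(has_expected_remotes(remotes))
```

To check a real checkout instead of the sample text, the output of `subprocess.run(["git", "remote", "-v"], capture_output=True, text=True).stdout` could be passed to `parse_remotes`.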
