This is an automated email from the ASF dual-hosted git repository.
jiayu pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/sedona-db.git
The following commit(s) were added to refs/heads/main by this push:
new 6bf6189 Pre and post bug bash fixes (#66)
6bf6189 is described below
commit 6bf61897ab21f3eb3cf041f072c71572b3db8b56
Author: Kelly-Ann Dolor <[email protected]>
AuthorDate: Tue Sep 16 22:01:18 2025 -0700
Pre and post bug bash fixes (#66)
Co-authored-by: Dewey Dunnington <[email protected]>
---
docs/geopandas-interop.ipynb | 6 +-
docs/index.md | 49 ++++----
docs/programming-guide.ipynb | 48 ++++----
docs/quickstart-cli.md | 94 ---------------
docs/quickstart-python.ipynb | 120 ++++++++----------
docs/quickstart-python.md | 228 +++++++++++++++++++++++++++++++++++
docs/reference/read-parquet-files.md | 28 +++--
docs/requirements.txt | 2 +
docs/stylesheets/extra.css | 27 +++--
mkdocs.yml | 11 +-
10 files changed, 376 insertions(+), 237 deletions(-)
diff --git a/docs/geopandas-interop.ipynb b/docs/geopandas-interop.ipynb
index ce11116..cef23d9 100644
--- a/docs/geopandas-interop.ipynb
+++ b/docs/geopandas-interop.ipynb
@@ -14,15 +14,15 @@
},
{
"cell_type": "code",
- "execution_count": 2,
+ "execution_count": null,
"id": "0434bead-2628-4844-a3f6-2f9c15a21899",
"metadata": {},
"outputs": [],
"source": [
- "import sedonadb\n",
+ "import sedona.db\n",
"import geopandas as gpd\n",
"\n",
- "sd = sedonadb.connect()"
+ "sd = sedona.db.connect()"
]
},
{
diff --git a/docs/index.md b/docs/index.md
index 3a398a7..45b2119 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,6 +1,8 @@
---
hide:
- navigation
+
+title: Introducing SedonaDB
---
<!---
@@ -22,13 +24,30 @@ hide:
under the License.
-->
-# SedonaDB
+SedonaDB is a high-performance, dependency-free geospatial compute engine designed for single-node processing, making it ideal for smaller datasets on local machines or cloud instances.
+
+The initial `0.1` release supports a core set of vector operations, with comprehensive vector and raster computation capabilities planned for the near future.
+
+## Key features
+
+SedonaDB has several advantages:
+
+* **Exceptional Performance:** Built in Rust to process massive geospatial datasets with exceptional speed.
+* **Unified Geospatial Toolkit:** Access a comprehensive suite of functions for both vector and raster data in a single, powerful library.
+* **Seamless Ecosystem Integration:** Built on Apache Arrow for smooth interoperability with popular data science libraries like GeoPandas, DuckDB, and Polars.
+* **Flexible APIs:** Effortlessly switch between Python and SQL interfaces to match your preferred workflow and skill set.
+* **Guaranteed CRS Propagation:** Automatically manages coordinate reference systems (CRS) to ensure spatial accuracy and prevent common errors.
+* **Broad File Format Support:** Work with a wide range of both modern and legacy geospatial file formats like geoparquet.
+* **Highly Extensible:** Easily customize and extend the library's functionality to meet your project's unique requirements.
+
+## Run a query in SQL, Python, or Rust
-SedonaDB is a high-performance, dependency-free geospatial compute engine.
+SedonaDB offers a flexible query interface in SQL, Python, or Rust.
-You can easily run SedonaDB locally or in the cloud. The first release supports a core set of vector operations, but the full-suite of common vector and raster computations will be supported soon.
+Engineered for speed, SedonaDB provides performant geospatial processing on a single machine. This makes it perfect for the rapid analysis of smaller datasets, whether you're working locally or on a cloud server. While the initial release focuses on core vector operations, a full suite of vector and raster computations is on the roadmap.
-SedonaDB only runs on a single machine, so it’s perfect for processing smaller datasets. You can use SedonaSpark, SedonaFlink, or SedonaSnow for operations on larger datasets.
+For massive, distributed workloads, you can leverage the power of SedonaSpark,
+SedonaFlink, or SedonaSnow.
=== "SQL"
@@ -67,21 +86,9 @@ SedonaDB only runs on a single machine, so it’s perfect for processing smaller
sd_sql("SELECT ST_Point(0, 1) as geom")
```
-## Key features
-
-SedonaDB has several advantages:
-
-* **Blazing-Fast Performance:** Built in Rust to process massive geospatial datasets with exceptional speed.
-* **Unified Geospatial Toolkit:** Access a comprehensive suite of functions for both vector and raster data in a single, powerful library.
-* **Seamless Ecosystem Integration:** Built on Apache Arrow for smooth interoperability with popular data science libraries like GeoPandas, DuckDB, and Polars.
-* **Flexible APIs:** Effortlessly switch between Python and SQL interfaces to match your preferred workflow and skillset.
-* **Guaranteed CRS Propagation:** Automatically manages coordinate reference systems (CRS) to ensure spatial accuracy and prevent common errors.
-* **Broad File Format Support:** Work with a wide range of both modern and legacy geospatial file formats like geoparquet.
-* **Highly Extensible:** Easily customize and extend the library's functionality to meet your project's unique requirements.
+## Install SedonaDB
-## Installation
-
-Here’s how to install SedonaDB with various build tools:
+Here's how to install SedonaDB with various build tools:
=== "pip"
@@ -95,12 +102,8 @@ Here’s how to install SedonaDB with various build tools:
install.packages("sedonadb", repos = "https://community.r-multiverse.org")
```
-## SedonaDB example with vector data
-
-TODO
-
## Have questions?
-Feel free to start a GitHub Discussion or join the Discord community to ask the developers any questions you may have.
+Start a [GitHub Discussion](https://github.com/apache/sedona-db/issues) or join the [Discord community](https://discord.com/invite/9A3k5dEBsY) and ask the developers any questions you may have.
We look forward to collaborating with you!
diff --git a/docs/programming-guide.ipynb b/docs/programming-guide.ipynb
index 392fdbd..93c7208 100644
--- a/docs/programming-guide.ipynb
+++ b/docs/programming-guide.ipynb
@@ -11,11 +11,11 @@
"\n",
"You will learn how to create SedonaDB DataFrames, run spatial queries,
and perform I/O operations with various types of files.\n",
"\n",
- "Let’s start by establishing a SedonaDB connection.\n",
+ "Let's start by establishing a SedonaDB connection.\n",
"\n",
"## Establish SedonaDB connection\n",
"\n",
- "Here’s how to create the SedonaDB connection:"
+ "Here's how to create the SedonaDB connection:"
]
},
{
@@ -25,9 +25,9 @@
"metadata": {},
"outputs": [],
"source": [
- "import sedonadb\n",
+ "import sedona.db\n",
"\n",
- "sd = sedonadb.connect()"
+ "sd = sedona.db.connect()"
]
},
{
@@ -35,13 +35,13 @@
"id": "7aeaa60f-2325-418c-8e72-4344bd4a75fe",
"metadata": {},
"source": [
- "Now let’s see how to create SedonaDB DataFrames.\n",
+ "Now, let's see how to create SedonaDB dataframes.\n",
"\n",
"## Create SedonaDB DataFrame\n",
"\n",
"**Manually creating SedonaDB DataFrame**\n",
"\n",
- "Here’s how to manually create a SedonaDB DataFrame:"
+ "Here's how to manually create a SedonaDB DataFrame:"
]
},
{
@@ -95,7 +95,7 @@
"source": [
"**Create SedonaDB DataFrame from files in S3**\n",
"\n",
- "For most production applications, you will create SedonaDB DataFrames by
reading data from a file. Let’s see how to read GeoParquet files in AWS S3
into a SedonaDB DataFrame."
+ "For most production applications, you will create SedonaDB DataFrames by
reading data from a file. Let's see how to read GeoParquet files in AWS S3
into a SedonaDB DataFrame."
]
},
{
@@ -116,7 +116,7 @@
"id": "858fcc66-816d-4c71-8875-82b74169eccd",
"metadata": {},
"source": [
- "Let’s now run some spatial queries.\n",
+ "Now, let's run some spatial queries.\n",
"\n",
"**Read from GeoPandas DataFrame**\n",
"\n",
@@ -181,11 +181,11 @@
"source": [
"## Spatial queries\n",
"\n",
- "Let’s see how to run spatial operations like filtering, joins, and
clustering algorithms.\n",
+ "Let's see how to run spatial operations like filtering, joins, and
clustering algorithms.\n",
"\n",
- "***Spatial filtering***\n",
+ "**Spatial filtering**\n",
"\n",
- "Let’s run a spatial filtering operation to fetch all the objects in the
following polygon:"
+ "Let's run a spatial filtering operation to fetch all the objects in the
following polygon:"
]
},
{
@@ -232,21 +232,21 @@
"source": [
"You can see it only includes the divisions in the Nova Scotia area. Skip
to the visualization section to see how this data can be graphed on a map.\n",
"\n",
- "***K-nearest neighbors (KNN) joins***\n",
+ "**K-nearest neighbors (KNN) joins**\n",
"\n",
"Create `restaurants` and `customers` tables so we can demonstrate the KNN
join functionality."
]
},
{
"cell_type": "code",
- "execution_count": 9,
+ "execution_count": null,
"id": "deaa36db-2fee-4ba2-ab79-1dc756cb1655",
"metadata": {},
"outputs": [],
"source": [
"df = sd.sql(\"\"\"\n",
"SELECT name, ST_Point(lng, lat) AS location\n",
- "FROM (VALUES \n",
+ "FROM (VALUES\n",
" (101, -74.0, 40.7, 'Pizza Palace'),\n",
" (102, -73.99, 40.69, 'Burger Barn'),\n",
" (103, -74.02, 40.72, 'Taco Town'),\n",
@@ -259,7 +259,7 @@
"\n",
"df = sd.sql(\"\"\"\n",
"SELECT name, ST_Point(lng, lat) AS location\n",
- "FROM (VALUES \n",
+ "FROM (VALUES\n",
" (1, -74.0, 40.7, 'Alice'),\n",
" (2, -73.9, 40.8, 'Bob'),\n",
" (3, -74.1, 40.6, 'Carol')\n",
@@ -349,17 +349,23 @@
"id": "2e93fe6a-b0a7-4ec0-952c-dde9edcacdc4",
"metadata": {},
"source": [
- "Notice how each customer has two rows - one for each of the two closest
restaurants.\n",
- "\n",
- "## Files\n",
+ "Notice how each customer has two rows - one for each of the two closest
restaurants."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3cb1e53b",
+ "metadata": {},
+ "source": [
+ "## GeoParquet support\n",
"\n",
- "You can read GeoParquet files with SedonaDB, see the following
example:\n",
+ "You can also read GeoParquet files with SedonaDB with `read_parquet()`\n",
"\n",
"```python\n",
- "df = sd.read_parquet(\"some_file.parquet\")\n",
+ "df = sd.read_parquet(\"DATA_FILE.parquet\")\n",
"```\n",
"\n",
- "Once you read the file, you can easily expose it as a view and query it
with spatial SQL, as we demonstrated in the example above."
+ "Once you read the file, you can easily expose it as a view and query it
with spatial SQL, as we demonstrated in the example above.\n"
]
}
],
diff --git a/docs/quickstart-cli.md b/docs/quickstart-cli.md
deleted file mode 100644
index 26d0bd4..0000000
--- a/docs/quickstart-cli.md
+++ /dev/null
@@ -1,94 +0,0 @@
-<!---
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
-
- http://www.apache.org/licenses/LICENSE-2.0
-
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
--->
-
-# CLI Quickstart
-
-SedonaDB's command-line interface provides an interactive SQL shell that can be used to
-leverage the SedonaDB engine for SQL-only/shell-centric workflows. SedonaDB's CLI is
-based on the [DataFusion CLI](https://datafusion.apache.org/user-guide/cli/index.html),
-whose documentation may be useful for advanced features not covered in detail here.
-
-## Installation
-
-You can install `sedona-cli` using Cargo:
-
-```shell
-cargo install sedona-cli
-```
-
-## Usage
-
-Running `sedona-cli` from a terminal will start an interactive SQL shell. Queries must end
-in a semicolon (`;`) and can be cleared with `Control-C`.
-
-```
-Sedona CLI v0.0.1
-> SELECT ST_Point(0, 1) as geom;
-┌────────────┐
-│ geom │
-│ wkb │
-╞════════════╡
-│ POINT(0 1) │
-└────────────┘
-
-1 row(s)/1 column(s) fetched.
-Elapsed 0.024 seconds.
-```
-
-See the [SQL Reference]() for details on the SQL functions and features available to the CLI.
-
-## Help
-
-From the interactive shell, use `\?` for special command help:
-
-```
-> \?
-Command,Description
-\d,list tables
-\d name,describe table
-\q,quit datafusion-cli
-\?,help
-\h,function list
-\h function,search function
-\quiet (true|false)?,print or set quiet mode
-\pset [NAME [VALUE]],"set table output option
-(format)"
-```
-
-From the command line, use `--help` to list launch options and/or options for interacting
-with the CLI in a non-interactive context.
-
-```
-Command Line Client for Sedona's DataFusion-based query engine.
-
-Usage: sedona-cli [OPTIONS]
-
-Options:
- -p, --data-path <DATA_PATH> Path to your data, default to current directory
- -c, --command [<COMMAND>...] Execute the given command string(s), then exit. Commands are expected to be non empty.
- -f, --file [<FILE>...] Execute commands from file(s), then exit
- -r, --rc [<RC>...] Run the provided files on startup instead of ~/.datafusionrc
- --format <FORMAT> [default: automatic] [possible values: csv, tsv, table, json, nd-json, automatic]
- -q, --quiet Reduce printing other than the results and work quietly
- --maxrows <MAXROWS> The max number of rows to display for 'Table' format
- [possible values: numbers(0/10/...), inf(no limit)] [default: 40]
- --color Enables console syntax highlighting
- -h, --help Print help
- -V, --version Print version
-```
diff --git a/docs/quickstart-python.ipynb b/docs/quickstart-python.ipynb
index e243a5c..931b35c 100644
--- a/docs/quickstart-python.ipynb
+++ b/docs/quickstart-python.ipynb
@@ -10,7 +10,7 @@
"SedonaDB for Python can be installed from [PyPI](https://pypi.org):\n",
"\n",
"```shell\n",
- "pip install apache-sedona[db]\n",
+ "pip install \"apache-sedona[db]\"\n",
"```\n",
"\n",
"If you can import the module and connect to a new session, you're good to
go!"
@@ -28,7 +28,7 @@
"text": [
"┌────────────┐\n",
"│ geom │\n",
- "│ wkb │\n",
+ "│ geometry │\n",
"╞════════════╡\n",
"│ POINT(0 1) │\n",
"└────────────┘\n"
@@ -36,9 +36,9 @@
}
],
"source": [
- "import sedonadb\n",
+ "import sedona.db\n",
"\n",
- "sd = sedonadb.connect()\n",
+ "sd = sedona.db.connect()\n",
"sd.sql(\"SELECT ST_Point(0, 1) as geom\").show()"
]
},
@@ -74,7 +74,7 @@
"text": [
"┌──────────────┬───────────────────────────────┐\n",
"│ name ┆ geometry │\n",
- "│ utf8view ┆ wkb_view <epsg:4326> │\n",
+ "│ utf8view ┆ geometry │\n",
"╞══════════════╪═══════════════════════════════╡\n",
"│ Vatican City ┆ POINT(12.4533865 41.9032822) │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
@@ -127,7 +127,7 @@
"text": [
"┌─────────────────────────────┬───────────────┬────────────────────────────────────────────────────┐\n",
"│ name ┆ continent ┆
geometry │\n",
- "│ utf8view ┆ utf8view ┆ wkb_view
<epsg:4326> │\n",
+ "│ utf8view ┆ utf8view ┆
geometry │\n",
"╞═════════════════════════════╪═══════════════╪════════════════════════════════════════════════════╡\n",
"│ Fiji ┆ Oceania ┆ MULTIPOLYGON(((180
-16.067132663642447,180 -16.55… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
@@ -179,7 +179,7 @@
"text": [
"┌───────────────┬──────────────────────┬─────────────────────┬───────────────┬─────────────────────┐\n",
"│ name ┆ geometry ┆ name ┆
continent ┆ geometry │\n",
- "│ utf8view ┆ wkb_view <epsg:4326> ┆ utf8view ┆
utf8view ┆ wkb_view <epsg:432… │\n",
+ "│ utf8view ┆ geometry ┆ utf8view ┆
utf8view ┆ geometry │\n",
"╞═══════════════╪══════════════════════╪═════════════════════╪═══════════════╪═════════════════════╡\n",
"│ Suva ┆ POINT(178.4417073 -… ┆ Fiji ┆ Oceania
┆ MULTIPOLYGON(((180… │\n",
"├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
@@ -237,10 +237,10 @@
"outputs": [],
"source": [
"df = sd.sql(\"\"\"\n",
- "SELECT * FROM (VALUES \n",
- " ('one', ST_GeomFromWkt('POINT(1 2)')), \n",
- " ('two', ST_GeomFromWkt('POLYGON((-74.0 40.7, -74.0 40.8, -73.9 40.8,
-73.9 40.7, -74.0 40.7))')), \n",
- " ('three', ST_GeomFromWkt('LINESTRING(-74.0060 40.7128, -73.9352
40.7306, -73.8561 40.8484)'))) \n",
+ "SELECT * FROM (VALUES\n",
+ " ('one', ST_GeomFromWkt('POINT(1 2)')),\n",
+ " ('two', ST_GeomFromWkt('POLYGON((-74.0 40.7, -74.0 40.8, -73.9 40.8,
-73.9 40.7, -74.0 40.7))')),\n",
+ " ('three', ST_GeomFromWkt('LINESTRING(-74.0060 40.7128, -73.9352
40.7306, -73.8561 40.8484)')))\n",
"AS t(val, point)\"\"\")"
]
},
@@ -254,16 +254,16 @@
"name": "stdout",
"output_type": "stream",
"text": [
-
"┌───────┬───────────────────────────────────────────────────────────────┐\n",
- "│ val ┆ point
│\n",
- "│ utf8 ┆ wkb
│\n",
-
"╞═══════╪═══════════════════════════════════════════════════════════════╡\n",
- "│ one ┆ POINT(1 2)
│\n",
-
"├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
- "│ two ┆ POLYGON((-74 40.7,-74 40.8,-73.9 40.8,-73.9 40.7,-74 40.7))
│\n",
-
"├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
- "│ three ┆ LINESTRING(-74.006 40.7128,-73.9352 40.7306,-73.8561 40.8484)
│\n",
-
"└───────┴───────────────────────────────────────────────────────────────┘\n"
+
"┌───────┬──────────────────────────────────────────────────────────────────────────────────────────┐\n",
+ "│ val ┆ point
│\n",
+ "│ utf8 ┆ binary
│\n",
+
"╞═══════╪══════════════════════════════════════════════════════════════════════════════════════════╡\n",
+ "│ one ┆ 0101000000000000000000f03f0000000000000040
│\n",
+
"├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+ "│ two ┆
0103000000010000000500000000000000008052c09a9999999959444000000000008052c06666666666664…
│\n",
+
"├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+ "│ three ┆
010200000003000000aaf1d24d628052c05e4bc8073d5b444007ce1951da7b52c0933a014d845d4440c286a…
│\n",
+
"└───────┴──────────────────────────────────────────────────────────────────────────────────────────┘\n"
]
}
],
@@ -320,78 +320,60 @@
},
{
"cell_type": "code",
- "execution_count": 12,
- "id": "09eefd76-b325-4d58-8284-92c2340514b8",
+ "execution_count": 16,
+ "id": "de296b34",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
-
"┌───────┬─────────────────────────────────────────────┬────────────────────────────────────────────┐\n",
- "│ val ┆ point ┆
centroid │\n",
- "│ utf8 ┆ wkb ┆
wkb │\n",
-
"╞═══════╪═════════════════════════════════════════════╪════════════════════════════════════════════╡\n",
- "│ one ┆ POINT(1 2) ┆ POINT(1 2)
│\n",
-
"├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
- "│ two ┆ POLYGON((-74 40.7,-74 40.8,-73.9 40.8,-73.… ┆
POINT(-73.95000000000002 40.75) │\n",
-
"├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
- "│ three ┆ LINESTRING(-74.006 40.7128,-73.9352 40.730… ┆
POINT(-73.92111155675562 40.7664673976246… │\n",
-
"└───────┴─────────────────────────────────────────────┴────────────────────────────────────────────┘\n"
+ "┌─────────────┬───────────┬─────────────┐\n",
+ "│ column_name ┆ data_type ┆ is_nullable │\n",
+ "│ utf8 ┆ utf8 ┆ utf8 │\n",
+ "╞═════════════╪═══════════╪═════════════╡\n",
+ "│ val ┆ Utf8 ┆ YES │\n",
+ "├╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+ "│ point ┆ Binary ┆ YES │\n",
+ "└─────────────┴───────────┴─────────────┘\n"
]
}
],
"source": [
- "sd.sql(\"select *, ST_Centroid(point) as centroid from
fun_table\").show()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "d5244831-100e-4d8c-b5df-b796631bffc8",
- "metadata": {},
- "source": [
- "## Interactive mode"
+ "sd.sql(\"DESCRIBE fun_table\").show()"
]
},
{
"cell_type": "code",
- "execution_count": 13,
- "id": "3629c6e6",
+ "execution_count": 18,
+ "id": "09eefd76-b325-4d58-8284-92c2340514b8",
"metadata": {},
"outputs": [
{
- "data": {
- "text/plain": [
- "┌────────────┐\n",
- "│ geom │\n",
- "│ wkb │\n",
- "╞════════════╡\n",
- "│ POINT(0 1) │\n",
- "└────────────┘"
- ]
- },
- "execution_count": 13,
- "metadata": {},
- "output_type": "execute_result"
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+
"┌───────┬─────────────────────────────────────────────┬────────────────────────────────────────────┐\n",
+ "│ val ┆ point ┆
centroid │\n",
+ "│ utf8 ┆ binary ┆
geometry │\n",
+
"╞═══════╪═════════════════════════════════════════════╪════════════════════════════════════════════╡\n",
+ "│ one ┆ 0101000000000000000000f03f0000000000000040 ┆ POINT(1 2)
│\n",
+
"├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+ "│ two ┆ 0103000000010000000500000000000000008052c0… ┆ POINT(-73.95
40.75) │\n",
+
"├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤\n",
+ "│ three ┆ 010200000003000000aaf1d24d628052c05e4bc807… ┆
POINT(-73.92111155675562 40.7664673976246… │\n",
+
"└───────┴─────────────────────────────────────────────┴────────────────────────────────────────────┘\n"
+ ]
}
],
"source": [
- "sedonadb.options.interactive = True\n",
- "sd.sql(\"SELECT ST_Point(0, 1) as geom\")"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "fc34cc93",
- "metadata": {},
- "source": [
- "Most SedonaDB Python users will want to turn on interactive mode when
developing code in a notebook or interactive session. Interactive mode prints
results eagerly, which is usually what you want when interacting with a new
data source or constructing a query. When interacting with large remote data
sources or non-interactive workloads, this is usually *not* what you want;
however, you can use an explicit `.show()` to force executing enough of a query
to show the first few rows."
+ "sd.sql(\"SELECT *, ST_Centroid(ST_GeomFromWKB(point)) as centroid from
fun_table\").show()"
]
}
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3 (ipykernel)",
+ "display_name": ".venv (3.13.3)",
"language": "python",
"name": "python3"
},
@@ -405,7 +387,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.12.8"
+ "version": "3.13.3"
}
},
"nbformat": 4,
diff --git a/docs/quickstart-python.md b/docs/quickstart-python.md
new file mode 100644
index 0000000..836d2bd
--- /dev/null
+++ b/docs/quickstart-python.md
@@ -0,0 +1,228 @@
+# Python Quickstart
+
+SedonaDB for Python can be installed from [PyPI](https://pypi.org):
+
+```shell
+pip install "apache-sedona[db]"
+```
+
+If you can import the module and connect to a new session, you're good to go!
+
+
+```python
+import sedona.db
+
+sd = sedona.db.connect()
+sd.sql("SELECT ST_Point(0, 1) as geom").show()
+```
+
+ ┌────────────┐
+ │ geom │
+ │ geometry │
+ ╞════════════╡
+ │ POINT(0 1) │
+ └────────────┘
+
+
+## Point in polygon join
+
+
+```python
+cities = sd.read_parquet(
+    "https://raw.githubusercontent.com/geoarrow/geoarrow-data/v0.2.0/natural-earth/files/natural-earth_cities_geo.parquet"
+)
+```
+
+
+```python
+cities.show()
+```
+
+ ┌──────────────┬───────────────────────────────┐
+ │ name ┆ geometry │
+ │ utf8view ┆ geometry │
+ ╞══════════════╪═══════════════════════════════╡
+ │ Vatican City ┆ POINT(12.4533865 41.9032822) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ San Marino ┆ POINT(12.4417702 43.9360958) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Vaduz ┆ POINT(9.5166695 47.1337238) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Lobamba ┆ POINT(31.1999971 -26.4666675) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Luxembourg ┆ POINT(6.1300028 49.6116604) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Palikir ┆ POINT(158.1499743 6.9166437) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Majuro ┆ POINT(171.3800002 7.1030043) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Funafuti ┆ POINT(179.2166471 -8.516652) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Melekeok ┆ POINT(134.6265485 7.4873962) │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Bir Lehlou ┆ POINT(-9.6525222 26.1191667) │
+ └──────────────┴───────────────────────────────┘
+
+
+
+```python
+countries = sd.read_parquet(
+    "https://raw.githubusercontent.com/geoarrow/geoarrow-data/v0.2.0/natural-earth/files/natural-earth_countries_geo.parquet"
+)
+```
+
+
+```python
+countries.show()
+```
+
+
┌─────────────────────────────┬───────────────┬────────────────────────────────────────────────────┐
+ │ name ┆ continent ┆
geometry │
+ │ utf8view ┆ utf8view ┆
geometry │
+
╞═════════════════════════════╪═══════════════╪════════════════════════════════════════════════════╡
+ │ Fiji ┆ Oceania ┆ MULTIPOLYGON(((180
-16.067132663642447,180 -16.55… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ United Republic of Tanzania ┆ Africa ┆ POLYGON((33.90371119710453
-0.9500000000000001,34… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Western Sahara ┆ Africa ┆
POLYGON((-8.665589565454809 27.656425889592356,-8… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Canada ┆ North America ┆
MULTIPOLYGON(((-122.84000000000003 49.00000000000… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ United States of America ┆ North America ┆
MULTIPOLYGON(((-122.84000000000003 49.00000000000… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Kazakhstan ┆ Asia ┆ POLYGON((87.35997033076265
49.21498078062912,86.5… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Uzbekistan ┆ Asia ┆ POLYGON((55.96819135928291
41.30864166926936,55.9… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Papua New Guinea ┆ Oceania ┆
MULTIPOLYGON(((141.00021040259185 -2.600151055515… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Indonesia ┆ Asia ┆
MULTIPOLYGON(((141.00021040259185 -2.600151055515… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Argentina ┆ South America ┆
MULTIPOLYGON(((-68.63401022758323 -52.63637045887… │
+
└─────────────────────────────┴───────────────┴────────────────────────────────────────────────────┘
+
+
+
+```python
+cities.to_view("cities")
+countries.to_view("countries")
+```
+
+
+```python
+# join the cities and countries tables
+sd.sql("""
+select * from cities
+join countries
+where ST_Intersects(cities.geometry, countries.geometry)
+""").show()
+```
+
+
┌───────────────┬──────────────────────┬─────────────────────┬───────────────┬─────────────────────┐
+ │ name ┆ geometry ┆ name ┆ continent
┆ geometry │
+ │ utf8view ┆ geometry ┆ utf8view ┆ utf8view
┆ geometry │
+
╞═══════════════╪══════════════════════╪═════════════════════╪═══════════════╪═════════════════════╡
+ │ Suva ┆ POINT(178.4417073 -… ┆ Fiji ┆ Oceania
┆ MULTIPOLYGON(((180… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Dodoma ┆ POINT(35.7500036 -6… ┆ United Republic of… ┆ Africa
┆ POLYGON((33.903711… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Dar es Salaam ┆ POINT(39.266396 -6.… ┆ United Republic of… ┆ Africa
┆ POLYGON((33.903711… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Bir Lehlou ┆ POINT(-9.6525222 26… ┆ Western Sahara ┆ Africa
┆ POLYGON((-8.665589… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Ottawa ┆ POINT(-75.7019612 4… ┆ Canada ┆ North
America ┆ MULTIPOLYGON(((-12… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Vancouver ┆ POINT(-123.1235901 … ┆ Canada ┆ North
America ┆ MULTIPOLYGON(((-12… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Toronto ┆ POINT(-79.389458554… ┆ Canada ┆ North
America ┆ MULTIPOLYGON(((-12… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ San Francisco ┆ POINT(-122.39959956… ┆ United States of A… ┆ North
America ┆ MULTIPOLYGON(((-12… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Denver ┆ POINT(-104.9859618 … ┆ United States of A… ┆ North
America ┆ MULTIPOLYGON(((-12… │
+
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ Houston ┆ POINT(-95.348436256… ┆ United States of A… ┆ North
America ┆ MULTIPOLYGON(((-12… │
+
└───────────────┴──────────────────────┴─────────────────────┴───────────────┴─────────────────────┘
+
+
+## Manually create SedonaDB DataFrames
+
+Let's create a DataFrame with one string column and one geometry column to show some of the functionality of the SedonaDB Python interface.
+
+
+```python
+df = sd.sql("""
+SELECT * FROM (VALUES
+ ('one', ST_GeomFromWkt('POINT(1 2)')),
+ ('two', ST_GeomFromWkt('POLYGON((-74.0 40.7, -74.0 40.8, -73.9 40.8, -73.9 40.7, -74.0 40.7))')),
+ ('three', ST_GeomFromWkt('LINESTRING(-74.0060 40.7128, -73.9352 40.7306, -73.8561 40.8484)')))
+AS t(val, point)""")
+```
+
+
+```python
+df.show()
+```
+
+
┌───────┬──────────────────────────────────────────────────────────────────────────────────────────┐
+ │ val ┆ point
│
+ │ utf8 ┆ binary
│
+
╞═══════╪══════════════════════════════════════════════════════════════════════════════════════════╡
+ │ one ┆ 0101000000000000000000f03f0000000000000040
│
+
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ two ┆
0103000000010000000500000000000000008052c09a9999999959444000000000008052c06666666666664…
│
+
├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ three ┆
010200000003000000aaf1d24d628052c05e4bc8073d5b444007ce1951da7b52c0933a014d845d4440c286a…
│
+
└───────┴──────────────────────────────────────────────────────────────────────────────────────────┘
+
+
+Verify that this object is a SedonaDB DataFrame.
+
+
+```python
+type(df)
+```
+
+
+
+
+ sedonadb.dataframe.DataFrame
+
+
+
+Expose the DataFrame as a view and run a SQL operation on the geometry data.
+
+
+```python
+df.to_view("fun_table")
+```
+
+
+```python
+sd.sql("DESCRIBE fun_table").show()
+```
+
+ ┌─────────────┬───────────┬─────────────┐
+ │ column_name ┆ data_type ┆ is_nullable │
+ │ utf8 ┆ utf8 ┆ utf8 │
+ ╞═════════════╪═══════════╪═════════════╡
+ │ val ┆ Utf8 ┆ YES │
+ ├╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ point ┆ Binary ┆ YES │
+ └─────────────┴───────────┴─────────────┘
+
+
+
+```python
+sd.sql("SELECT *, ST_Centroid(ST_GeomFromWKB(point)) as centroid from fun_table").show()
+```
+
+ ┌───────┬─────────────────────────────────────────────┬────────────────────────────────────────────┐
+ │ val   ┆ point                                       ┆ centroid                                   │
+ │ utf8  ┆ binary                                      ┆ geometry                                   │
+ ╞═══════╪═════════════════════════════════════════════╪════════════════════════════════════════════╡
+ │ one   ┆ 0101000000000000000000f03f0000000000000040  ┆ POINT(1 2)                                 │
+ ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ two   ┆ 0103000000010000000500000000000000008052c0… ┆ POINT(-73.95 40.75)                        │
+ ├╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
+ │ three ┆ 010200000003000000aaf1d24d628052c05e4bc807… ┆ POINT(-73.92111155675562 40.7664673976246… │
+ └───────┴─────────────────────────────────────────────┴────────────────────────────────────────────┘
diff --git a/docs/reference/read-parquet-files.md b/docs/reference/read-parquet-files.md
index 927598d..6dc4836 100644
--- a/docs/reference/read-parquet-files.md
+++ b/docs/reference/read-parquet-files.md
@@ -28,19 +28,29 @@ The `sd.sql()` function is designed to query tables that have already been regis
The correct process is a two-step approach:
-1. **Load** the Parquet file into a DataFrame using `sd.read_parquet()`.
-1. **Register** the DataFrame as a temporary view using `.createOrReplaceTempView()`.
+1. **Load** the Parquet file into a data frame using `sd.read_parquet()`.
+1. **Register** the data frame view with `to_view()`.
1. **Query** the view using `sd.sql()`.
-```python
-# 1. Load the Parquet file from a URL into a DataFrame
+```python linenums="1" title="Read a parquet file with SedonaDB"
+
+import sedona.db
+sd = sedona.db.connect()
+
+df = sd.read_parquet(
+ 's3://wherobots-benchmark-prod/SpatialBench_sf=1_format=parquet/'
+ 'building/building.parquet'
+)
+
+# Load the Parquet file, which creates a Pandas data frame
df = sd.read_parquet('s3://wherobots-benchmark-prod/SpatialBench_sf=1_format=parquet/building/building.parquet')
-# 2. Register the DataFrame as a temporary view named 'buildings'
-df.createOrReplaceTempView('buildings')
+# Convert the Pandas data frame to a Spark data frame AND
+# register it as a temporary view in a single line.
+spark.createDataFrame(df).to_view("zone")
-# 3. Now, query the view using SQL
-sd.sql("SELECT * FROM buildings LIMIT 10").show()
+# Now, query the view using SQL
+sd.sql("SELECT * FROM zone LIMIT 10").show()
```
### Common Errors
@@ -56,6 +66,6 @@ sd.sql("SELECT * FROM 's3://wherobots-benchmark-prod/SpatialBench_sf=1_format=pa
**Resulting Error:**
-```
+```bash
sedonadb._lib.SedonaError: Error during planning: table '...s3://...' not found
```
diff --git a/docs/requirements.txt b/docs/requirements.txt
index a6c075f..f6a1590 100644
--- a/docs/requirements.txt
+++ b/docs/requirements.txt
@@ -1,3 +1,4 @@
+jupyter
mike
mkdocs-git-revision-date-localized-plugin
mkdocs-glightbox
@@ -5,5 +6,6 @@ mkdocs-jupyter
mkdocs-macros-plugin
mkdocs-material
mkdocstrings[python]
+nbconvert
ruff
pyproj
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index b197694..a7ad722 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -32,26 +32,27 @@
width: auto;
}
-/* ==========================================================================
- Navigation Tabs Styles
- ========================================================================== */
+/* --- Definitive Navigation CSS (Final Version) --- */
-/* The main navigation bar container (the red bar) */
+/* 1. Set the height of the main navigation bar */
.md-tabs {
background-color: var(--color-red);
+ height: 2.5rem; /* Set an explicit, predictable height for the bar */
}
-/* This ensures the navigation links are centered */
+/* 2. Control the alignment of the links within the bar */
.md-tabs .md-tabs__list {
- justify-content: center;
+ height: 100%; /* Make the link container fill the bar's height */
+ justify-content: center; /* Center links horizontally */
+ align-items: center; /* NEW: Center links vertically */
+ flex-wrap: wrap; /* Allow wrapping on small screens */
}
-/* Styles for each link in the navigation bar */
+/* 3. Style the individual navigation links */
.md-tabs__link {
- font-family: var(--font-inter);
- color: var(--color-white);
-
- /* You can adjust the padding here to control spacing */
- /* The first value is top/bottom, the second is left/right. */
- padding: .5rem .6rem;
+ font-weight: 400;
+ color: rgba(255, 255, 255, 0.85);
+ /* We no longer need vertical padding for spacing */
+ padding: 0 0.9rem;
+ font-size: 0.65rem; /* NEW: Adjust font size */
}
diff --git a/mkdocs.yml b/mkdocs.yml
index 62392c0..12607f4 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -1,11 +1,11 @@
site_name: SedonaDB
site_description: "Documentation for Apache SedonaDB"
+site_url: https://sedona.apache.org/sedonadb/
nav:
- - Home: index.md
+ - SedonaDB: index.md
- SedonaDB Guides:
+ - Python Quickstart: quickstart-python.md
- SedonaDB Guide: programming-guide.ipynb
- - CLI Quickstart: quickstart-cli.md
- - Python Quickstart: quickstart-python.ipynb
- Development: development.md
- SedonaDB Reference:
- Python:
@@ -15,8 +15,10 @@ nav:
- Spatial Joins: reference/sql-joins.md
- Read Parquet Files: reference/read-parquet-files.md
- Blog: "https://sedona.apache.org/latest/blog/"
+ - Community: "https://sedona.apache.org/latest/community/contact/"
- Apache Software Foundation: "https://sedona.apache.org/latest/asf/asf/"
- - Return to Sedona Homepage: "https://sedona.apache.org/latest/"
+ - Sedona Homepage: "https://sedona.apache.org/latest/"
+
repo_url: https://github.com/apache/sedona-db
repo_name: apache/sedona-db
theme:
@@ -50,7 +52,6 @@ extra:
default:
- latest
-
extra_css:
- stylesheets/extra.css