This is an automated email from the ASF dual-hosted git repository.
jiayu pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/sedona-spatialbench.git
The following commit(s) were added to refs/heads/main by this push:
new 8342248 fix: Update docs and readme (#74)
8342248 is described below
commit 834224842391919c7bab462eea36f90e29eb0ef5
Author: Jia Yu <[email protected]>
AuthorDate: Wed Jan 14 21:31:12 2026 -0700
fix: Update docs and readme (#74)
---
README.md | 32 ++++++++++++++++++++++++++++++++
docs/index.md | 11 +++++++++++
docs/single-node-benchmarks.md | 36 +++++++++++++++++++++++++++++++++++-
3 files changed, 78 insertions(+), 1 deletion(-)
diff --git a/README.md b/README.md
index 24680ea..2ffccc3 100644
--- a/README.md
+++ b/README.md
@@ -39,6 +39,38 @@ You can print the queries in your dialect of choice using the following command:
./spatialbench-queries/print_queries.py <dialect>
```
+## Automated Benchmarks
+
+SpatialBench includes an automated benchmark framework that runs on GitHub Actions to verify that all queries are fully runnable across supported engines.
+
+> **Note:** The GitHub Actions benchmark is designed to validate correctness and runnability, not for serious performance comparisons. For meaningful performance benchmarks, please run SpatialBench on dedicated hardware with appropriate scale factors. See the [Single Node Benchmarks](https://sedona.apache.org/spatialbench/single-node-benchmarks/) page for detailed performance results.
+
+The automated tests cover:
+
+- 🦆 **DuckDB** - In-process analytical database with spatial extension
+- 🐼 **GeoPandas** - Python geospatial data analysis library
+- 🌵 **SedonaDB** - High-performance spatial analytics engine
+- 🐻‍❄️ **Spatial Polars** - Geospatial extension for Polars dataframes
+
+### View Latest Results
+
+You can view the latest results on the [GitHub Actions page](../../actions/workflows/benchmark.yml). Click on any successful workflow run to see the summary with:
+
+- Query execution times for each engine
+- Performance comparison across all 12 queries
+- Winner highlighting for each query
+
+### Run Benchmarks Manually
+
+You can trigger a benchmark run manually from the [Actions tab](../../actions/workflows/benchmark.yml) with configurable options:
+
+- **Scale Factor**: 0.1, 1, or 10
+- **Engines**: Select which engines to benchmark
+- **Query Timeout**: Adjust timeout for longer queries
+- **Runs per Query**: 1, 3, or 5 runs for averaging
+
+The benchmark data is automatically downloaded from [Hugging Face](https://huggingface.co/datasets/apache-sedona/spatialbench) and cached for subsequent runs.
+
## Data Model
SpatialBench defines a spatial star schema with the following tables:
diff --git a/docs/index.md b/docs/index.md
index 55046a6..c33c007 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -86,6 +86,17 @@ ORDER BY nearby_pickup_count DESC;
This query performs a distance join, followed by an aggregation. It's a great example of a query that's useful for performance benchmarking a spatial engine that can process vector geometries.
+## Automated Testing
+
+SpatialBench includes an automated benchmark that runs on GitHub Actions to verify that all queries are fully runnable across supported engines (DuckDB, GeoPandas, SedonaDB, and Spatial Polars).
+
+**[View the latest test results →](https://github.com/apache/sedona-spatialbench/actions/workflows/benchmark.yml)**
+
+Click on any successful workflow run and scroll to the **Summary** section to see the results.
+
+!!! note
+    The GitHub Actions benchmark is designed to validate correctness and runnability, not for serious performance comparisons. For meaningful performance benchmarks, see the [Single Node Benchmarks](single-node-benchmarks.md) page.
+
## Join the community
Feel free to start a [GitHub Discussion](https://github.com/apache/sedona/discussions) or join the [Discord community](https://discord.gg/9A3k5dEBsY) to ask the developers any questions you may have.
diff --git a/docs/single-node-benchmarks.md b/docs/single-node-benchmarks.md
index 4337091..d1d6e37 100644
--- a/docs/single-node-benchmarks.md
+++ b/docs/single-node-benchmarks.md
@@ -97,7 +97,41 @@ SedonaDB completes KNN joins at both SF 1 and SF 10, thanks to its native operat
SedonaDB demonstrates balanced performance across all query types and scales effectively to SF 10. DuckDB excels at spatial filters and some geometric operations but faces challenges with complex joins and KNN queries. GeoPandas, while popular in the Python ecosystem, requires manual optimization and parallelization to handle larger datasets effectively.
-## Benchmark code
+## Automated Benchmarks (GitHub Actions)
+
+We run automated benchmarks on every pull request and periodically via GitHub Actions to verify that all SpatialBench queries are fully runnable across supported engines.
+
+!!! note "Not for Performance Comparison"
+    The GitHub Actions benchmark is designed to validate correctness and runnability, **not** for serious performance comparisons. GitHub Actions runners have variable performance characteristics and limited resources. For meaningful performance benchmarks, please run SpatialBench on dedicated hardware with appropriate scale factors as described in the sections above.
+
+### View Latest Results
+
+Visit the [GitHub Actions Benchmark Page](https://github.com/apache/sedona-spatialbench/actions/workflows/benchmark.yml) to see the latest results. Click on any successful workflow run and scroll to the **Summary** section to view:
+
+- Query execution status for each engine
+- Comparison across all 12 queries
+- Error and timeout information
+
+### Supported Engines
+
+The automated tests cover:
+
+- 🦆 **DuckDB** - In-process analytical database with spatial extension
+- 🐼 **GeoPandas** - Python geospatial data analysis library
+- 🌵 **SedonaDB** - High-performance spatial analytics engine
+- 🐻‍❄️ **Spatial Polars** - Geospatial extension for Polars dataframes
+
+### Run Your Own Benchmark
+
+You can trigger the automated tests manually from the [Actions tab](https://github.com/apache/sedona-spatialbench/actions/workflows/benchmark.yml) with configurable options:
+
+- **Scale Factor**: 0.1, 1, or 10
+- **Engines**: Select which engines to test
+- **Query Timeout**: Adjust timeout for longer queries (default: 60s)
+- **Runs per Query**: 1, 3, or 5 runs for averaging (default: 3)
+- **Package Versions**: Pin specific versions or use latest
+
+## Benchmark Code
You can access and run the benchmark code in the [sedona-spatialbench GitHub](https://github.com/apache/sedona-spatialbench) repository.