[D] Improving the Getting Started documentation [sedona]

via GitHub Tue, 26 Aug 2025 15:24:40 -0700


GitHub user mpetazzoni created a discussion: Improving the Getting Started 
documentation


Hi Sedona community and maintainers!

I would like to open a discussion around the Apache Sedona documentation, in 
particular making it easier for first-time users to get started with Apache 
Sedona and work with their geospatial data. I personally find (and I suspect 
others do as well) the current information architecture and documentation 
structure on the [Apache Sedona website](https://sedona.apache.org) quite 
confusing: it neither presents nor matches the mental model that people bring 
when they first come to Sedona.

I think we have an opportunity to make this path more straightforward by 
reworking most of the 
"[Setup](https://sedona.apache.org/latest/setup/overview/)" and "[Programming 
Guides](https://sedona.apache.org/latest/tutorial/sql/)" sections into a 
smoother flow along the "paved path" usage for the most common use cases, while 
updating some of the language/copywriting.

We can/should also do so in a way that progressively takes the user through the 
learning curve of using Sedona and working with geospatial data, starting from 
simple concepts and situations, and introducing more complex ones later.

We should also move advanced or contributor-only content into a separate 
area of the documentation. For example, the build instructions on the "[Play 
Sedona in Docker](https://sedona.apache.org/latest/setup/docker/)" page are 
irrelevant to a user who is here to try or use Sedona.

* The "Getting Started" path could center around a simple, local, single-node 
Sedona on Apache Spark deployment using the Docker image, and introduce how 
to work with Spatial SQL using simple Parquet and/or CSV datasets;
* It would then introduce how to work with Iceberg tables, and more complex 
Spatial SQL queries and joins;
* It would graduate to working in Python, and progressively introduce the 
capabilities of the Python ecosystem around Sedona for visualizations, UDFs, 
etc.;
* A separate "learning path" would introduce the various deployment options 
(Sedona on Flink, SedonaSnow, Sedona on EMR, Sedona in Databricks, Wherobots);
* Finally, we would have sections covering each set of capabilities in complete 
detail:
  * Working with all the various file and dataset formats
  * All types of spatial joins
  * Advanced geospatial algorithms
  * Geometry vs Geography

Of course, we would continue to maintain the full API/programming reference for 
SQL, Python, and Java/Scala.

What do you think?

GitHub link: https://github.com/apache/sedona/discussions/2311

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]
