GitHub user mpetazzoni created a discussion: Improving the Getting Started documentation
Hi Sedona community and maintainers!

I would like to open a discussion about the Apache Sedona documentation, in particular about making it easier for first-time users to get started with Apache Sedona and work with their geospatial data.

I personally find (and I suspect others do as well) the current information architecture and documentation structure on the [Apache Sedona website](https://sedona.apache.org) quite confusing: it neither presents nor matches the mental model that people have when they come in to use Sedona. I think we have an opportunity to make this path more straightforward by reworking most of the "[Setup](https://sedona.apache.org/latest/setup/overview/)" and "[Programming Guides](https://sedona.apache.org/latest/tutorial/sql/)" sections into a smoother flow along the "paved path" for the most common use cases, while updating some of the language and copywriting.

We can, and should, do so in a way that progressively takes the user through the learning curve of using Sedona and working with geospatial data, starting from simple concepts and situations and introducing more complex ones later. We should also move advanced or contributor-only content into its own area of the documentation. For example, the build instructions on the "[Play Sedona in Docker](https://sedona.apache.org/latest/setup/docker/)" page are irrelevant to a user who is here to try or use Sedona.

* The "Getting Started" path could center on a simple, local, single-node Sedona on Apache Spark deployment using the Docker image, and introduce how to work with Spatial SQL on simple Parquet and/or CSV datasets;
* It would then introduce how to work with Iceberg tables, and more complex Spatial SQL queries and joins;
* It would graduate to working in Python, and progressively introduce the capabilities of the Python ecosystem around Sedona for visualizations, UDFs, etc.;
* A separate "learning path" would introduce the various deployment options (Sedona on Flink, SedonaSnow, Sedona on EMR, Sedona in Databricks, Wherobots);
* Finally, we would have sections covering each set of capabilities in complete detail:
  * Working with all the various file and dataset formats
  * All types of spatial joins
  * Advanced geospatial algorithms
  * Geometry vs Geography

Of course, we would continue to maintain the full API/programming reference for SQL, Python, and Java/Scala.

What do you think?

GitHub link: https://github.com/apache/sedona/discussions/2311

----
This is an automatically sent email for [email protected]. To unsubscribe, please send an email to: [email protected]
