Hi Reynold and team,
I’m glad to see that the Spark community is recognizing the importance
of geospatial support. The Sedona community has long been a strong
advocate for Spark, and we’ve proudly supported large-scale geospatial
workloads on Spark for nearly a decade. We’re absolutely open to
col
*1. Domain types evolve quickly.*
It has taken years for Parquet to include these new types in its format...
We could evolve alongside Parquet. Unfortunately, Spark is not known for
upgrading its dependencies quickly.
*2. Geospatial in Java and Python is a dependency hell.*
How has Par
While I don’t think Spark should become a super specialized geospatial
processing engine, I don’t think it makes sense to focus *only* on reading
and writing from storage. Geospatial is a pretty common and fundamental
capability of analytics systems and virtually every mature and popular
analytics
Hi Wenchen, Menelaos and Szehon,
Thanks for the clarification — I’m glad to hear the primary motivation of this
SPIP is focused on reading and writing geospatial data with Parquet and
Iceberg. That’s an important goal, and I want to highlight that this problem is
being solved by the Apache Sedo
To continue along the line of thought of Szehon:
I am really excited that the Parquet and Iceberg communities have adopted
geospatial logical types, and of course I am grateful for the work that
went into that effort.
As both Wenchen and Szehon pointed out in their own way, the goal is to have
minim
Thank you Menelaos, will do! To give a little background: Jia and the Sedona community, the GeoParquet community, and others put a great deal of effort into defining the Parquet and Iceberg geo types, which couldn't have been done without their experience and help! But I do agree with Wenchen, now tha
Hello Jia,
Wenchen summarized the intent very clearly. The scope of the proposal is
primarily the type system and storage, not processing. Let’s work together on
the technical details and make sure the work we propose to do in Spark works
best with Apache Sedona.
Best,
Menelaos
> On Mar 29,
Hi Jia,
This is a good question. As the shepherd of this SPIP, I'd like to clarify
the motivation here: the focus of this project is more about the storage
part, not the processing. Apache Sedona is a great library for geo
processing, but without native geo type support in Spark, users can't do
th