Hi all,

Following the momentum in specifications for storing geospatial vector data
using the Arrow columnar format [1][2] and the recent addition of Feather,
IPC, and Parquet drivers in GDAL [3], I've been prototyping an R package
[4] to add geospatial support to the Arrow R bindings (e.g., register
extension types to support passing Arrow-encoded geometry extension columns
through the query engine, convert arrays to/from various R representations
of geometry).

Currently I'm prototyping the R package with a small, mostly header-only C++
library based on the C Data interface [5]. This is not ideal since there is
some overlap with Arrow C++ (e.g., there is also an ArrayBuilder
implementation and something like an Array class for the geo-specific
arrays), but the library was fun to prototype, helped me learn the details
of the columnar format, and is a good fit for the scope of the R package,
whose job is mostly to convert Arrays that have been read using the arrow R
package into objects that R users/package developers can manipulate.

I would prefer to build geospatial support based on Arrow C++ or some other
on-label Arrow library, but the complexities of linking other geospatial
libraries (e.g., GDAL, GEOS, PROJ) make it a poor fit to include within
Arrow C++ itself, and the complexities of linking Arrow C++ make it a poor
fit for including another copy of it in a separate R or Python library
(with all the duplicated maintenance effort that would entail).

I suppose my question is: how should I move forward with
building/contributing geospatial support? I feel too new to the Arrow
community to suggest an initial direction, but would be happy to contribute
any of the code I've written into Arrow, rewrite the geospatial bits on top
of an easier-to-vendor Arrow C++, or anything in between!

Cheers,

-dewey

[1] https://github.com/opengeospatial/geoparquet
[2] https://github.com/geopandas/geo-arrow-spec
[3] https://gdal.org/drivers/vector/parquet.html#vector-parquet
[4] https://github.com/paleolimbot/geoarrow
[5] https://github.com/paleolimbot/geoarrow-cpp
