Hello GDAL'ers,

I have made a few attempts at using ogr2ogr to get bounding-box-based
extracts from the Overture Maps datasets.

Unfortunately, I have not been able to do so: an extract that takes duckdb
or overturemaps-py <https://github.com/OvertureMaps/overturemaps-py> 30s or
less takes forever with ogr2ogr. overturemaps-py is essentially a wrapper
over pyarrow, with the Arrow filter constructed from the bbox.

I suspect I am doing something wrong. The less likely possibility is that
ogr2ogr is not the right tool for this.

Attempt 1: Command at the top of the link
---------------------------------------------
https://pastebin.com/bh05Kcww

Attempt 2:
----------------------------------------------

https://pastebin.com/BG3WmQ9Y

From what I can tell, all row groups from each of the parquet files are
being loaded and checked. This is clearly not what should happen for a
bounding box query.
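
To make clear what I expected instead, here is a minimal sketch in plain
Python (not GDAL or Arrow code) of row-group pruning. It assumes each row
group exposes min/max statistics for the bbox columns, summarised here as
one envelope per row group. The names BBox, may_intersect and
row_groups_to_read are hypothetical and only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def may_intersect(query: BBox, envelope: BBox) -> bool:
    """Return True if a row group's envelope could contain matching features.

    `envelope` is assumed to come from row-group statistics: the minimum of
    xmin/ymin and the maximum of xmax/ymax over all features in the group.
    """
    return (
        envelope.xmin <= query.xmax
        and envelope.xmax >= query.xmin
        and envelope.ymin <= query.ymax
        and envelope.ymax >= query.ymin
    )


def row_groups_to_read(query: BBox, envelopes: list[BBox]) -> list[int]:
    """Indices of the row groups that cannot be skipped for this query."""
    return [i for i, env in enumerate(envelopes) if may_intersect(query, env)]
```

With a filter like this, only the row groups whose envelope overlaps the
requested bbox would need to be fetched and decoded; the rest could be
skipped from the statistics alone.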

Below are my libs and versions on Ubuntu 20.04. All attempts were made
within a conda environment.

gdal                      3.9.2
gcc_linux-64              12.4.0
libarrow                  17.0.0
libarrow-dataset          17.0.0
libparquet                17.0.0
zstd                      1.5.6
libgdal-core              3.9.2
libgdal-arrow-parquet     3.9.2
libcurl/8.9.1
OpenSSL/3.3.2

I typically use the command line tools to test GDAL/OGR's functionality and
performance before embedding that functionality in my own C++ app. Thus,
while there are other tools, I would love to understand how to do this in
GDAL/OGR.

Please advise!

cheers,
Varun
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
