Hello GDAL'ers,

I have made a few attempts at using ogr2ogr to get bounding-box-based extracts from the overturemaps datasets. Unfortunately I have not been able to: something that takes duckdb or overturemaps-py <https://github.com/OvertureMaps/overturemaps-py> 30s or less takes forever with ogr2ogr. overturemaps-py is essentially a wrapper over pyarrow, with the arrow filter constructed from the bbox.

I suspect I am doing something wrong. Less likely, ogr2ogr is not the right tool for this.

Attempt 1: Command at the top of the link
---------------------------------------------
https://pastebin.com/bh05Kcww

Attempt 2:
----------------------------------------------
https://pastebin.com/BG3WmQ9Y

From what I can tell, all row groups from each of the parquet files are being loaded and checked. This is clearly not correct.

Below are my libs and versions on ubuntu 20.04. All attempts are within a conda environment.

gdal 3.9.2
gcc_linux-64 12.4.0
libarrow 17.0.0
libarrow-dataset 17.0.0
libparquet 17.0.0
zstd 1.5.6
libgdal-core 3.9.2
libgdal-arrow-parquet 3.9.2
libcurl/8.9.1
OpenSSL/3.3.2

I typically use the command line tools to test gdal/ogr's functionality and performance before I embed that functionality in my own c++ app. So, while there are other tools, I would love to understand how to do this in GDAL/OGR.

Please advise!

cheers,
Varun
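
P.S. To make concrete what I mean by not loading every row group, here is a minimal sketch in plain Python of the pruning I would expect from a bbox filter. It is not how GDAL, pyarrow, or duckdb actually implement this. The RowGroupStats class, the function names, and the sample numbers are all made up for illustration, and it assumes each row group carries min/max statistics for the bbox xmin, xmax, ymin, and ymax columns.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical min/max statistics of one row group's bbox columns."""
    min_xmin: float  # smallest bbox.xmin value in the row group
    max_xmax: float  # largest bbox.xmax value in the row group
    min_ymin: float  # smallest bbox.ymin value in the row group
    max_ymax: float  # largest bbox.ymax value in the row group


def may_intersect(stats, query):
    """Return True if some row in the group could overlap the query bbox.

    query is (xmin, ymin, xmax, ymax). A row overlaps the query when
    bbox.xmin < query xmax, bbox.xmax > query xmin, and likewise for y.
    If even the most favourable statistic fails one of these tests,
    no row in the group can pass, so the group can be skipped unread.
    """
    qxmin, qymin, qxmax, qymax = query
    return (
        stats.min_xmin < qxmax
        and stats.max_xmax > qxmin
        and stats.min_ymin < qymax
        and stats.max_ymax > qymin
    )


def row_groups_to_read(all_stats, query):
    """Indices of the row groups that cannot be ruled out by statistics."""
    return [i for i, s in enumerate(all_stats) if may_intersect(s, query)]


# Made-up statistics for three row groups in different parts of the world.
groups = [
    RowGroupStats(min_xmin=-10.0, max_xmax=-5.0, min_ymin=40.0, max_ymax=45.0),
    RowGroupStats(min_xmin=70.0, max_xmax=80.0, min_ymin=10.0, max_ymax=20.0),
    RowGroupStats(min_xmin=100.0, max_xmax=110.0, min_ymin=-5.0, max_ymax=5.0),
]
query = (72.0, 12.0, 78.0, 18.0)  # xmin, ymin, xmax, ymax

print(row_groups_to_read(groups, query))  # only the second group survives: [1]
```

With a small query bbox, most row groups should drop out at this stage, and only the survivors need to be fetched and filtered row by row.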
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev