I could be completely wrong here.

My understanding is that DuckDB uses its httpfs extension, or possibly some variant of fsspec.

I believe /vsis3/ relies solely on libcurl, which doesn't *appear* to support httpfs.

Again, I could be wildly wrong.
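
If it helps narrow this down, here is a rough sketch (not tested against this dataset) that wraps the ogr2ogr call quoted below so the amount of network traffic can be inspected. The helper names `build_call` and `run` are made up for illustration. CPL_DEBUG=ON comes from the original message; CPL_VSIL_SHOW_NETWORK_STATS=YES is an assumption on my part that the GDAL build in use supports it.

```python
import os
import shutil
import subprocess

# Dataset path and filter copied from the original message below.
SRC = ("PARQUET:/vsis3/overturemaps-us-west-2/release/2024-08-20.0/"
       "theme=divisions/type=division_area")
WHERE = "subtype='county' AND country='US' AND region='US-VT'"


def build_call(dst="/tmp/vt.geojson"):
    """Return (argv, env) for the ogr2ogr run with diagnostics enabled."""
    argv = ["ogr2ogr", dst, SRC,
            "-select", "id,division_id,names.primary",
            "-where", WHERE]
    env = dict(os.environ)
    env["CPL_DEBUG"] = "ON"  # as used in the original message
    # Assumption: this GDAL build honours the network statistics option.
    env["CPL_VSIL_SHOW_NETWORK_STATS"] = "YES"
    return argv, env


def run(dst="/tmp/vt.geojson"):
    """Run ogr2ogr if it is on PATH; return its exit code, or None if missing."""
    if shutil.which("ogr2ogr") is None:
        return None
    argv, env = build_call(dst)
    return subprocess.run(argv, env=env).returncode
```

Calling `run()` and comparing the reported request counts and byte totals against the size of the files under that prefix should at least show whether whole files are being fetched or only selected ranges.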

On 8/28/24 09:45, Daniel Baston via gdal-dev wrote:
Hello,

I'm trying to use ogr2ogr with an attribute filter to pull 14 polygons
from Overture maps. Running the following command with CPL_DEBUG=ON
tells me that "PARQUET: Attribute filter fully translated to Arrow"
yet it takes about 7 minutes to complete, and appears to download
quite a bit of data:

ogr2ogr /tmp/vt.geojson \
  "PARQUET:/vsis3/overturemaps-us-west-2/release/2024-08-20.0/theme=divisions/type=division_area" \
  -select "id,division_id,names.primary" \
  -where "subtype='county' AND country='US' AND region='US-VT'"

Have I made a mistake in my ogr2ogr invocation? For comparison,
running what I believe to be an equivalent query in DuckDB takes about
10 seconds:

SELECT
    id,
    division_id,
    names.primary,
    ST_GeomFromWKB(geometry) AS geometry
FROM
    read_parquet('s3://overturemaps-us-west-2/release/2024-08-20.0/theme=divisions/type=division_area/*', hive_partitioning=1)
WHERE
    subtype = 'county'
    AND country = 'US'
    AND region = 'US-VT';

I am using GDAL master (e09d07a7) and libarrow 16.1.

Thanks,
Dan
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev