I have a collection of Parquet files, all with the same schema: different STAC 
collections written to Parquet using GeoPandas. 

 

When I query a directory of Parquet files with SQL, either at the CLI or in 
Python, I get timestamp casting errors:


gdal vector info -i PARQUET:s3://mybucket/stac/mds/rasters/ --sql "select * 
from 'rasters' where st_intersects(geometry, st_geomfromtext('POLYGON 
((-68.00948853933728 17.7602787370086, -64.99052950907739 17.7602787370086, 
-64.99052950907739 18.6509945435268, -68.00948853933728 18.6509945435268, 
-68.00948853933728 17.7602787370086))'))" --dialect sqlite -f text

INFO: Open of `PARQUET:s3://grid-dev-publiclidar/stac/mds/rasters/'

      using driver `Parquet' successful.

 

Layer name: SELECT

Geometry: Polygon

ERROR 1: ReadNext() failed: Casting from timestamp[us, tz=UTC] to timestamp[ns, 
tz=UTC] would result in out of bounds timestamp: 237718454400000000

Feature Count: 34

ERROR 1: ReadNext() failed: Casting from timestamp[us, tz=UTC] to timestamp[ns, 
tz=UTC] would result in out of bounds timestamp: 237718454400000000

Extent: (-180.000000, -90.000000) - (180.000000, 83.999167)
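
For context on the error above, here is a minimal stdlib-only sketch (an illustration I added, not part of the original report) showing why that particular value cannot survive the microsecond-to-nanosecond cast: interpreted as microseconds since the Unix epoch it is a date far in the future, and multiplying it by 1000 to get nanoseconds overflows a signed 64-bit integer.

```python
import datetime

# The offending value from the GDAL error, in microseconds since the epoch.
us = 237718454400000000

# As a datetime it is a (presumably bogus) far-future date, well past the
# ~year-2262 limit that timestamp[ns] can represent in int64 nanoseconds.
epoch = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
dt = epoch + datetime.timedelta(microseconds=us)
print(dt.year)  # a year in the 9500s

# Converting to nanoseconds would need us * 1000, which exceeds int64 max.
print(us * 1000 > 2**63 - 1)  # True
```

This suggests the cast itself is behaving as documented; the question is why the multi-file (dataset) read path casts to timestamp[ns] while the single-file path apparently does not.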

 

If I query file by file for each Parquet file in the directory, I don't get an 
error. 

 

Is this a bug, or a problem with the SQLite dialect and Parquet?

-- 

Michael Smith

RSGIS Center – ERDC CRREL NH

US Army Corps

 

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
