I'm not sure that providing a fix to work around this very broken behavior is the best course of action to make them fix their server...

On Tue, Sep 10, 2024 at 5:07 PM Even Rouault via gdal-dev <gdal-dev@lists.osgeo.org> wrote:

> On 10/09/2024 at 16:10, Rahkonen Jukka via gdal-dev wrote:
>
> > Hi,
> >
> > Have you tried the configuration option “CPL_VSIL_CURL_USE_HEAD=[YES/NO]:
> > Defaults to YES. Controls whether to use a HEAD request when opening a
> > remote URL.”?
>
> I was just going to suggest that too. It "works", but not really. It just
> postpones the core issue: the server doesn't support GET Range requests, so
> it can't be used with /vsicurl/.
>
> As the file has a COG organization with overview data first, if you want
> to read the smallest overview(s) you can use /vsicurl_streaming/ instead,
> but that won't be efficient for reading the bottom-rightmost tile of the
> full resolution, which will require reading the whole file...
>
> Nothing GDAL can do about that.
>
> Actually... digging further... it somehow supports Range requests, but in
> what I believe is a non-compliant way. It does return the expected content,
> but returns HTTP 200 and not HTTP 206 (Partial Content). And it never
> returns the Content-Length header.
>
> Well, I've implemented a workaround in
> https://github.com/OSGeo/gdal/pull/10760 that might be useful in other
> similar cases too.
>
> With that, the following works:
>
>   gdal_translate \
>     "/vsicurl?file_size=unlimited&url=https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif" \
>     --config GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR \
>     out.tif -srcwin 5000 5000 50 50
>
> file_size=unlimited works here since the GTiff driver doesn't really need
> to know the right file size; it just checks at some points that we don't
> try to read beyond it, so unlimited is OK. In other situations/drivers, the
> exact value could be needed.
>
> But they should really fix their servers.
>
> Even
>
> > -Jukka Rahkonen-
> >
> > *From:* gdal-dev <gdal-dev-boun...@lists.osgeo.org> *On behalf of* Daniel Evans via gdal-dev
> > *Sent:* Tuesday, 10 September 2024 16:57
> > *To:* 'gdal-dev@lists.osgeo.org' (gdal-dev@lists.osgeo.org) <gdal-dev@lists.osgeo.org>
> > *Subject:* [gdal-dev] Ignore content-length in vsicurl?
> >
> > Hi all,
> >
> > I am attempting to read a dataset via /vsicurl/ where I believe the server
> > is incorrectly returning `content-length: 0` in response to HEAD requests.
> > This causes GDAL to believe it's a zero-length file, and it therefore can't
> > be read.
> >
> > If I download the file via HTTP GET, it's valid, and GDAL can read it
> > locally. I've also confirmed I can use /vsicurl/ on some test datasets in
> > the GDAL repo.
> >
> > Is it possible to force GDAL to work around the faulty content-length
> > header, or is it too fundamental a problem to ignore?
> >
> > I've separately got in touch with the data provider to see if they are
> > able to fix the issue at their end.
> >
> > Cheers,
> > Daniel
> >
> > URL of the troublesome dataset:
> >
> > https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif
> >
> > Example HTTP header responses I'm seeing:
> >
> > GET
> >
> > HTTP/2 200
> > date: Tue, 10 Sep 2024 13:47:54 GMT
> > content-type: binary/octet-stream
> > content-length: 278198294
> > vary: Origin, Access-Control-Request-Method, Access-Control-Request-Headers
> > etag: "a79f3f685281d6681e4d362536c5b3eb-34"
> > last-modified: Thu, 25 Jul 2024 13:16:08 GMT
> > x-version: 0.0.16
> > access-control-allow-credentials: true
> >
> > HEAD
> >
> > HTTP/2 200
> > date: Tue, 10 Sep 2024 13:48:08 GMT
> > content-type: binary/octet-stream
> > content-length: 0
> > x-version: 0.0.16
> > access-control-allow-credentials: true
> > etag: "a79f3f685281d6681e4d362536c5b3eb-34"
> > last-modified: Thu, 25 Jul 2024 13:16:08 GMT
> > vary: Origin, Access-Control-Request-Method, Access-Control-Request-Headers
>
> --
> http://www.spatialys.com
> My software is free, but my time generally not.
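
For anyone who wants to check another server for the same symptoms, here is a minimal Python sketch of the two points made in the quoted thread: telling a proper 206 reply apart from the "right bytes but HTTP 200, no Content-Length" reply, and building the /vsicurl? path with file_size=unlimited. The function names (classify_range_response, probe, vsicurl_path) and the category labels are made up for illustration and are not part of GDAL. The file_size option is only expected to mean anything on a GDAL build that includes the change from the pull request cited above, and the path is built unencoded, exactly as in the quoted command.

```python
import urllib.request
from typing import Mapping, Tuple


def classify_range_response(status: int, headers: Mapping[str, str],
                            body_len: int, requested_len: int) -> Tuple[str, bool]:
    """Classify the reply to a GET sent with 'Range: bytes=0-(requested_len-1)'.

    Returns (kind, has_content_length). The labels are illustrative only.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    names = {name.lower() for name in headers.keys()}
    has_content_length = "content-length" in names

    if status == 206:
        kind = "compliant"              # 206 Partial Content, as expected
    elif status == 200 and body_len > requested_len:
        kind = "range-ignored"          # server sent more than was asked for
    elif status == 200:
        # Requested bytes came back, but with 200 instead of 206 (the case
        # described in the thread). A file shorter than the requested range
        # would look the same, so treat this as a hint, not a proof.
        kind = "partial-body-with-200"
    else:
        kind = "error"
    return kind, has_content_length


def probe(url: str, requested_len: int = 16384) -> Tuple[str, bool]:
    """Send one ranged GET and classify the answer (performs network I/O)."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes=0-{requested_len - 1}"})
    with urllib.request.urlopen(request) as response:
        # Read one byte more than requested: enough to notice an ignored
        # Range header without downloading the whole file.
        body = response.read(requested_len + 1)
        return classify_range_response(
            response.status, response.headers, len(body), requested_len)


def vsicurl_path(url: str, file_size: str = "unlimited") -> str:
    """Build a /vsicurl? path like the one in the quoted gdal_translate call."""
    # Assumption: the GDAL build in use understands the file_size option.
    return f"/vsicurl?file_size={file_size}&url={url}"
```

Calling probe() on the dataset URL from the original message should show which of the cases applies at the time of the test; the result of vsicurl_path() can be passed to gdal_translate or to any GDAL binding in place of a plain /vsicurl/ path.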

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev