The server seems to have been fixed, but on reflection, I won't merge the workaround PR as this is (thankfully) quite an unusual case. At least the PR is there if someone badly needs to patch their build with such a workaround.

On 10/09/2024 at 17:11, thomas bonfort wrote:
I'm not sure that providing a fix to work around this very broken behavior is the best course of action to make them fix their server...

On Tue, Sep 10, 2024 at 5:07 PM Even Rouault via gdal-dev <gdal-dev@lists.osgeo.org> wrote:


    On 10/09/2024 at 16:10, Rahkonen Jukka via gdal-dev wrote:

    Hi,

    Have you tried with configuration option
    “CPL_VSIL_CURL_USE_HEAD=[YES/NO]: Defaults to YES. Controls
    whether to use a HEAD request when opening a remote URL.”

    I was just going to suggest that too. It "works", but not really.
    It just postpones the core issue: the server doesn't support GET
    Range requests, so can't be used with /vsicurl/

    As it has a COG organization with overview data first in the file,
    if you want to read the smallest overview(s), you can use
    /vsicurl_streaming/ instead, but that won't be efficient for reading
    the bottom-right-most tile of the full-resolution level, which will
    require reading the whole file...

    Nothing GDAL can do about that.

    Actually... digging further... it somehow supports Range requests,
    but in what I believe is a non-compliant way. It does return the
    expected content, but returns HTTP 200 and not HTTP 206 (Partial
    Content). And it never returns the Content-Length header.
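The behaviour described above can be sketched as a small check on the answer to a GET that carries a `Range` header. This is only an illustration of the distinction between HTTP 200 and HTTP 206, not GDAL's actual logic; the function name `classify_range_response` and its return labels are hypothetical.

```python
from typing import Mapping


def classify_range_response(
    status: int,
    headers: Mapping[str, str],
    body_length: int,
    requested_length: int,
) -> str:
    """Classify how a server answered a GET carrying a Range header.

    Hypothetical helper for illustration; the labels are made up here.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    if status == 206 and "content-range" in lowered:
        # What a client normally expects for a satisfied range request.
        return "compliant-partial"
    if status == 200 and body_length == requested_length:
        # The case reported in this thread: the right number of bytes comes
        # back, but labelled as a full response. (Ambiguous if the whole
        # file happens to have exactly the requested length.)
        return "partial-body-with-200"
    if status == 200:
        # The server ignored the Range header and sent something else,
        # typically the whole file.
        return "range-ignored"
    return "error"
```

A client that only trusts 206 would treat the second case as "no Range support", which matches the initial diagnosis earlier in this message.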

    Well, I've implemented a workaround in
    https://github.com/OSGeo/gdal/pull/10760 that might be useful in
    other similar cases too.

    With that, the following works:

    gdal_translate "/vsicurl?file_size=unlimited&url=https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif" \
        --config GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR out.tif -srcwin 5000 5000 50 50

    file_size=unlimited works here since the GTiff driver doesn't
    really need to have the right file size: it only checks at some
    points that we don't try to read beyond the end of the file, so
    unlimited is OK. In other situations/drivers, the exact value
    could be needed.
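For anyone scripting the command above, here is a minimal Python sketch that assembles the same filename and argument list. It assumes a GDAL build that includes the `file_size` option from the pull request mentioned above (it is not a general /vsicurl/ option according to this thread); the helper names `vsicurl_path` and `translate_window_args` are hypothetical.

```python
from typing import List


def vsicurl_path(url: str, file_size: str = "unlimited") -> str:
    """Build a /vsicurl? filename in the form used in this thread.

    Assumption: file_size is only understood by a build patched with the
    pull request discussed here. The URL is passed through unchanged, as in
    the thread; a URL containing '&' or '?' may need percent-encoding.
    """
    return f"/vsicurl?file_size={file_size}&url={url}"


def translate_window_args(
    src: str, dst: str, xoff: int, yoff: int, xsize: int, ysize: int
) -> List[str]:
    """Return an argv list that extracts one window with gdal_translate."""
    return [
        "gdal_translate",
        src,
        "--config",
        "GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR",
        dst,
        "-srcwin",
        str(xoff),
        str(yoff),
        str(xsize),
        str(ysize),
    ]
```

Passing the resulting list to `subprocess.run(args, check=True)` avoids shell quoting problems with the `&` in the filename.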

    But they should really fix their servers.

    Even

    -Jukka Rahkonen-

    *From:* gdal-dev <gdal-dev-boun...@lists.osgeo.org> *On Behalf Of* Daniel Evans
    via gdal-dev
    *Sent:* Tuesday, 10 September 2024 16:57
    *To:* 'gdal-dev@lists.osgeo.org'
    (gdal-dev@lists.osgeo.org) <gdal-dev@lists.osgeo.org>
    *Subject:* [gdal-dev] Ignore content-length in vsicurl?

    Hi all,

    I am attempting to read a dataset via /vsicurl/ where I believe
    the server is incorrectly returning `content-length: 0` in
    response to HEAD requests. This causes GDAL to believe it's a
    zero-length file, and it therefore can't be read.

    If I download the file via HTTP GET, it's valid, and GDAL can
    read it locally. I've also confirmed I can use /vsicurl/ on some
    test datasets in the GDAL repo.

    Is it possible to force GDAL to work around the faulty
    content-length header, or is it too fundamental a problem to ignore?

    I've separately got in touch with the data provider to see if
    they are able to fix the issue at their end.

    Cheers,

    Daniel

    URL of the troublesome dataset:

    
    https://data.source.coop/earthgenome/sentinel2-temporal-mosaics/20NMH_2024-04-01_2024-08-01/B08.tif

    Example HTTP header responses I'm seeing:

    GET

    HTTP/2 200
    date: Tue, 10 Sep 2024 13:47:54 GMT
    content-type: binary/octet-stream
    content-length: 278198294
    vary: Origin, Access-Control-Request-Method, Access-Control-Request-Headers
    etag: "a79f3f685281d6681e4d362536c5b3eb-34"
    last-modified: Thu, 25 Jul 2024 13:16:08 GMT
    x-version: 0.0.16
    access-control-allow-credentials: true

    HEAD

    HTTP/2 200
    date: Tue, 10 Sep 2024 13:48:08 GMT
    content-type: binary/octet-stream
    content-length: 0
    x-version: 0.0.16
    access-control-allow-credentials: true
    etag: "a79f3f685281d6681e4d362536c5b3eb-34"
    last-modified: Thu, 25 Jul 2024 13:16:08 GMT
    vary: Origin, Access-Control-Request-Method, Access-Control-Request-Headers
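The mismatch between the two header dumps above can be checked mechanically. The following Python sketch parses header text as pasted and flags a HEAD response whose `content-length` disagrees with the GET response; the function names `parse_headers` and `head_length_is_suspect` are hypothetical, and the sample strings are trimmed copies of the dumps above.

```python
def parse_headers(raw: str) -> dict:
    """Parse 'name: value' lines into a dict keyed by lower-cased name."""
    headers = {}
    for line in raw.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep:  # lines without a colon (status line, blanks) are skipped
            headers[name.strip().lower()] = value.strip()
    return headers


def head_length_is_suspect(head: dict, get: dict) -> bool:
    """Return True when both responses carry content-length and they differ."""
    head_len = head.get("content-length")
    get_len = get.get("content-length")
    return head_len is not None and get_len is not None and head_len != get_len


GET_RAW = """HTTP/2 200
content-type: binary/octet-stream
content-length: 278198294
etag: "a79f3f685281d6681e4d362536c5b3eb-34"
"""

HEAD_RAW = """HTTP/2 200
content-type: binary/octet-stream
content-length: 0
etag: "a79f3f685281d6681e4d362536c5b3eb-34"
"""

suspect = head_length_is_suspect(parse_headers(HEAD_RAW), parse_headers(GET_RAW))
```

With the values from this message, `suspect` is `True`: the ETag is identical in both responses while the lengths are 0 and 278198294.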


    _______________________________________________
    gdal-dev mailing list
    gdal-dev@lists.osgeo.org
    https://lists.osgeo.org/mailman/listinfo/gdal-dev

-- http://www.spatialys.com
    My software is free, but my time generally not.

