On 21/07/2023 at 11:46, b.coer...@mailbox.org wrote:

One more follow-up question:

The datasets that I’m interested in contain subdatasets. I can get the info of a subdataset like this:

    sub_ds_path= 'HDF5:"/vsis3/prod-lads/VNP02IMG/VNP02IMG.A2021064.2342.002.2021128145323.nc"://observation_data/I04'

    info = gdal.Info(sub_ds_path)

This works fine and finishes in a few seconds. However, when I do the same thing for a different dataset (which contains the geolocation of the dataset above):

    sub_ds_path = 'HDF5:"/vsis3/prod-lads/VNP03IMG/VNP03IMG.A2021065.2324.002.2021127011303.nc"://geolocation_data/latitude'

    info = gdal.Info(sub_ds_path)

This takes about 2.5 minutes, and I can see on my network that Python is downloading data at about 1 MB/s the whole time. The info for this subdataset contains a lot of ground control points, so I tried setting “showGCPs=False”, but that doesn’t solve it. I’m not sure the GCPs are really what is causing this (when I save the info as JSON, it is about 750 kB in size).

The second product has georeferencing information, and when one of its subdatasets is opened, GDAL samples the latitude and longitude arrays to expose ground control points, so it has to read those arrays.
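To compare the two cases side by side, something like the sketch below could be used. It only builds the subdataset names in the form used above and times a call; `hdf5_subdataset_name` and `timed_call` are hypothetical helper names for illustration, not part of GDAL.

```python
import time


def hdf5_subdataset_name(path, array):
    """Build an HDF5 subdataset name of the form HDF5:"<path>"://<array>."""
    return f'HDF5:"{path}"://{array.lstrip("/")}'


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_info_times(subdataset_names):
    """Time gdal.Info on each name (needs GDAL's Python bindings and credentials)."""
    from osgeo import gdal  # imported lazily so the helpers above run without GDAL

    timings = {}
    for name in subdataset_names:
        _, elapsed = timed_call(gdal.Info, name)
        timings[name] = elapsed
    return timings
```

Calling `compare_info_times` with the `observation_data/I04` and `geolocation_data/latitude` names should reproduce the difference described above, assuming the AWS config options are already set.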

Any ideas what else can cause this difference in execution time?

Regards,

Bert

*From: *gdal-dev <gdal-dev-boun...@lists.osgeo.org> on behalf of b.coerver--- via gdal-dev <gdal-dev@lists.osgeo.org>
*Date: *Thursday, 20 July 2023 at 11:51
*To: *Even Rouault <even.roua...@spatialys.com>, gdal-dev@lists.osgeo.org <gdal-dev@lists.osgeo.org>
*Subject: *Re: [gdal-dev] /vsis3/ on NetCDF from Earthdata

That does it, thank you so much!

*From: *Even Rouault <even.roua...@spatialys.com>
*Date: *Thursday, 20 July 2023 at 11:44
*To: *bcoer...@mailbox.org <b.coer...@mailbox.org>, gdal-dev@lists.osgeo.org <gdal-dev@lists.osgeo.org>
*Subject: *Re: [gdal-dev] /vsis3/ on NetCDF from Earthdata

Bert,

Also set the GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR config option. Otherwise, the generic open mechanism of GDAL tries to list the contents of the VNP02IMG/ directory, and it seems there are tons of files there.

When doing that, I get a result within a few seconds.
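In Python, this could look roughly like the sketch below. It reuses the `creds` keys and region from the original message; `build_vsis3_config` and `info_from_s3` are hypothetical helper names, not GDAL functions.

```python
def build_vsis3_config(creds, region="us-west-2"):
    """Return GDAL config options for opening a single S3 object via /vsis3/."""
    return {
        "AWS_ACCESS_KEY_ID": creds["accessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["secretAccessKey"],
        "AWS_SESSION_TOKEN": creds["sessionToken"],
        "AWS_REGION": region,
        # Do not list the sibling files of the dataset when opening it.
        "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    }


def info_from_s3(url, creds):
    """Apply the options, then run gdal.Info (needs GDAL's Python bindings)."""
    from osgeo import gdal  # imported lazily so the helper above runs without GDAL

    for key, value in build_vsis3_config(creds).items():
        gdal.SetConfigOption(key, value)
    return gdal.Info(url)
```

The option has to be set before the dataset is opened, since the directory listing happens during the open.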

Even

On 20/07/2023 at 09:59, b.coerver--- via gdal-dev wrote:

    Hello,

    I'm trying to access data from NASA's Earthdata S3 buckets, but I
    get a `"<filename> does not exist in the file system, and is not
    recognized as a supported dataset name."` error after waiting a
    long time (± 50 minutes, the process is downloading some data the
    whole time) doing the following:

    from osgeo import gdal

    gdal_config_options = {
        "AWS_ACCESS_KEY_ID": creds["accessKeyId"],
        "AWS_SESSION_TOKEN": creds["sessionToken"],
        "AWS_SECRET_ACCESS_KEY": creds["secretAccessKey"],
        "AWS_REGION": "us-west-2",
    }

    url = "/vsis3/prod-lads/VNP02IMG/VNP02IMG.A2023193.1942.002.2023194025636.nc"

    for k, v in gdal_config_options.items():
        gdal.SetConfigOption(k, v)

    out = gdal.Info(url)

    The `creds` variable is a dictionary with temporary credential
    information that I get from
    https://data.laadsdaac.earthdatacloud.nasa.gov/s3credentials
    (you need a free account to get them).

    When I introduce an error in one of the keys/tokens (e.g.
    `"AWS_ACCESS_KEY_ID": creds["accessKeyId"] + "x"`), I do get a
    message immediately saying my credentials are unknown. So I do
    think they are being ingested correctly. I’m using GDAL version 3.7.1.

    I also managed to download the entire file using `boto3`, by doing
    the following:

        import boto3

        client = boto3.client(
            's3',
            aws_access_key_id=creds["accessKeyId"],
            aws_secret_access_key=creds["secretAccessKey"],
            aws_session_token=creds["sessionToken"],
        )

        client.download_file(
            'prod-lads',
            'VNP02IMG/VNP02IMG.A2023193.1942.002.2023194025636.nc',
            'test.nc',
        )

    Any ideas what I'm doing wrong or how to make this work? In the
    end I'm interested in accessing the file's metadata without
    downloading the entire file.

    Regards,

    Bert

    _______________________________________________

    gdal-dev mailing list

    gdal-dev@lists.osgeo.org

    https://lists.osgeo.org/mailman/listinfo/gdal-dev

--
http://www.spatialys.com
My software is free, but my time generally not.
