One more follow-up question:

 

The datasets that I’m interested in contain subdatasets. I can get the info for a subdataset like this:

 

    sub_ds_path = 'HDF5:"/vsis3/prod-lads/VNP02IMG/VNP02IMG.A2021064.2342.002.2021128145323.nc"://observation_data/I04'

    info = gdal.Info(sub_ds_path)

 

This works fine and finishes in a few seconds. However, when I do the same thing for a different dataset (which contains the geolocation of the dataset above):

 

    sub_ds_path = 'HDF5:"/vsis3/prod-lads/VNP03IMG/VNP03IMG.A2021065.2324.002.2021127011303.nc"://geolocation_data/latitude'

    info = gdal.Info(sub_ds_path)

 

This takes about 2.5 minutes, and I can see on my network that Python is downloading data at about 1 MB/s the whole time. The info for this subdataset contains a lot of ground control points (GCPs), so I tried setting `showGCPs=False`, but that doesn’t solve it. I’m not sure whether the GCPs are really what is causing this (when I save the info as JSON, it is about 750 kB in size).

 

Any ideas what else can cause this difference in execution time?
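One way to narrow this down is to time the two `gdal.Info` calls while GDAL logs its network activity. The following is only a minimal sketch: `hdf5_subdataset_name`, `timed_call`, and `info_with_http_logging` are hypothetical helper names, not part of GDAL, and it assumes the credentials have already been set as config options. `CPL_DEBUG` and `CPL_CURL_VERBOSE` are existing GDAL config options; their output goes to stderr and shows which requests are issued, though it does not by itself explain why they are issued.

```python
import time


def hdf5_subdataset_name(file_path, variable_path):
    """Build a GDAL HDF5 subdataset name following the pattern used above."""
    return f'HDF5:"{file_path}"://{variable_path.lstrip("/")}'


def timed_call(func, *args, **kwargs):
    """Run func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def info_with_http_logging(sub_ds_path):
    """Time gdal.Info on one subdataset while GDAL logs its network activity."""
    from osgeo import gdal  # imported here so the helpers above work without GDAL

    gdal.SetConfigOption("CPL_DEBUG", "ON")          # GDAL debug messages on stderr
    gdal.SetConfigOption("CPL_CURL_VERBOSE", "YES")  # verbose libcurl request output
    try:
        return timed_call(gdal.Info, sub_ds_path, format="json")
    finally:
        # Passing None unsets the option again.
        gdal.SetConfigOption("CPL_DEBUG", None)
        gdal.SetConfigOption("CPL_CURL_VERBOSE", None)


# Example use (requires GDAL and valid credentials):
# path = hdf5_subdataset_name(
#     "/vsis3/prod-lads/VNP03IMG/VNP03IMG.A2021065.2324.002.2021127011303.nc",
#     "geolocation_data/latitude",
# )
# info, seconds = info_with_http_logging(path)
```

Running it once for the `I04` subdataset and once for `latitude` would let the logged requests and elapsed times be compared side by side.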

 

Regards,

Bert

 

 

From: gdal-dev <gdal-dev-boun...@lists.osgeo.org> on behalf of b.coerver--- via gdal-dev <gdal-dev@lists.osgeo.org>
Date: Thursday, 20 July 2023 at 11:51
To: Even Rouault <even.roua...@spatialys.com>, gdal-dev@lists.osgeo.org <gdal-dev@lists.osgeo.org>
Subject: Re: [gdal-dev] /vsis3/ on NetCDF from Earthdata

That does it, thank you so much!

 

From: Even Rouault <even.roua...@spatialys.com>
Date: Thursday, 20 July 2023 at 11:44
To: bcoer...@mailbox.org <b.coer...@mailbox.org>, gdal-dev@lists.osgeo.org <gdal-dev@lists.osgeo.org>
Subject: Re: [gdal-dev] /vsis3/ on NetCDF from Earthdata

Bert,

Also set the GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR config option; otherwise the generic open mechanism of GDAL tries to list the contents of the VNP02IMG/ directory, and it seems there are tons of files there.

When doing that, I get a result within a few seconds.
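In Python, this suggestion could be combined with the credential options from the original message roughly as follows. This is a minimal sketch, not a definitive setup: `earthdata_gdal_options` and `apply_gdal_options` are hypothetical helper names, `creds` is assumed to be the temporary-credentials dictionary described in the original message, and the `us-west-2` default is taken from that message.

```python
def earthdata_gdal_options(creds, region="us-west-2"):
    """Return the GDAL config options for /vsis3/ access, as a plain dict."""
    return {
        "AWS_ACCESS_KEY_ID": creds["accessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["secretAccessKey"],
        "AWS_SESSION_TOKEN": creds["sessionToken"],
        "AWS_REGION": region,
        # Skip listing the (very large) directory next to the opened file.
        "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    }


def apply_gdal_options(options):
    """Set every option globally through gdal.SetConfigOption."""
    from osgeo import gdal  # imported here so the dict builder works without GDAL

    for key, value in options.items():
        gdal.SetConfigOption(key, value)


# Example use (requires GDAL and valid credentials):
# apply_gdal_options(earthdata_gdal_options(creds))
```

Keeping the options in a dictionary mirrors the loop in the original message and makes it easy to inspect what is being set before touching GDAL's global state.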

Even

On 20/07/2023 at 09:59, b.coerver--- via gdal-dev wrote:

Hello,

 

I'm trying to access data from NASA's Earthdata S3 buckets, but I get a `"<filename> does not exist in the file system, and is not recognized as a supported dataset name."` error after waiting a long time (± 50 minutes, the process is downloading some data the whole time) doing the following:

 

    from osgeo import gdal

    gdal_config_options = {
        "AWS_ACCESS_KEY_ID": creds["accessKeyId"],
        "AWS_SESSION_TOKEN": creds["sessionToken"],
        "AWS_SECRET_ACCESS_KEY": creds["secretAccessKey"],
        "AWS_REGION": "us-west-2",
    }

    url = "">

    for k, v in gdal_config_options.items():
        gdal.SetConfigOption(k, v)

    out = gdal.Info(url)

 

The `creds` variable is a dictionary with temporary credential information that I get from [here](https://data.laadsdaac.earthdatacloud.nasa.gov/s3credentials); you need a free account to get them.

 

When I introduce an error in one of the keys/tokens (e.g. `"AWS_ACCESS_KEY_ID": creds["accessKeyId"] + "x"`), I immediately get a message saying my credentials are unknown, so I do think they are being ingested correctly. I’m using GDAL version 3.7.1.

 

I also managed to download the entire file using `boto3`, by doing the following:

 

    import boto3

 

    client = boto3.client(

        's3',

        aws_access_key_id=creds["accessKeyId"],

        aws_secret_access_key=creds["secretAccessKey"],

        aws_session_token=creds["sessionToken"]

    )

    

    client.download_file('prod-lads', 'VNP02IMG/VNP02IMG.A2023193.1942.002.2023194025636.nc', 'test.nc')

    

Any ideas what I'm doing wrong or how to make this work? In the end, I'm interested in accessing the file's metadata without downloading the entire file.

 

Regards,

Bert

 

 

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
-- 
http://www.spatialys.com
My software is free, but my time generally not.