On 21/07/2023 at 11:46, b.coer...@mailbox.org wrote:
One more follow-up question:
The datasets that I’m interested in contain subdatasets. I can get
the info of a subdataset like this:
sub_ds_path = 'HDF5:"/vsis3/prod-lads/VNP02IMG/VNP02IMG.A2021064.2342.002.2021128145323.nc"://observation_data/I04'
info = gdal.Info(sub_ds_path)
This works fine and finishes in a few seconds. However, when I do the
same thing for a different dataset (which contains the geolocation of
the dataset above):
sub_ds_path = 'HDF5:"/vsis3/prod-lads/VNP03IMG/VNP03IMG.A2021065.2324.002.2021127011303.nc"://geolocation_data/latitude'
info = gdal.Info(sub_ds_path)
This takes about 2.5 minutes and I can see on my network that Python
is downloading data at about 1MB/s the whole time. The info from this
subdataset contains a lot of ground control points, so I tried setting
“showGCPs=False”, but that doesn’t solve it. I’m not sure if it’s
really the GCPs that are causing this (when I save the info as JSON,
it is about 750 kB in size).
The second product has georeferencing information, and upon opening
one of its subdatasets, GDAL samples the latitude and longitude arrays
to expose ground control points, hence it reads those arrays.
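For reference, both subdataset names quoted above follow the same pattern, `HDF5:"<file>"://<path inside the file>`. The following is only a small sketch of that pattern; the helper name `hdf5_subdataset_name` is hypothetical and not part of GDAL.

```python
def hdf5_subdataset_name(vsi_path, internal_path):
    """Build an HDF5 subdataset name in the form used in this thread.

    Hypothetical helper: it only formats the string, it does not open anything.
    """
    # Drop leading slashes so the result has exactly two after the colon.
    return 'HDF5:"{}"://{}'.format(vsi_path, internal_path.lstrip("/"))


latitude_name = hdf5_subdataset_name(
    "/vsis3/prod-lads/VNP03IMG/VNP03IMG.A2021065.2324.002.2021127011303.nc",
    "geolocation_data/latitude",
)
```

The resulting string can be passed to `gdal.Info` or `gdal.Open` in the same way as the hand-written names above.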
Any ideas what else can cause this difference in execution time?
Regards,
Bert
*From: *gdal-dev <gdal-dev-boun...@lists.osgeo.org> on behalf of
b.coerver--- via gdal-dev <gdal-dev@lists.osgeo.org>
*Date: *Thursday, 20 July 2023 at 11:51
*To: *Even Rouault <even.roua...@spatialys.com>,
gdal-dev@lists.osgeo.org <gdal-dev@lists.osgeo.org>
*Subject: *Re: [gdal-dev] /vsis3/ on NetCDF from Earthdata
That does it, thank you so much!
*From: *Even Rouault <even.roua...@spatialys.com>
*Date: *Thursday, 20 July 2023 at 11:44
*To: *bcoer...@mailbox.org <b.coer...@mailbox.org>,
gdal-dev@lists.osgeo.org <gdal-dev@lists.osgeo.org>
*Subject: *Re: [gdal-dev] /vsis3/ on NetCDF from Earthdata
Bert,
Also set the GDAL_DISABLE_READDIR_ON_OPEN=EMPTY_DIR config option,
otherwise the generic open mechanism of GDAL tries to list the content
of the VNP02IMG/ directory, and it seems there are tons of files there.
When doing that, I get a result within a few seconds.
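A minimal sketch of the combined configuration, assuming the same `creds` dictionary and region as in the original message quoted below. The helper names `build_gdal_s3_options` and `apply_gdal_options` are hypothetical and not part of GDAL; only `gdal.SetConfigOption` and the option names come from this thread.

```python
def build_gdal_s3_options(creds, region="us-west-2"):
    """Return GDAL config options for /vsis3/ access (hypothetical helper)."""
    return {
        "AWS_ACCESS_KEY_ID": creds["accessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["secretAccessKey"],
        "AWS_SESSION_TOKEN": creds["sessionToken"],
        "AWS_REGION": region,
        # Avoid listing the directory that contains the opened file.
        "GDAL_DISABLE_READDIR_ON_OPEN": "EMPTY_DIR",
    }


def apply_gdal_options(options):
    """Set the options globally; requires the GDAL Python bindings."""
    # Imported lazily so build_gdal_s3_options works without GDAL installed.
    from osgeo import gdal

    for key, value in options.items():
        gdal.SetConfigOption(key, value)
```

Calling `apply_gdal_options(build_gdal_s3_options(creds))` before `gdal.Info(url)` should reproduce the setup described here.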
Even
On 20/07/2023 at 09:59, b.coerver--- via gdal-dev wrote:
Hello,
I'm trying to access data from NASA's Earthdata S3 buckets, but I
get a `"<filename> does not exist in the file system, and is not
recognized as a supported dataset name."` error after waiting a
long time (± 50 minutes, the process is downloading some data the
whole time) doing the following:
from osgeo import gdal
gdal_config_options = {
    "AWS_ACCESS_KEY_ID": creds["accessKeyId"],
    "AWS_SESSION_TOKEN": creds["sessionToken"],
    "AWS_SECRET_ACCESS_KEY": creds["secretAccessKey"],
    "AWS_REGION": "us-west-2",
}
url = "/vsis3/prod-lads/VNP02IMG/VNP02IMG.A2023193.1942.0022023194025636.nc"
for k, v in gdal_config_options.items():
    gdal.SetConfigOption(k, v)
out = gdal.Info(url)
The `creds` variable is a dictionary with temporary credential
information that I get from
[here](https://data.laadsdaac.earthdatacloud.nasa.gov/s3credentials);
you need a free account to get them.
When I introduce an error in one of the keys/tokens (e.g.
`"AWS_ACCESS_KEY_ID": creds["accessKeyId"] + "x"`), I do get a
message immediately saying my credentials are unknown. So I do
think they are being ingested correctly. I’m using GDAL version 3.7.1.
I also managed to download the entire file using `boto3`, by doing
the following:
import boto3
client = boto3.client(
    's3',
    aws_access_key_id=creds["accessKeyId"],
    aws_secret_access_key=creds["secretAccessKey"],
    aws_session_token=creds["sessionToken"],
)
client.download_file(
    'prod-lads',
    'VNP02IMG/VNP02IMG.A2023193.1942.002.2023194025636.nc',
    'test.nc',
)
Any ideas what I'm doing wrong or how to make this work? In the
end I'm interested in accessing the file's metadata without
downloading the entire file.
Regards,
Bert
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
--
http://www.spatialys.com
My software is free, but my time generally not.