Hi,
I do not mean that you should try this and that blindly. I mean that you
should describe what data you have in your hands and what you are planning
to do with it, so that other GDAL users can consider what reasonable
alternatives you have. I have never done anything even close to your use
case, but from other experience I can see potential issues in a few places:
- You are trying to update image A, which is 450000 by 225000 pixels,
with image B, which has the same size. If all pixels in B were valid, the
result would be A turned into a full copy of B.
- However, image B is probably mostly NoData (we do not know, because
you have not told us), and if GDAL deals with NoData correctly the
result would be A updated with only the valid pixels from B. That is
probably what is desired.
- However, we do not know how efficiently GDAL skips the NoData pixels
of B. It may be fast or not. If we know that most of the world is
NoData, it might be good to crop image B to include just the area where
there is data. That is maybe the UK in your case. If skipping NoData is
fast then cropping won’t give a speedup, but it is cheap to test.
- You have compressed images. LZW compresses some data more effectively
than other data. If you expect that you can replace a chunk of
LZW-compressed data inside a TIFF file with another chunk of
LZW-compressed data in place, you are wrong. The new chunk may be larger,
and then it simply cannot fit into the same space. The assumption that
updating a 6 GB image with 600 MB of new data would yield a 6 GB image
is not correct with compressed data.
- I can imagine that there are also other technical reasons why the
replacement data gets written at the end of the existing TIFF and the
image directories are updated. If the file size is critical, you may need
to re-write the updated TIFF into a new TIFF file. A complete re-write can
be done in the most optimal way. See this wiki page
https://trac.osgeo.org/gdal/wiki/UserDocs/GdalWarp#GeoTIFFoutput-coCOMPRESSisbroken
- If the images are in AWS it is possible that the process should be
somewhat different than with local images. I have no experience with AWS
yet.
- A 450000 by 225000 image is rather big. It is possible that it would
be faster to split the image into smaller parts, update the parts that need
updating, and combine the parts back into a big image. Or keep the parts
and combine them virtually into a VRT with gdalbuildvrt.
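
To make the cropping idea above a bit more concrete, here is a minimal sketch in plain Python that turns a bounding box into a pixel window, using the origin, pixel size and block size from the gdalinfo report in this thread. The UK bounding box (-9, 49, 2, 61) and the output name uk_crop.tif are assumptions for illustration only, and snapping the window outward to the 128x128 block grid is my own choice, not something GDAL requires. The last lines only build a gdal_translate -srcwin command as text; nothing is executed.

```python
from fractions import Fraction
import math

# Values taken from the gdalinfo report quoted in this thread.
ORIGIN_X = Fraction(-180)
ORIGIN_Y = Fraction(90)
PIXEL = Fraction(8, 10000)        # 0.0008 degrees, kept exact to avoid float rounding
WIDTH, HEIGHT = 450000, 225000
BLOCK = 128                       # Block=128x128


def pixel_window(lon_min, lat_min, lon_max, lat_max, align=BLOCK):
    """Return (xoff, yoff, xsize, ysize) covering the box, snapped outward to the block grid."""
    col0 = math.floor((Fraction(lon_min) - ORIGIN_X) / PIXEL)
    col1 = math.ceil((Fraction(lon_max) - ORIGIN_X) / PIXEL)
    row0 = math.floor((ORIGIN_Y - Fraction(lat_max)) / PIXEL)
    row1 = math.ceil((ORIGIN_Y - Fraction(lat_min)) / PIXEL)
    # Snap the upper-left corner down and the lower-right corner up to multiples of `align`.
    col0 -= col0 % align
    row0 -= row0 % align
    col1 += -col1 % align
    row1 += -row1 % align
    return col0, row0, col1 - col0, row1 - row0


# Hypothetical bounding box roughly around the UK (lon_min, lat_min, lon_max, lat_max).
UK_BBOX = (-9, 49, 2, 61)

xoff, yoff, xsize, ysize = pixel_window(*UK_BBOX)
share = xsize * ysize / (WIDTH * HEIGHT)
print(f"window: xoff={xoff} yoff={yoff} xsize={xsize} ysize={ysize}")
print(f"share of the full raster: {share:.4%}")

# Command text only; uk_crop.tif is a made-up output name.
print(f"gdal_translate -srcwin {xoff} {yoff} {xsize} {ysize} "
      "5_UK_coastal-2020.tif uk_crop.tif")
```

With the assumed box the window is 13952 by 15104 pixels, a very small part of the 450000 by 225000 raster, which is why the crop is cheap to try. The sketch does not clamp the window to the raster edges, so a box touching the edge of the world would need that added.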
Your use case is not so usual and it is rather heavy, but there are
certainly several ways to do what you want. What should be avoided is to
select an inefficient method and then try to optimize it.
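
On the LZW point above, here is a minimal sketch of why a rewritten tile may not fit into the space of the old one. Python's standard library has no LZW codec, so zlib (DEFLATE) is used as a stand-in; that substitution is an assumption of this example, and only the general behaviour (compressed size depends on the content) is meant to carry over. The tile contents are synthetic: one tile is all NoData (-9999) and the other holds pseudo-random values.

```python
import random
import struct
import zlib

TILE_PIXELS = 128 * 128  # one 128x128 block, as in the gdalinfo report


def pack_float32(values):
    """Pack values as little-endian Float32, the band type reported by gdalinfo."""
    return struct.pack(f"<{len(values)}f", *values)


# A tile that is NoData everywhere, and a tile with varying (synthetic) values.
nodata_tile = pack_float32([-9999.0] * TILE_PIXELS)
rng = random.Random(0)  # fixed seed so the run is repeatable
varied_tile = pack_float32([rng.uniform(0.0, 1.0) for _ in range(TILE_PIXELS)])

# zlib is only a stand-in for LZW here.
nodata_size = len(zlib.compress(nodata_tile))
varied_size = len(zlib.compress(varied_tile))

print("raw bytes per tile:   ", len(nodata_tile), len(varied_tile))
print("compressed, NoData:   ", nodata_size)
print("compressed, real data:", varied_size)
```

Both tiles have the same raw size, but the compressed NoData tile is far smaller than the compressed tile with varying values. A slot in the file that held a compressed NoData tile is therefore too small for the data that replaces it, and the file grows when tiles are updated.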
Good luck with your experiments,
-Jukka-
*From:* Clive Swan <[email protected]>
*Sent:* Wednesday, 14 December 2022 10:29
*To:* Rahkonen Jukka <[email protected]>
*Subject:* Re: [gdal-dev] gdalwarp running very slow
Hi Jukka,
Thanks for that, was really stressed.
I will export the UK extent, and rerun the script.
Thanks
Clive
------------------------------
*From:* Rahkonen Jukka <[email protected]>
*Sent:* Wednesday, December 14, 2022 7:18:50 AM
*To:* Clive Swan <[email protected]>; [email protected] <
[email protected]>
*Subject:* Re: [gdal-dev] gdalwarp running very slow
Hi,
Thank you for the information about the source files. I do not yet
understand what you are trying to do and why. Both images have the same
size, 450000 by 225000, and they cover the same area. Is the image
“5_UK_coastal-2020.tif” just NoData with pixel value -9999 everywhere
outside the UK? The name of the image makes me think so.
-Jukka Rahkonen-
*From:* Clive Swan <[email protected]>
*Sent:* Tuesday, 13 December 2022 19:22
*To:* [email protected]
*Cc:* Rahkonen Jukka <[email protected]>
*Subject:* [gdal-dev] gdalwarp running very slow
Greetings,
I am using the same files. I copied them from an AWS bucket to a local AWS
instance.
I tried gdal_merge: it tries to create a 300 GB file.
I tried gdal_translate: it ran but created a 2.5 GB file, not a 6.9 GB one.
Now trying gdalwarp.
the gdalinfo is the same in both datasets:
coastal-2020.tif (6.9GB)
Driver: GTiff/GeoTIFF
Size is 450000, 225000
Coordinate System is:
GEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
Origin = (-180.000000000000000,90.000000000000000)
Pixel Size = (0.000800000000000,-0.000800000000000)
Metadata:
AREA_OR_POINT=Area
datetime_created=2022-11-14 18:05:14.053301
Image Structure Metadata:
COMPRESSION=LZW
INTERLEAVE=BAND
PREDICTOR=3
Corner Coordinates:
Upper Left (-180.0000000, 90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"N)
Lower Left (-180.0000000, -90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"S)
Upper Right ( 180.0000000, 90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"N)
Lower Right ( 180.0000000, -90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"S)
Center ( 0.0000000, 0.0000000) ( 0d 0' 0.01"E, 0d 0' 0.01"N)
Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
Description = score
NoData Value=-9999
Band 2 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = severity_value
NoData Value=-9999
Band 3 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = severity_min
NoData Value=-9999
Band 4 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = severity_max
NoData Value=-9999
Band 5 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = likelihood
NoData Value=-9999
Band 6 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = return_time
NoData Value=-9999
Band 7 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = likelihood_confidence
NoData Value=-9999
Band 8 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = climate_reliability
NoData Value=-9999
Band 9 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = hazard_reliability
NoData Value=-9999
5_UK_coastal-2020.tif (600MB)
Driver: GTiff/GeoTIFF
Size is 450000, 225000
Coordinate System is:
GEOGCRS["WGS 84",
DATUM["World Geodetic System 1984",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]]
Data axis to CRS axis mapping: 2,1
Origin = (-180.000000000000000,90.000000000000000)
Pixel Size = (0.000800000000000,-0.000800000000000)
Metadata:
AREA_OR_POINT=Area
datetime_created=2022-11-14 18:05:14.053301
hostname=posix.uname_result(sysname='Linux',
nodename='ip-172-31-12-125', release='5.15.0-1022-aws',
version='#26~20.04.1-Ubuntu SMP Sat Oct 15 03:22:07 UTC 2022',
machine='x86_64')
Image Structure Metadata:
COMPRESSION=LZW
INTERLEAVE=BAND
PREDICTOR=3
Corner Coordinates:
Upper Left (-180.0000000, 90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"N)
Lower Left (-180.0000000, -90.0000000) (180d 0' 0.00"W, 90d 0' 0.00"S)
Upper Right ( 180.0000000, 90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"N)
Lower Right ( 180.0000000, -90.0000000) (180d 0' 0.00"E, 90d 0' 0.00"S)
Center ( 0.0000000, 0.0000000) ( 0d 0' 0.01"E, 0d 0' 0.01"N)
Band 1 Block=128x128 Type=Float32, ColorInterp=Gray
Description = score
NoData Value=-9999
Band 2 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = severity_value
NoData Value=-9999
Band 3 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = severity_min
NoData Value=-9999
Band 4 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = severity_max
NoData Value=-9999
Band 5 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = likelihood
NoData Value=-9999
Band 6 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = return_time
NoData Value=-9999
Band 7 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = likelihood_confidence
NoData Value=-9999
Band 8 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = climate_reliability
NoData Value=-9999
Band 9 Block=128x128 Type=Float32, ColorInterp=Undefined
Description = hazard_reliability
NoData Value=-9999
--
Regards,
Clive Swan
--
Hi,
If you are still struggling with the same old problem could you please finally
send the gdalinfo reports of your two input files which are this time:
coastal-2020.tif
5_UK_coastal-2020.tif
-Jukka Rahkonen-
From: gdal-dev <gdal-dev-bounces at lists.osgeo.org> On behalf of Clive Swan
Sent: Tuesday, 13 December 2022 17:23
To: gdal-dev at lists.osgeo.org
Subject: [gdal-dev] gdalwarp running very slow
Greetings,
I am running gdalwarp on a 6 GB (output) and a 600 MB (input) TIFF image.
The AWS instance has approx. 60 vCPUs.
It has taken over 6 hours so far and is still running. Is it possible to
optimise this and speed it up?
gdalwarp -r near -overwrite coastal-2020.tif 5_UK_coastal-2020.tif -co
BIGTIFF=YES -co COMPRESS=LZW -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 -co
NUM_THREADS=ALL_CPUS --config CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE YES
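
For a sense of scale, here is a small arithmetic sketch based only on the gdalinfo values quoted in this thread (raster size, nine Float32 bands, 128x128 blocks, band interleaving). It says nothing about how gdalwarp schedules its work; it only shows how much data sits behind the compressed files.

```python
import math

WIDTH, HEIGHT = 450000, 225000   # "Size is 450000, 225000"
BANDS = 9                        # Band 1 .. Band 9
BYTES_PER_SAMPLE = 4             # Type=Float32
BLOCK = 128                      # Block=128x128

# Size of the pixel data if nothing were compressed.
raw_bytes = WIDTH * HEIGHT * BANDS * BYTES_PER_SAMPLE

# Number of 128x128 blocks; partial blocks at the right and bottom edges count too.
blocks_per_band = math.ceil(WIDTH / BLOCK) * math.ceil(HEIGHT / BLOCK)
# INTERLEAVE=BAND: each band is stored in its own set of blocks.
total_blocks = blocks_per_band * BANDS

print(f"uncompressed size: {raw_bytes / 1e12:.3f} TB")
print(f"blocks per band:   {blocks_per_band}")
print(f"blocks in total:   {total_blocks}")
```

The uncompressed pixel data comes to about 3.6 TB in roughly 55.6 million blocks, so any approach that touches every block of the world-sized raster has a lot of work to do even when most of it is NoData.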