Hi,

Some previous discussion about the same thing in 
https://gis.stackexchange.com/questions/491309/performent-way-of-merging-multiple-rasters-into-a-single-cog-gtiff-with-gdal-w?

With untiled images it is no surprise that a small -projwin does not have 
much effect. But with images of 1000x1000 pixels and the default tile size of 
256x256, splitting the images internally into four would probably not make 
much difference anyway.

Processing time and memory usage seem to grow linearly as the number of 
files in the VRT increases. Maybe this is good news.
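
A quick way to check the linear growth is to divide the figures reported in 
the quoted message by the number of sources. This is only a rough sketch: the 
inputs are the approximate ("~") values from the message, 1 GB is assumed to 
be 1024 MB, and the helper name per_source is made up for illustration.

```python
# Figures reported in the quoted message: (sources, hours, RAM in GB).
# All of them are approximate values as given by the original poster.
runs = {
    "merged.vrt": (750, 1.5, 4.0),
    "merged_3x.vrt": (2250, 4.5, 11.5),
}


def per_source(n_sources, hours, ram_gb):
    """Return (seconds per source, MB of RAM per source), assuming 1 GB = 1024 MB."""
    return hours * 3600 / n_sources, ram_gb * 1024 / n_sources


for name, run in runs.items():
    secs, mb = per_source(*run)
    print(f"{name}: {secs:.1f} s/source, {mb:.2f} MB/source")
```

Both runs come out at about 7.2 seconds per source, and memory stays a little 
above 5 MB per source, which is what linear growth would look like.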

-Jukka Rahkonen-

________________________________________
From: gdal-dev on behalf of Abdul Raheem Siddiqui via gdal-dev
Sent: Monday, 14 April 2025 7:52
To: gdal-dev@lists.osgeo.org
Subject: [gdal-dev] Performance Issue with VRT Pixel Function and Large Number of 
Source Rasters

Dear GDAL Community,

I am encountering a performance issue when using a VRT consisting of a large 
number of source rasters and built-in C++ pixel function ("max"). I would 
appreciate any guidance on whether some GDAL config option can improve this, 
or I am doing something wrong, or this is a potential optimization 
opportunity.

I have a VRT file referencing ~750 individual rasters (Byte data type, avg 
size ~1000x1000 pixels, untiled, and same CRS for all source rasters). The VRT 
uses the built-in "max" pixel function.

Running gdal_translate to convert the VRT to GTiff takes ~1.5 hours and 
consumes ~4GB RAM.

    gdal_translate -of GTiff merged.vrt OUTPUT.tif

When tripling the number of source rasters (to ~2250) by duplicating entries 
in the VRT, processing time increases to ~4.5 hours, with RAM usage rising to 
~11.5GB.

    gdal_translate -of GTiff merged_3x.vrt OUTPUT.tif

Extracting a small subset of the VRT via -projwin does not improve performance 
(still slow, ~14GB RAM used).

    gdal_translate -projwin -146955.241 797044.4497 -138766.1444 789656.0648 -of GTiff merged_3x.vrt OUTPUT.tif

Removing the pixel function makes processing instantaneous, even with 2250 
rasters.

Performance is unaffected by --config GDAL_CACHEMAX or -co 
NUM_THREADS=ALL_CPUS.

The issue persists across other datasets that I have. In fact, it gets really 
worse when source rasters are relatively larger or of Float32 data type.

Gdalinfo on one of the source rasters: https://pastebin.com/gjHDA2Wd
Data: https://drive.google.com/file/d/1LGlzGGZvkPyXvKKgkPVzQGbBw55p5dRd/view?
System: Windows, 8 CPUs, 32GB RAM (GDAL 3.9.2 via OSGeo4W).

Thank you for your time and insights. Please reply if you are aware of how 
performance can be improved.

Regards,
Abdul Siddiqui, PE
abdul.siddiqui@ertcorp.com
ERT | Earth Resources Technology, Inc.
14401 Sweitzer Ln. Ste 300
Laurel, MD 20707
https://www.ertcorp.com
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
