Hi,

When the "max" pixel function is used, GDAL needs to analyze every single pixel from every overlapping image to find the maximum values. Without a pixel function there is nothing to analyze; the pixels from the last image in the pile simply beat all the previous pixels.
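The difference between the two compositing behaviours can be illustrated with a toy Python snippet (made-up values for a single output pixel, not GDAL API calls):

```python
# Toy values for one output pixel that is covered by three overlapping sources.
overlapping_values = [40, 120, 75]

# Default VRT compositing: the last source in the pile wins, so GDAL only
# ever needs the topmost value for each pixel.
last_wins = overlapping_values[-1]

# "max" pixel function: every overlapping source must be read and compared.
max_composite = max(overlapping_values)
```

With "max", the cost per output pixel scales with the number of overlapping sources, which matches the linear growth in processing time reported below.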
I would try something like hand-made multiprocessing. Split the 1000 images into 10 VRT files, 100 images in each, and run 10 parallel gdal_translates with "max". Then run "max" again over the 10 interim images. Perhaps you will save more than 80% of wall-clock time.

-Jukka Rahkonen-

________________________________________
From: Abdul Raheem Siddiqui
Sent: Monday, 14 April 2025 16.09
To: Rahkonen Jukka
Cc: gdal-dev@lists.osgeo.org
Subject: Re: [gdal-dev] Performance Issue with VRT Pixel Function and Large Number of Source Rasters

Laurentiu,

Bumping GDAL_MAX_DATASET_POOL_SIZE did not make any difference for me. My source rasters are of different block sizes, but they are relatively small. As you have noted, the drastic performance difference with or without the pixel function points to something other than tiling. I agree.

Abdul

On Mon, Apr 14, 2025 at 4:20 AM Rahkonen Jukka <jukka.rahko...@maanmittauslaitos.fi> wrote:

Hi,

There is some previous discussion about the same thing in https://gis.stackexchange.com/questions/491309/performent-way-of-merging-multiple-rasters-into-a-single-cog-gtiff-with-gdal-w

With untiled images it is not a surprise that a small -projwin does not have a great effect. But with 1000x1000 pixel images and the default tile size of 256x256, splitting the images internally into four would probably not make much difference anyway.

Processing time and memory usage seem to grow linearly as the number of files in the VRT increases. Maybe this is good news.

-Jukka Rahkonen-

________________________________________
From: gdal-dev on behalf of Abdul Raheem Siddiqui via gdal-dev
Sent: Monday, 14 April 2025 7.52
To: gdal-dev@lists.osgeo.org
Subject: [gdal-dev] Performance Issue with VRT Pixel Function and Large Number of Source Rasters

Dear GDAL Community,

I am encountering a performance issue when using a VRT consisting of a large number of source rasters and the built-in C++ pixel function ("max").
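Jukka's split-and-recombine suggestion can be sketched in Python. As far as I know, gdalbuildvrt has no switch for pixel functions, so after building each sub-VRT you would patch its XML to mark the band as a derived band. The helper names below are my own, not part of any GDAL tool; the VRT attributes (`subClass="VRTDerivedRasterBand"`, `<PixelFunctionType>`) follow the documented VRT schema:

```python
# Sketch: split a source list into chunks for sub-VRTs, and patch a
# sub-VRT's XML so its bands use the built-in "max" pixel function.
# chunk() and add_max_pixel_function() are illustrative helpers, not GDAL API.
import xml.etree.ElementTree as ET

def chunk(files, n_chunks):
    """Split a file list into n_chunks roughly equal groups."""
    size = -(-len(files) // n_chunks)  # ceiling division
    return [files[i:i + size] for i in range(0, len(files), size)]

def add_max_pixel_function(vrt_xml):
    """Turn every band of a VRT into a derived band computing 'max'."""
    root = ET.fromstring(vrt_xml)
    for band in root.iter("VRTRasterBand"):
        band.set("subClass", "VRTDerivedRasterBand")
        fn = ET.SubElement(band, "PixelFunctionType")
        fn.text = "max"
    return ET.tostring(root, encoding="unicode")
```

Each patched sub-VRT would then be converted with its own gdal_translate process, and a final VRT with "max" over the 10 interim GTiffs produces the merged result.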
I would appreciate any guidance on whether some GDAL config option can improve this, whether I am doing something wrong, or whether this is a potential optimization opportunity.

I have a VRT file referencing ~750 individual rasters (Byte data type, average size ~1000x1000 pixels, untiled, and the same CRS for all source rasters). The VRT uses the built-in "max" pixel function.

Running gdal_translate to convert the VRT to GTiff takes ~1.5 hours and consumes ~4GB RAM:

gdal_translate -of GTiff merged.vrt OUTPUT.tif

When tripling the number of source rasters (to ~2250) by duplicating entries in the VRT, processing time increases to ~4.5 hours, with RAM usage rising to ~11.5GB:

gdal_translate -of GTiff merged_3x.vrt OUTPUT.tif

Extracting a small subset of the VRT via -projwin does not improve performance (still slow, ~14GB RAM used):

gdal_translate -projwin -146955.241 797044.4497 -138766.1444 789656.0648 -of GTiff merged_3x.vrt OUTPUT.tif

Removing the pixel function makes processing instantaneous, even with 2250 rasters.

Performance is unaffected by --config GDAL_CACHEMAX or -co NUM_THREADS=ALL_CPUS.

The issue persists across other datasets that I have. In fact, it gets much worse when the source rasters are relatively larger or of Float32 data type.

gdalinfo on one of the source rasters: https://pastebin.com/gjHDA2Wd
Data: https://drive.google.com/file/d/1LGlzGGZvkPyXvKKgkPVzQGbBw55p5dRd/view?
System: Windows, 8 CPUs, 32GB RAM (GDAL 3.9.2 via OSGeo4W).

Thank you for your time and insights. Please reply if you are aware of how performance can be improved.

Regards,
Abdul Siddiqui, PE
abdul.siddiqui@ertcorp.com
ERT | Earth Resources Technology, Inc.
14401 Sweitzer Ln. Ste 300
Laurel, MD 20707
https://www.ertcorp.com

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
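Since -co NUM_THREADS does not parallelize the pixel-function evaluation, one workaround is coarse-grained parallelism at the process level. A minimal standard-library sketch (the part_*.vrt / part_*.tif file names are hypothetical placeholders, and this assumes gdal_translate is on PATH):

```python
# Sketch: drive several gdal_translate processes concurrently, one per
# sub-VRT. Threads are enough here because each worker just waits on a
# child process; the heavy lifting happens in the gdal_translate processes.
import subprocess
from concurrent.futures import ThreadPoolExecutor

def translate_cmd(src_vrt, dst_tif):
    """Command line for converting one sub-VRT to GTiff."""
    return ["gdal_translate", "-of", "GTiff", src_vrt, dst_tif]

def run_all(n_parts=10, workers=10):
    """Run one gdal_translate per sub-VRT concurrently."""
    cmds = [translate_cmd(f"part_{i}.vrt", f"part_{i}.tif")
            for i in range(n_parts)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda c: subprocess.run(c, check=True), cmds))
```

On an 8-CPU machine this should keep all cores busy while each individual gdal_translate stays single-threaded over its own subset of sources.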