I’ve used mimalloc successfully in the past; it’s worth a look if a drop-in 
replacement for new / delete / malloc / free is desirable.  Note that while its 
performance is usually better than glibc / MSVC across the board, there are 
some unintuitive performance cliffs.  Given the block-oriented nature of most 
GDAL raster workloads, I don’t expect them to surface, but FYI.

Our allocators only call VirtualAlloc (VAlloc below) when necessary; we don’t 
issue a call 1:1 for each request where a user would’ve used malloc.  The 
allocator keeps internal state that tells it when to call the underlying OS 
functions.  So in this case, if a user asks for 4 KB, VAlloc would map in 
64 KB, and the next time a user asks for 4 KB (or any size that still fits in 
the mapped region after alignment), we don’t ask VAlloc for memory; we just 
bump a pointer (or something along those lines).  Naturally this is more 
complicated in a multithreaded context.  What we’ve done there is give each 
thread its own allocator, so there’s no contention between threads in user 
space.  The devil is in the details, though.
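
A minimal sketch of the chunk-then-bump idea described above, not the actual 
allocator. The names BumpArena, kChunkSize and thread_alloc are hypothetical, 
and std::malloc stands in for VirtualAlloc so that the example builds on any 
platform. Nothing is handed back until the arena is destroyed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical arena: takes 64 KB chunks from the "OS" and serves requests
// by bumping an offset inside the current chunk.
// Assumption: std::malloc is a portable stand-in for VirtualAlloc.
class BumpArena {
public:
    static constexpr std::size_t kChunkSize = 64 * 1024;

    BumpArena() = default;
    BumpArena(const BumpArena&) = delete;
    BumpArena& operator=(const BumpArena&) = delete;
    ~BumpArena() {
        for (void* chunk : chunks_) std::free(chunk);
    }

    // align must be a power of two no larger than alignof(std::max_align_t),
    // because the offset is aligned relative to the chunk base.
    // Returns nullptr for empty or oversized requests, or if the OS call fails.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        if (size == 0 || size > kChunkSize) return nullptr;
        std::size_t start = (offset_ + align - 1) & ~(align - 1);
        if (chunks_.empty() || start + size > kChunkSize) {
            // Current chunk is full: its tail is abandoned and a new chunk
            // is requested. This is the only path that reaches the "OS".
            chunks_.reserve(chunks_.size() + 1);
            void* chunk = std::malloc(kChunkSize);
            if (!chunk) return nullptr;
            chunks_.push_back(chunk);
            ++os_calls_;
            start = 0;
        }
        offset_ = start + size;
        return static_cast<char*>(chunks_.back()) + start;
    }

    std::size_t os_calls() const { return os_calls_; }

private:
    std::vector<void*> chunks_;
    std::size_t offset_ = 0;
    std::size_t os_calls_ = 0;
};

// One arena per thread, so threads never contend on allocator state.
// Memory obtained here is released when the owning thread exits.
inline void* thread_alloc(std::size_t size) {
    thread_local BumpArena arena;
    return arena.allocate(size);
}
```

With this sketch, sixteen 4 KB requests are served from a single 64 KB chunk, 
and only the seventeenth triggers another call to the backing allocator. 
Individual frees, de-committing and requests larger than a chunk are left out.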

From: Even Rouault <even.roua...@spatialys.com>
Date: Thursday, March 21, 2024 at 9:59 AM
To: "Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND APPLICATIONS INC]" 
<jesse.r.me...@nasa.gov>, Abel Pau <a....@creaf.uab.cat>, 
"gdal-dev@lists.osgeo.org" <gdal-dev@lists.osgeo.org>
Subject: Re: [gdal-dev] [EXTERNAL] [BULK] Re: Experience with slowness of 
free() on Windows with lots of allocations?

CAUTION: This email originated from outside of NASA.  Please take care when 
clicking links or opening attachments.  Use the "Report Message" button to 
report suspicious messages to the NASA SOC.



I've played with VirtualAlloc(NULL, SINGLE_ALLOC_SIZE, MEM_COMMIT | 
MEM_RESERVE, PAGE_READWRITE), and it does avoid the performance issue. However, 
I see that VirtualAlloc() allocates in chunks of 64 kB, so depending on the 
size of a block, it might cause significant waste of RAM, so it can't be used 
as a direct replacement for malloc().
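
To put a number on that rounding, here is a small sketch that takes the 64 kB 
chunking described above at face value. The names kGranularity, rounded_up and 
wasted_fraction are hypothetical, and the granularity is hard-coded as an 
assumption rather than queried from the system. Whether the slack costs 
physical RAM or only address space is not something this sketch settles.

```cpp
#include <cassert>
#include <cstddef>

// Assumed chunk size of 64 kB, as observed above. A real program would
// query the system for this value instead of hard-coding it.
constexpr std::size_t kGranularity = 64 * 1024;

// Size of the region consumed once a request is rounded up to whole chunks.
constexpr std::size_t rounded_up(std::size_t request) {
    return (request + kGranularity - 1) / kGranularity * kGranularity;
}

// Fraction of the rounded-up region that the caller did not ask for.
constexpr double wasted_fraction(std::size_t request) {
    return request == 0
               ? 0.0
               : 1.0 - static_cast<double>(request) /
                           static_cast<double>(rounded_up(request));
}
```

For a request of 4096 * 4 + 1 bytes, rounded_up() gives 65536, so roughly three 
quarters of the region would be slack, while a request of exactly 64 kB wastes 
nothing.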

My inclination would be to perhaps have an optional config option like 
GDAL_BLOCK_CACHE_USE_PRIVATE_HEAP that could be set, and when set, it would 
use HeapCreate(0, 0, GDAL_CACHEMAX) to create a heap used only by the block 
cache. Not ideal, since that would reserve the whole GDAL_CACHEMAX (but for a 
large enough processing job, you'll end up consuming it anyway), but it has the 
advantage of not being extremely intrusive either... and could be easily 
ditched/replaced by something better in the future.
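
A rough sketch of what such a private heap wrapper could look like. The class 
name BlockCacheHeap is hypothetical and this is not GDAL code. On Windows it 
uses HeapCreate / HeapAlloc / HeapFree / HeapDestroy; elsewhere it falls back 
to malloc / free purely so that the example compiles everywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper around a heap dedicated to block cache buffers.
class BlockCacheHeap {
public:
    // max_bytes plays the role of GDAL_CACHEMAX in the proposal above.
    explicit BlockCacheHeap(std::size_t max_bytes) {
#ifdef _WIN32
        heap_ = HeapCreate(0, 0, max_bytes);  // fixed maximum size
#else
        (void)max_bytes;  // fallback path has no size limit
#endif
    }
    BlockCacheHeap(const BlockCacheHeap&) = delete;
    BlockCacheHeap& operator=(const BlockCacheHeap&) = delete;
    ~BlockCacheHeap() {
#ifdef _WIN32
        // Destroying the heap releases every block still in it in one call.
        if (heap_) HeapDestroy(heap_);
#endif
    }

    // Returns nullptr if the heap could not be created or is exhausted.
    void* alloc(std::size_t size) {
#ifdef _WIN32
        return heap_ ? HeapAlloc(heap_, 0, size) : nullptr;
#else
        return std::malloc(size);
#endif
    }

    void release(void* ptr) {
#ifdef _WIN32
        if (heap_ && ptr) HeapFree(heap_, 0, ptr);
#else
        std::free(ptr);
#endif
    }

private:
#ifdef _WIN32
    HANDLE heap_ = nullptr;
#endif
};
```

The HeapCreate documentation also describes a cap on the size of a single 
block in fixed-size heaps, which would need checking against typical block 
sizes before relying on this approach.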

Regarding tcmalloc, I've had to use it on Linux too, but only in scenarios 
involving multithreading, where it helps reduce RAM fragmentation: cf. 
https://gdal.org/user/multithreading.html#ram-fragmentation-and-multi-threading 
. I've just quickly tried to use it on Windows to test it on this scenario, but 
didn't really manage to make it work. Even building it was challenging. 
Actually I tried https://github.com/gperftools/gperftools and I had to build 
from master since the latest tagged version doesn't build with CMake on 
Windows. But then nothing happens when linking tcmalloc_minimal.lib against my 
toy app. I probably missed something.

Anyway, I don't really think we can force tcmalloc to be used in GDAL, as a 
library. Unless there were a way to have its allocator optionally used at 
places that we control (i.e. explicitly call tc_malloc / tc_free) without 
replacing the default malloc / free etc., which might be undesirable when GDAL 
is just a component of a larger application.

Disabling the block cache entirely (or setting it to a minimum value) is only a 
workable option for uncompressed formats, or if you use per-band blocks 
(INTERLEAVE=BAND in GTiff language) and not one block for all bands 
(INTERLEAVE=PIXEL); otherwise you'll pay for the decompression multiple times.

On 21/03/2024 at 14:38, Meyer, Jesse R. (GSFC-618.0)[SCIENCE SYSTEMS AND 
APPLICATIONS INC] via gdal-dev wrote:
+1.  We use a variety of hand-rolled VirtualAlloc based (for basic tasks, a 
simple pointer bump, and for more elaborate needs, a ‘buddy’) allocators, some 
of which try to be smart about memory usage via de-committing regions.  In our 
work, we tend to disable the GDAL cache entirely and rely on the file system’s 
file cache instead, which is a simplification we can make but is surely 
untenable in general here.

From: gdal-dev <gdal-dev-boun...@lists.osgeo.org> on behalf of Abel Pau via 
gdal-dev <gdal-dev@lists.osgeo.org>
Reply-To: Abel Pau <a....@creaf.uab.cat>
Date: Thursday, March 21, 2024 at 4:51 AM
To: "gdal-dev@lists.osgeo.org" <gdal-dev@lists.osgeo.org>
Subject: [EXTERNAL] [BULK] Re: [gdal-dev] Experience with slowness of free() on 
Windows with lots of allocations?

Hi Even,

you’re right. We also know that. When programming the driver I took it into 
consideration. Our solution is not to rely on Windows to do a good job with 
memory; we try to reuse as much memory as possible instead of calling 
calloc/free freely.

For instance, in the driver, for each feature I have to read or write the 
coordinates. I could do that each time it is needed, which is a lot of times: 
allocate memory for reading, put the coordinates on the feature, then free it, 
over and over. What do I do instead? When opening the layer I create some 
memory blocks of 250 MB (due to the format itself) and I use that memory to 
manage whatever I need. And when closing, I free it.
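
A minimal sketch of that allocate-once, reuse-per-feature pattern, not the 
driver's actual code. The names Point and LayerReader are hypothetical, the 
coordinates are synthetic, and a std::vector stands in for the preallocated 
blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point {
    double x;
    double y;
};

// Hypothetical layer reader that owns one scratch buffer for its lifetime:
// allocated when the layer is opened, released when it is closed.
class LayerReader {
public:
    explicit LayerReader(std::size_t max_points) {
        scratch_.reserve(max_points);  // the only allocation
    }

    // Fills the shared buffer with n synthetic coordinates. As long as n
    // stays within the reserved capacity, no allocation or free happens.
    const std::vector<Point>& read_feature(std::size_t n) {
        scratch_.clear();  // drops the contents but keeps the capacity
        for (std::size_t i = 0; i < n; ++i) {
            scratch_.push_back(Point{static_cast<double>(i),
                                     static_cast<double>(i) * 2.0});
        }
        return scratch_;
    }

    std::size_t capacity() const { return scratch_.capacity(); }

private:
    std::vector<Point> scratch_;
};
```

The returned reference is only valid until the next read_feature() call, which 
is the usual price of sharing one buffer across features.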

While doing that I observed that sometimes I have to use GDAL code that doesn’t 
take this into consideration (CPLRecode() for instance). Perhaps it could be 
improved as well.

Thanks for noticing that.

From: gdal-dev <gdal-dev-boun...@lists.osgeo.org> On behalf of Javier Jimenez 
Shaw via gdal-dev
Sent: Thursday, 21 March 2024 8:27
To: Even Rouault <even.roua...@spatialys.com>
CC: gdal dev <gdal-dev@lists.osgeo.org>
Subject: Re: [gdal-dev] Experience with slowness of free() on Windows with lots 
of allocations?

In my company we confirmed that "Windows heap allocation mechanism sucks."
Closing the application after using gtiff driver can take many seconds due to 
memory deallocations.

One workaround was to use tcmalloc. I will ask my colleagues for more details 
next week.

On Thu, 21 Mar 2024, 01:55 Even Rouault via gdal-dev, 
<gdal-dev@lists.osgeo.org> wrote:
Hi,

while investigating
https://github.com/OSGeo/gdal/issues/9510#issuecomment-2010950408, I've
come to the conclusion that the Windows heap allocation mechanism sucks.
Basically, if you allocate a lot of heap regions of modest size with
malloc()/new[], freeing them all with the corresponding free()/delete[]
is excruciatingly slow (like ~ 10 seconds for ~ 80,000 allocations). The
slowness is clearly quadratic in the number of allocations. You only
start noticing it with ~ 30,000 allocations. And interestingly, another
condition for that slowness is that each individual allocation must be
strictly greater than 4096 * 4 bytes. At exactly that value, perf is
acceptable, but add one extra byte, and it suddenly drops. I suspect
that there must be a threshold above which malloc() starts using
VirtualAlloc() instead of the heap, which must involve slow system
calls instead of a user-land allocation mechanism.
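
A small sketch of a reproducer for the pattern described above, in case it
helps others measure on their own setup. The names AllocTiming and time_free
are hypothetical, and this is not the code used in the investigation.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical result type: blocks actually allocated, and seconds spent
// freeing all of them.
struct AllocTiming {
    std::size_t allocated;
    double free_seconds;
};

// Allocates `count` blocks of `block_size` bytes with malloc(), then times
// how long it takes to free() them all.
AllocTiming time_free(std::size_t count, std::size_t block_size) {
    std::vector<void*> blocks;
    blocks.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        void* p = std::malloc(block_size);
        if (!p) break;                  // stop early if memory runs out
        static_cast<char*>(p)[0] = 1;   // touch the block once
        blocks.push_back(p);
    }

    const auto t0 = std::chrono::steady_clock::now();
    for (void* p : blocks) std::free(p);
    const auto t1 = std::chrono::steady_clock::now();

    return AllocTiming{blocks.size(),
                       std::chrono::duration<double>(t1 - t0).count()};
}
```

Comparing time_free(80000, 4096 * 4) with time_free(80000, 4096 * 4 + 1)
would exercise both sides of the threshold mentioned above. Timings will
depend on the platform and C runtime.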

Has anyone already hit that and found solutions? The only potential idea
I have found so far would be to use a private heap with HeapCreate() with
a fixed maximum size, which is a bit problematic to adopt by default:
basically that would mean that the size of GDAL_CACHEMAX would be
consumed as soon as one uses the block cache.

Even

--
http://www.spatialys.com
My software is free, but my time generally not.

_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev


