Charles-François Natali added the comment:

> @Charles-François: I think your worries about calloc and overcommit are
> unjustified. First, calloc and malloc+memset actually behave the same way
> here -- with a large allocation and overcommit enabled, malloc and calloc
> will both go ahead and return the large allocation, and then the actual
> out-of-memory (OOM) event won't occur until the memory is accessed. In the
> malloc+memset case this access will occur immediately after the malloc,
> during the memset -- but this is still too late for us to detect the malloc
> failure.
Not really: what you describe only holds for a single object. But if you
allocate, let's say, 1000 such objects at once:

- in the malloc + memset case, the committed pages are progressively
  accessed (i.e. the pages for object N are touched before the memory is
  allocated for object N+1), so they are counted not only as committed but
  also as active (for example, the RSS increases gradually): at some point,
  even though the Linux VM subsystem is by default really lenient toward
  overcommitting, malloc/mmap will likely return NULL because of this
  (a C sketch of this effect appears at the end of this message)

- in the calloc() case, all the memory is first committed but not touched:
  the kernel will likely happily overcommit all of it. Only when you start
  progressively accessing the pages will the OOM kick in.

> Second, OOM does not cause segfaults on any system I know. On Linux it
> wakes up the OOM killer, which shoots some random (possibly guilty)
> process in the head. The actual program which triggered the OOM is quite
> likely to escape unscathed.

Ah, did I say segfault? Sorry, I of course meant that the process will get
nuked by the OOM killer.

> In practice, the *only* cases where you can get a MemoryError on modern
> systems are (a) if the user has turned overcommit off, (b) you're on a
> tiny embedded system that doesn't have overcommit, (c) if you run out of
> virtual address space. None of these cases are affected by the
> differences between malloc and calloc.

That's a common misconception: provided that the allocated memory is
accessed progressively (see the above point), you'll often get ENOMEM,
even with overcommitting:

$ /sbin/sysctl -a | grep overcommit
vm.nr_overcommit_hugepages = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
$ cat /tmp/test.py
l = []
with open('/proc/self/status') as f:
    try:
        for i in range(50000000):
            l.append(i)
    except MemoryError:
        for line in f:
            if 'VmPeak' in line:
                print(line)
        raise
$ python /tmp/test.py
VmPeak:   720460 kB

Traceback (most recent call last):
  File "/tmp/test.py", line 7, in <module>
    l.append(i)
MemoryError

I have a 32-bit machine, but the process definitely has more than 720 MB
of address space available ;-)

If your statement were true, it would be almost impossible to get ENOMEM
with overcommitting on a 64-bit machine, which is - fortunately - not true.
Just try python -c "[i for i in range(<large value>)]" on a 64-bit machine:
I'll bet you'll get a MemoryError (ENOMEM).
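To make the first point above concrete, here is a minimal, Linux-only C
sketch (the file name, block count, and block size are arbitrary choices
for illustration, not from the discussion above). It allocates many large
blocks either with malloc + memset or with calloc, then prints VmRSS from
/proc/self/status. In the malloc + memset variant every page is touched as
soon as its block is allocated, so the RSS climbs with each iteration; in
the calloc variant the freshly mapped pages are typically left untouched
(glibc's calloc can skip the clearing when the memory comes from mmap, as
the kernel hands out zero pages), so the kernel can overcommit all of it
and the RSS stays small until the blocks are actually accessed.

/* alloc_demo.c -- illustrative sketch, not a benchmark.
 * Build: cc -O2 -o alloc_demo alloc_demo.c
 * Run:   ./alloc_demo          (malloc + memset)
 *        ./alloc_demo calloc   (calloc)
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBLOCKS   1000
#define BLOCKSIZE (8 * 1024 * 1024)   /* 8 MiB per block */

/* Crude scan of /proc/self/status; good enough for a demo. */
static void print_rss(const char *label)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];

    if (f == NULL)
        return;
    while (fgets(line, sizeof line, f) != NULL)
        if (strncmp(line, "VmRSS", 5) == 0)
            printf("%s -> %s", label, line);
    fclose(f);
}

int main(int argc, char *argv[])
{
    int use_calloc = argc > 1 && strcmp(argv[1], "calloc") == 0;
    static void *blocks[NBLOCKS];
    int i;

    for (i = 0; i < NBLOCKS; i++) {
        blocks[i] = use_calloc ? calloc(1, BLOCKSIZE)
                               : malloc(BLOCKSIZE);
        if (blocks[i] == NULL) {
            /* With malloc + memset this tends to happen part-way through;
             * depending on vm.overcommit_memory the process may instead be
             * killed by the OOM killer while touching pages in memset(). */
            printf("allocation %d failed (ENOMEM)\n", i);
            break;
        }
        if (!use_calloc)
            memset(blocks[i], 0, BLOCKSIZE);  /* touches every page now */
    }
    print_rss("after allocating");
    return 0;
}

On a machine with less free memory than NBLOCKS * BLOCKSIZE (8 GiB here),
the malloc variant should either fail part-way through or drive the RSS up
immediately, while the calloc variant usually completes with a small RSS
and defers the OOM to the first access of the blocks - which is exactly
the asymmetry described above.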