On Fri, Dec 13, 2013 at 11:14 AM, Steven D'Aprano
<steve+comp.lang.pyt...@pearwood.info> wrote:
> Back in the 1980s, when I was a Mac user and occasional programmer, there
> were memory manager routines which (if I recall correctly) would tell you
> whether or not an allocation would succeed. Macs of that vintage
> didn't have virtual memory, it is true, but the principle remains sound:
> "is there a single contiguous block of N bytes available? if so, use a
> list comprehension, if not, switch to an in-place algorithm". Until
> Python, I had never used a programming language that didn't give you at
> least a rough-and-ready way to determine how much memory was available so
> you could make an informed guess as to whether an expensive operation
> would succeed before actually attempting it.
>
> I will be very disappointed (but hardly surprised) to learn that
> functionality which was standard in 1984 computers is not possible in
> 2013.
The situation is a bit different. Virtualization of memory makes it a lot
harder to figure out what can and can't be allocated without paging. As long
as you're okay with it being really rough, you could probably get some info
from the system. On my Linux box, /proc/meminfo tells me:

MemTotal:       16452036 kB
MemFree:         1689636 kB
Buffers:          251240 kB
Cached:          6228724 kB

That suggests I can allocate 1.7GB + 0.25GB + 6.2GB = roughly 8GB before
hitting the swapper. (Buffers and Cached represent memory that is currently
in use but will be reclaimed to satisfy other requests - most of it is
caching stuff off the primary filesystem.) In very VERY round figures, that
is how much it ought to be safe to allocate.

Now, how that compares with your list requirements is a separate question.
I've no idea how many list elements would fit in that 8GB, but it probably
won't be a billion (which is the theoretical maximum at one 64-bit pointer
per element).

This might tie in nicely with the theory I suggested about splitting the job.
You could pick a SPLIT by figuring how much memory is free, dividing that by
your pointer size (or some other estimate of per-element usage), halving it
because you need two lists, and then halving again for safety.

On this same 64-bit system, sys.getsizeof(list(range(N))) for various values
of N seems to suggest that it approaches 9*N. So by the guesstimate above, I
come up with a plausible SPLIT of 226933 == (1689636+251240+6228724)//9//2//2,
not forgetting that this is scaled by a factor of a thousand because the
meminfo figures are in kB (so 226 million elements would be safe).

Alas, that's not true, because I just tried it and hit the swapper... even
though sys.getsizeof() returned the expected N*9+112 figure. Which is because
I'm an idiot, of course. The reason it takes more is that I'm creating all
those integers too... actually, I'm quite impressed that, according to
getsizeof(), that's costing almost nothing extra:

>>> sys.getsizeof(list(range(100000000)))
900000112
>>> sys.getsizeof([None]*100000000)
800000064

Not quite sure what's going on there. I know those integers are taking up
more than 1 byte of storage each, so getsizeof() probably isn't the best way
to calculate the real cost.

Anyway, a bit of experimentation should tell you how much you can do safely,
and as long as there's a good margin of error, this would work as an
extremely rough guide. Now, getting that info in a cross-platform way...
that's the problem!

ChrisA
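As a rough illustration of the approach described above (not from the
original post): the sketch below reads /proc/meminfo, so it is Linux-only,
and the 9-bytes-per-element estimate plus the two halvings are just the
guesses from this thread, not measured constants.

# Illustrative sketch only (Linux-specific): estimate a "safe" SPLIT from
# /proc/meminfo using the guesses above - about 9 bytes per list element,
# two lists, plus an extra halving for safety.  BYTES_PER_ELEMENT and the
# safety factors are assumptions carried over from the discussion.

BYTES_PER_ELEMENT = 9  # guesstimate from sys.getsizeof(list(range(N)))

def estimated_free_kb():
    """Return MemFree + Buffers + Cached from /proc/meminfo, in kB."""
    fields = {}
    with open('/proc/meminfo') as f:
        for line in f:
            name, rest = line.split(':', 1)
            fields[name] = int(rest.split()[0])  # first token is the kB value
    return fields['MemFree'] + fields['Buffers'] + fields['Cached']

def safe_split():
    """Crude upper bound on list length: free bytes / 9, halved twice."""
    free_bytes = estimated_free_kb() * 1024
    return free_bytes // BYTES_PER_ELEMENT // 2 // 2

print("Rough safe list size: %d" % safe_split())

For the cross-platform part, the third-party psutil package exposes
psutil.virtual_memory().available, which reports a comparable "allocatable
without swapping" figure on most platforms.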