On Tue, 2011-06-07 at 21:03 +0200, David Zülke wrote:
> 144 (not 114!) bytes is for an integer; I'm not quite sure what the
> overheads are for arrays, which token_get_all() produces in
> abundance :) An empty array seems to occupy 312 bytes of memory.
> 
> Also, strings have memory allocated in 8 byte increments as far as I
> know, so "1" eats up 8 bytes, and "12345678901234567" will consume 24
> bytes for the raw text, not 17.
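
The rounding David describes can be sketched in a few lines of Python. The 8-byte granularity is his recollection ("as far as I know"), not something verified here, and the helper name rounded_size is made up for illustration.

```python
def rounded_size(length, granularity=8):
    """Round a byte length up to the next multiple of `granularity`.

    The 8-byte default is the assumption quoted above, not a verified
    property of PHP's allocator.
    """
    return (length + granularity - 1) // granularity * granularity


print(rounded_size(len("1")))                  # 8
print(rounded_size(len("12345678901234567")))  # 24
```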

I'm too lazy to do the actual math (the best approach would be to check
sizeof(zval), sizeof(HashTable) and sizeof(Bucket) on your system), and
there are a few things to consider:

      * The sizes differ between 32-bit and 64-bit builds; on 64-bit
        there's also a difference between Windows and Unix/Linux (on
        Windows a long is still 32 bits but pointers are 64 bits; on
        Linux/Unix both are 64 bits)
      * On some architectures data has to be aligned in memory in some
        way, which might waste memory on padding
      * As David mentioned, HashTables (arrays) are more complex.
      * token_get_all() returns an array of (string | array of (long,
        string, long) )
      * A long takes sizeof(zval)
      * A string takes sizeof(zval)+strlen()+1
      * An array is a HashTable plus space for buckets; this includes
        room for some unused elements
      * Each element inside the HT needs additional space for a Bucket
        with some metadata
      * While running your script you also keep the complete script file
        in memory. You also keep some temporary parser data in memory
        while the resulting array is being filled.
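
The first bullet (long versus pointer size) can be checked without building anything. This is only a sketch: it uses Python's ctypes module, so it reports the sizes for the C compiler that built the Python interpreter, which is assumed here to match the one that built PHP on the same machine. The variable names are illustrative.

```python
import ctypes

# Sizes as seen by the C compiler that built this Python interpreter.
# Assumption: PHP on the same system was built with the same data model.
long_bytes = ctypes.sizeof(ctypes.c_long)
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)

print("long:    %d bytes" % long_bytes)
print("pointer: %d bytes" % pointer_bytes)

if pointer_bytes == 8 and long_bytes == 4:
    print("64-bit Windows-style model: 32-bit long, 64-bit pointers")
elif pointer_bytes == 8 and long_bytes == 8:
    print("64-bit Unix/Linux-style model: long and pointers both 64 bit")
else:
    print("32-bit (or other) model")
```

This says nothing about sizeof(zval) itself; for that you still need the PHP headers.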

In the end it's not fully trivial to gather the size needed. And I'm
sure my list is missing loooots of things.
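
As a very rough sketch of how the pieces in the list add up, something like the following Python could be used. Everything in it is an assumption: the function and parameter names are invented, every token is pessimistically treated as an array of (long, string, long), and the sizes passed in are the figures quoted in this thread (144 and 312 from David, the token count and file size from Mike) plus a Bucket size that is a pure placeholder, not a measurement.

```python
def estimate_bytes(tokens, source_bytes, zval, empty_array, bucket):
    """Rough size of a token_get_all() result, in bytes.

    tokens       -- number of tokens in the result
    source_bytes -- size of the script; stands in for all string contents
    zval         -- sizeof(zval) on your system
    empty_array  -- size of an empty array (zval + HashTable)
    bucket       -- sizeof(Bucket) on your system
    """
    per_token = (
        empty_array   # the token's own array
        + 3 * zval    # the long, string and long inside it
        + 3 * bucket  # one Bucket per inner element
        + bucket      # the token's Bucket in the outer array
    )
    return empty_array + tokens * per_token + source_bytes


estimate = estimate_bytes(
    tokens=276697,         # from Mike's calculation
    source_bytes=2165950,  # from Mike's calculation
    zval=144,              # David's figure for an integer
    empty_array=312,       # David's figure for an empty array
    bucket=64,             # placeholder only, measure sizeof(Bucket)
)
print("%.1f MiB" % (estimate / 1024.0 / 1024.0))
```

It ignores alignment, unused HashTable slots, the copy of the script and the parser's temporary data, so the real number will be higher.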

http://schlueters.de/blog/archives/142-HashTables.html has a short
introduction to HashTables, skipping many of the details.

johannes

> David
> 
> 
> On 07.06.2011, at 20:26, Mike van Riel wrote:
> 
> > Am I then also correct to assume that the output of
> > memory_get_peak_usage is used for determining the memory_limit?
> > 
> > Also: after correcting with your new information (zval = 114 bytes
> > instead of 68) I still have a rather large offset:
> > 
> >    640952+2165950+114+(276697*114)+(276697*3*114)+2165950 = 131146798 = 125M
> > 
> > (not trying to be picky here; I just don't understand)
> > 
> > _If_ my calculations are correct then a zval should be approx 216 bytes
> > (excluding string contents):
> > 
> >    ((244000000-640952-2165950-2165950) / 4) / 276697 = 215.9647B
> > 
> > Mike
> > 
> > On Tue, 2011-06-07 at 19:50 +0200, David Zülke wrote:
> >> memory_get_peak_usage() is the maximum amount of memory used by the VM of 
> >> PHP (but not by some extensions for instance) up until the point where 
> >> that function is called. So the actual memory usage may be even higher 
> >> IIRC. But yeah, you're basically right. I've explained in another message 
> >> why it might be so much more than you expected (zval overhead, basically)
> >> 
> >> David
> > 
> > 
> > 
> 



-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
