Hi,

I'm writing a conference paper about phc, in which I discuss the PHP C API in some detail. The text I intend to include is below. I'd be grateful if any of the experts on the topic could point out flaws in my account. All the details come from Sara Golemon's book, my experience with the PHP embed SAPI, study of the source code, and following discussions on this list.
<quote> The primitive unit of data in PHP is the zval, a small structure comprising a union of values (objects, hashtables, strings and numeric types) together with memory-management counters and flags. A PHP variable is a symbol-table entry pointing to a zval. Multiple variables can point to the same zval, which is managed by reference counting.

Objects in PHP are copied by reference. Assignment of primitive values, however, is by copy: semantically, the l-value becomes a copy of the r-value. As an optimization, the PHP run-time makes the l-value share the r-value's zval, incrementing its reference count, and the two variables become part of the same copy-on-write set. Assignment can also be by reference, which in a similar fashion puts the two variables in the same change-on-write set. This sets the is_ref flag of the shared zval, indicating that the variables in this set all reference each other. Updating a variable which is a reference updates its zval, changing the value of all the other variables in that change-on-write set. Variables in a copy-on-write set share the same zval, but are not semantically related. Although copy-on-write is an optimization applied by the PHP run-time, phc must deal with it in order to interact with the run-time, and so it reuses it for performance.

In order to update the value of a variable in a copy-on-write set, the variable must first be separated. A copy of its zval is created (a deep copy in the case of arrays and strings) and the original zval has its reference count decremented. zvals with a reference count of zero are deallocated. Variables in a change-on-write set must similarly be separated if they are assigned to a copy-on-write set. Otherwise, assignment to a variable overwrites its zval's value field, changing the value of all the variables in that change-on-write set. Variables with a reference count of one, which are in neither a copy-on-write nor a change-on-write set, are treated similarly.
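The sharing and separation rules above can be sketched with a toy model in C. This is only an illustration under simplifying assumptions: `toy_zval`, `toy_var` and all the `toy_*` functions are hypothetical names, the value is a single `long` rather than a union, and none of this is the actual Zend API or the code phc generates.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a zval; the real structure holds a union of value types. */
typedef struct toy_zval {
    long value;    /* stands in for the value union */
    int refcount;  /* number of variables pointing at this zval */
    int is_ref;    /* 1: change-on-write set, 0: copy-on-write set */
} toy_zval;

/* A "variable" is just a pointer slot, like a symbol-table entry. */
typedef toy_zval *toy_var;

static toy_zval *toy_new(long value) {
    toy_zval *z = malloc(sizeof *z);
    if (z == NULL) abort();
    z->value = value;
    z->refcount = 1;
    z->is_ref = 0;
    return z;
}

/* Drop one reference; free the zval once nothing points at it. */
static void toy_release(toy_zval *z) {
    if (--z->refcount == 0) free(z);
}

/* Give *var a private zval if it currently shares one by copy-on-write. */
static void toy_separate(toy_var *var) {
    toy_zval *old = *var;
    if (old->refcount > 1 && !old->is_ref) {
        *var = toy_new(old->value); /* copy the value */
        old->refcount--;            /* others still hold it, so no free */
    }
}

/* $var = value; */
static void toy_write(toy_var *var, long value) {
    toy_separate(var);     /* no-op for refcount 1 or is_ref zvals */
    (*var)->value = value; /* overwrite in place */
}

/* $dst = $src;  (both variables must already hold a zval) */
static void toy_assign_copy(toy_var *dst, toy_var *src) {
    if (*dst == *src) return;
    if ((*dst)->is_ref) {          /* dst is a reference: update its whole set */
        (*dst)->value = (*src)->value;
    } else if ((*src)->is_ref) {   /* cannot share an is_ref zval by copy */
        toy_release(*dst);
        *dst = toy_new((*src)->value);
    } else {                       /* share the zval: copy-on-write set */
        (*src)->refcount++;
        toy_release(*dst);
        *dst = *src;
    }
}

/* $dst =& $src; */
static void toy_assign_ref(toy_var *dst, toy_var *src) {
    toy_separate(src);   /* src leaves any copy-on-write set first */
    (*src)->is_ref = 1;
    (*src)->refcount++;
    toy_release(*dst);
    *dst = *src;
}

/* Walks through the cases in the text; returns 0 if all checks hold. */
static int toy_demo(void) {
    toy_var a = toy_new(1), b = toy_new(0), c = toy_new(0);

    toy_assign_copy(&b, &a);  /* $b = $a;  shared, refcount 2 */
    assert(a == b && a->refcount == 2);

    toy_write(&b, 5);         /* $b = 5;   b is separated first */
    assert(a != b && a->value == 1 && b->value == 5);

    toy_assign_ref(&c, &a);   /* $c =& $a; change-on-write set */
    toy_write(&c, 7);         /* $c = 7;   a sees the update */
    assert(a == c && a->is_ref && a->value == 7);

    toy_assign_copy(&b, &a);  /* $b = $a;  must copy, not share */
    assert(b != a && b->value == 7 && a->refcount == 2);

    toy_release(a);
    toy_release(b);
    toy_release(c);
    return 0;
}
```

Calling `toy_demo()` from a `main` exercises each case in turn: sharing on assignment by copy, separation on write, in-place update through a reference, and the forced copy when a reference is assigned by copy. Deep copying of arrays and strings is deliberately left out of the sketch.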
The PHP interpreter keeps a reference to each variable's zval in the global and function-local symbol-tables, which are hashtables indexed by the variable's name. When a function finishes execution, its local symbol-table is destroyed, decrementing the reference count of every zval it contains. The global symbol-table is destroyed at the end of the execution of a script.

Because each variable is stored through a symbol-table, it uses a great deal of space. The zval itself is 16 bytes long, but the symbol-table is a hashtable with a 36-byte bucket. Combined with memory allocation overhead, each variable occupies 68 bytes on a 32-bit platform [?], and nearly double that on a 64-bit platform. This means that variable allocations, copies, separations and deallocations are quite expensive: according to our profiling, the PHP interpreter spends over 20% of its time in memory management, a figure which does not include time spent incrementing and decrementing reference counts. Because phc re-uses the PHP run-time, it is afflicted with the same problem. </quote>

Thanks in advance,
Paul

--
Paul Biggar
[EMAIL PROTECTED]