I found the issue that was tripping me with the optimization I was
attempting in #1. 

Basically, there seem to be cases where "op_arrays" get destroyed during
middle of program execution (e.g., op_array corresponding to the live
code in a file seems to get destroyed at the end of the "include"
operation; and I guess even the op_arrays corresponding to EVAL
operations have the same property). Since "destroy_op_array()" frees the
IS_CONST operands unconditionally (i.e. with ref count checking), with
the attempted optimization, one can end up with pointers into freed
data. The following simple test reproduced the issue:

a.php:
------ 
<?php
$y = "hellllllloooo world";
 
b.php:
------
<?php
include 'a.php';
print $y;

During execution of live code in "a.php" -- $y will end up pointing to
the IS_CONST operand of the "=" (ASSIGN) opcode in the live code
op_array for "a.php". But at the end of the 'include a.php' statement in
b.php, this op_array gets freed. And an attempt to print $y prints
garbage.

I am going to explore working around the problem by deferring cleanup of
live code op_arrays to the end. [Perhaps for the case of "eval"-- I just
won't enable the optimization.] But perhaps the details are not of
general interest at this point... so I will refrain from additional mail
on the topic.

regards,
Kannan Muthukkaruppan
-----Original Message-----
From: Kannan Muthukkaruppan 
Sent: Thursday, January 10, 2008 6:33 PM
To: Andi Gutmans; internals@lists.php.net
Subject: RE: [PHP-DEV] php-5.2.3: zend_opcode.c: pass_two() question


Thanks, that helps.

#1. In my case, I don't have op_arrays that are shared by multiple
threads/processes. So I attempted to comment out the IS_CONST patching
(is_ref=1/refcount=2) that happens in pass_two() in the hope of
assigning constants by simple reference-count increments. While my basic
test cases worked fine, I am running into some heap corruptions with
non-trivial test cases. I haven't yet been able to come up with a
smaller test case yet. I'll continue to debug this..

But it leads me to believe that there is some other reason as well (in
addition to the one you point out) for doing this IS_CONST patchup. If
you happen to know other reasons, it'll be very helpful.

#2. For dealing with "read-only" op-arrays, perhaps one approach would
be set the REFCOUNT of these IS_CONSTs to a special high value that is
recognized by the run-time as being part of a read-only/immutable zval.
And teach the assignment etc. operations to recognize these as special
(i.e. increment/decrement on this special value should neither be done,
not is it needed).

#3. To give a little background of why I started looking into this: I
noticed that array literals aren't getting recognized & converted into
constants (IS_CONST operands) except when they occur as the RHS of a
"static" declaration or the default value of an optional parameter. I
prototyped some changes to the compiler so that anytime an array
expression contains only literals (including maybe nested literal
arrays), it gets treated as an IS_CONST operand. 

So for something like:

  $z = array(array("a", "b", "c"), array("d", "e", "f"));

The old opcode sequence was:
           INIT_ARRAY                                       ~4, 'a'
           ADD_ARRAY_ELEMENT                                ~4, 'b'
           ADD_ARRAY_ELEMENT                                ~4, 'c'
           INIT_ARRAY                                       ~5, ~4
           INIT_ARRAY                                       ~6, 'd'
           ADD_ARRAY_ELEMENT                                ~6, 'e'
           ADD_ARRAY_ELEMENT                                ~6, 'f'
           ADD_ARRAY_ELEMENT                                ~5, ~6
           ASSIGN                                               !0, ~5

New opcode sequence is simply:
          ASSIGN                                               !0,
<array>

While I did get some performance benefit with this simply by having less
opcodes & thus less overhead in the interpreter loop, I had actually
hoped to see more. Upon investigation, I realized that I hadn't really
avoided the copy (due to the "is_ref=1" on the IS_CONST operand). And
that's the reason for posting the question in this thread.
 
Since PHP disallows array literals in the context of "define" & "class
constants", it seems to be a common programmatic idiom to assign array
literals (such as list of application error messages, or a map of
product ids to name, etc.) to a global variable and refer to the array
everywhere in the program using that global variable. But then, there is
a per-request penalty associated with building/tearing down such globals
that are really just pointing to "read-only" data. This overhead would
go away if assignments from an IS_CONST operand to a target didn't
perform copies.

Regards,
Kannan
-----Original Message-----
From: Andi Gutmans [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 10, 2008 3:46 PM
To: Kannan Muthukkaruppan; internals@lists.php.net
Subject: RE: [PHP-DEV] php-5.2.3: zend_opcode.c: pass_two() question

Op arrays are read-only. This is a design goal and is especially
important for byte code caches where several processes are looking at
the same op array. Therefore setting IS_CONST to is_ref=1/refcount=2 we
are ensuring that they will not be changed.
The copy happens only the first time and afterwards that zval may be
used, incremented, etc...

Net net, it is intended behavior and very important.
Andi


> -----Original Message-----
> From: Kannan Muthukkaruppan [mailto:[EMAIL PROTECTED]
> Sent: Thursday, January 10, 2008 3:36 PM
> To: internals@lists.php.net
> Subject: [PHP-DEV] php-5.2.3: zend_opcode.c: pass_two() question
> 
> An example assignment such as:
>         $x = 10;
> doesn't seem to get handled as a simple ref-count incrementing 
> assignment (in zend_assign_to_variable()). Instead, $x ends up getting

> a new "zval" into which a copy of the RHS (10) is made.
> 
> This seems to be because the zval/znode corresponding to the literal
10
> is an  IS_CONST operand of the assignment opcode. And, during the end 
> of compilation of a function, the IS_CONST operands in the function's 
> op_array are patched up as "references" (is_ref is set to 1).
> 
> zend_opcode.c:pass_two()
>   ...
>  while (opline < end) {
>   if (opline->op1.op_type == IS_CONST) {
>    opline->op1.u.constant.is_ref = 1;
>    opline->op1.u.constant.refcount = 2; /* Make sure is_ref won't be 
> reset */
>   }
>   if (opline->op2.op_type == IS_CONST) {
>    opline->op2.u.constant.is_ref = 1;
>    opline->op2.u.constant.refcount = 2;
>   }
>    ...
> 
> And zend_assign_to_variable(), when it sees the RHS to be a reference 
> ("is_ref == 1") ends up doing a copy rather than a ref-count 
> incrementing assignment.
> 
> It seems like performance would be better if we didn't mark constant 
> operands as "references" in pass_two(). But presumably it is required 
> to handle some cases that'll otherwise break? I am trying to 
> understand what those cases are.  If someone can shed further light on

> this
issue,
> it is much appreciated...
> 
> regards,
> Kannan Muthukkaruppan

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to