Hello Christian,

I have added a link to your proposal as a PHP GSoC 2008 idea here:
http://wiki.php.net/gsoc/2008
Feel free to add to this idea in whatever way you want :-)

marcus

Saturday, December 22, 2007, 4:08:04 PM, you wrote:

> Hi,
>
> I was following this thread and came upon Jeff's posting on how closures
> could be implemented in PHP.
>
> Since I would find the feature to be EXTREMELY useful, I decided to
> actually implement it more or less the way Jeff proposed. So, here's the
> patch (against PHP_5_3, I can write one against HEAD if you wish):
>
> http://www.christian-seiler.de/temp/closures-php-5-3.patch
>
> I started with Wez's patch for adding anonymous functions that aren't
> closures. I changed it to make sure no shift/reduce or reduce/reduce
> conflicts occur in the grammar. Then I started implementing the actual
> closure stuff. It was fun because I learned quite a lot about how PHP
> actually works.
>
> I had the following main goals while developing the patch:
>
> 1. Don't reinvent the wheel.
> 2. Don't break anything unless absolutely necessary.
> 3. Keep it simple.
>
> Jeff proposed a new type of zval that holds additional information about
> the function that is to be called. Adding a new type of zval would need
> changes throughout the ENTIRE PHP source and probably also throughout
> quite a few scripts. But fortunately, PHP already possesses a zval that
> supports the storage of arbitrary data while being very lightweight:
> Resources. So I simply added a new resource type that stores zend
> functions. A statement such as $var = function () {}; will now make
> $var a resource (of the type "anonymous function").
>
> Anonymous functions are ALWAYS defined at compile time, no matter where
> they occur. They are simply named __compiled_lambda_1234 and added to the
> global function table. But instead of simply returning the string
> '__compiled_lambda_1234', I introduced a new opcode that will create
> references to the correct local variables that are referenced inside the
> function.
>
> For example, if you have:
>
> $func = function () {
>     echo "Hello World\n";
> };
>
> This will result in an anonymous function called '__compiled_lambda_0'
> that is added to the function table at compile time. The opcodes for the
> assignment to $func will be something like:
>
> 1 ZEND_DECLARE_ANON_FUNC ~0 '__compiled_lambda_0'
> 2 ASSIGN !0, ~0
>
> The ZEND_DECLARE_ANON_FUNC opcode handler does the following:
>
> It creates a new zend_function, copies the contents of the entire
> structure of the function table entry corresponding to
> '__compiled_lambda_0' into that new structure, increments the refcount,
> registers it as a resource and returns that resource so it can be
> assigned to the variable.
>
> Now, have a look at a real closure:
>
> $string = "Hello World!\n";
> $func = function () {
>     lexical $string;
>     echo $string;
> };
>
> This will result in the same opcodes as above. But here, three
> additional things happen:
>
> 1. The compiler sees the keyword 'lexical' and stores the information
> that a variable called 'string' should be used inside the closure.
>
> 2. The opcode handler sees that a variable named 'string' is marked as
> lexical in the function definition. Therefore it creates a reference to
> it in a HashTable of the COPIED zend_function (that will be stored in
> the resource).
>
> 3. The 'lexical $string;' translates into a FETCH opcode that will work
> in exactly the same way as 'static' or 'global' - only fetching it from
> the additional HashTable in the zend_function structure.
>
> The resource destructor makes sure that the HashTable containing the
> references to the lexical variables is correctly destroyed upon
> destruction of the resource. It does NOT destroy other parts of the
> function structure because they will be freed when the function is
> removed from the global function table.
>
> With these changes, closures work in PHP.
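
A note for readers who want to try the capture-by-reference behaviour described in steps 1 to 3: the 'lexical' keyword exists only in this patch, so the snippet below is a minimal sketch that assumes the `use (&$var)` clause of released PHP (5.3 and later) as a stand-in. It is not the patch's implementation; it only illustrates that a closure holding a reference to an outer variable sees later changes to it. The variable names $first and $second are made up for the demo.

```php
<?php
// Assumption: released PHP's `use (&$var)` syntax stands in for the
// patch's `lexical $var;` statement.
$string = "Hello World!\n";

// Capture $string by reference, analogous to the reference the opcode
// handler stores in the HashTable of the copied zend_function.
$func = function () use (&$string) {
    return $string;
};

$first = $func();

// The closure holds a reference, so a later change to the outer
// variable is visible inside the closure.
$string = "Goodbye World!\n";
$second = $func();

echo $first;   // Hello World!
echo $second;  // Goodbye World!
```

Capturing by value (a plain `use ($string)`) would instead freeze the value at the point where the closure is created.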
>
> Some caveats / bugs / todo:
>
> * Calling anonymous functions by name directly is problematic if there
>   are lexical variables that need to be assigned. I added checks to
>   make sure this case does not happen.
>
> * In the opcode handler, error handling needs to be added.
>
> * If somebody removes the function from the global function table
>   (e.g. with runkit), the new opcode will return NULL instead of
>   a resource (error handling is missing). Since I do increment the
>   refcount of the zend_function, it SHOULD not cause segfaults or
>   memory leaks, but I haven't tested it.
>
> * $this is kind of a problem, because all the fetch handlers in PHP
>   make sure $this is a special kind of variable. For the first version
>   of the patch I chose not to care about this because what still works
>   is e.g. the following:
>
>   $object = $this;
>   $func = function () {
>       lexical $object;
>       // do something
>   };
>
>   Also, inside the closures, the class context is not preserved, so
>   accessing private / protected members is not possible.
>
>   I'm not sure this actually represents a problem because you can
>   always use normal local variables to pass values between closure
>   and calling method and make the calling method change the properties
>   itself.
>
> * I've had some problems with eval(). Have a look at the following
>   code:
>
>   $func = eval ('return function () { echo "Hello World!\n"; };');
>   $func();
>
>   With plain PHP, this seems to work; with the VLD extension loaded
>   (that shows the opcodes), it crashes. I don't know if that's a
>   problem with eval() or just with VLD and I didn't have time to
>   investigate it further.
>
> * Oh, yes, 'lexical' is now a keyword. Although I really don't think
>   that TOO many people use that as an identifier, so it probably won't
>   hurt THAT much.
>
> Except for the points above, it really works, even with complex stuff.
> Let me show you some examples:
>
> 1. Customized array_filter:
>
> function filter_larger ($array, $min = 42) {
>     $filter = function ($value) {
>         lexical $min;
>         return ($value >= $min);
>     };
>     return array_filter ($array, $filter);
> }
>
> $arr = array (41, 43);
> var_dump (filter_larger ($arr)); // 43
> var_dump (filter_larger ($arr, 40)); // 41, 43
> var_dump (filter_larger ($arr, 44)); // empty
>
> 2. Jeff's example:
>
> function getAdder($x) {
>     return function ($y) {
>         lexical $x;
>         return $x + $y;
>     };
> }
>
> $plusFive = getAdder(5);
> $plusTen = getAdder(10);
>
> echo $plusFive(4)."\n"; // 9
> echo $plusTen(7)."\n"; // 17
>
> 3. Nested closures
>
> $outer = function ($value) {
>     return function () {
>         lexical $value;
>         return $value * 2;
>     };
> };
>
> $duplicator = $outer (4);
> echo $duplicator ()."\n"; // 8
> $duplicator = $outer (8);
> echo $duplicator ()."\n"; // 16
>
> [Ok, yeah, that example is quite stupid and should NOT be used as an
> example for good code. ;-) But it's simple and demonstrates the
> possibilities.]
>
> It would be great if somebody could review the patch because I'm sure
> some parts can still be cleaned up or improved. And it would be even
> better if this feature would make it into PHP. ;-)
>
> Regards,
> Christian
>
> PS: I'd like to thank Derick Rethans for his GREAT Vulcan Logic
> Disassembler - without it, development would have been a LOT more
> painful.
>
> PPS: Oh, yeah, if it should be legally necessary, I grant the right to
> anybody to use this patch under any OSI certified license you may want
> to choose.

Best regards,
 Marcus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php