Hi Juliette, > On Aug 16, 2024, at 9:25 PM, Juliette Reinders Folmer > <php-internals_nos...@adviesenzo.nl> wrote: > > On 17-8-2024 2:00, Mike Schinkel wrote: >> To everyone on this thread and thus nobody in specific (hence my top post): >> >> Given that get_declared_*() populates a new ZEND_HASH each time the function >> is called[1] — which for classes can easily be hundreds of strings — would >> there be any appetite to: >> >> 1. Add an optional parameter — e.g. `get_declared_classes($startingFrom)` — >> that can limit the number of symbols starting with "n" where if n==100 then >> it would return array_slice($symbols,100), and >> >> 2. Add a function — `get_declared_class_count()` — that returns the current >> number of declared symbols of whichever kind of symbol, so we can get the >> "n" before calling `include`/`require` to use with get_declared_*() after >> calling `include`/`require`? >> >> >> This would cut down on creating hashes with 100+ strings that get >> immediately thrown away, and also cut down on memory fragmentation/garbage >> collector churn. >> >> -Mike >> >> [1] >> https://github.com/php/php-src/blob/18d41502da0da1bb3928e60c41f1b821974c2c01/Zend/zend_builtin_functions.c#L1371 > > Mike, > > While your suggestion does not directly relate to what triggered my mail to > the list, I very very much like the idea.
Excellent! > It's as if you are reading my mind or rather, reading one of the autoloaders > [1] I need to maintain, in which your above outlined proposal should > definitely make for more efficiency. Thank you for sharing that. Yes, your use-case is very similar to mine. > Having said that, the same parameter + extra function would then also be > needed for `get_declared_interfaces()`, `get_declared_traits()` - and, if it > would be added, `get_declared_enums()`. That was my assumption and intention, but I did not say so explicitly. I should have. Thanks to Ayesh's excellent PR, I was able to recognize that all of those functions in-fact are implemented via one internal function `get_declared_class_impl()` so it only made sense — to me, at least — that modifying it would enable the same functionality for all four (4) userland functions. > There is one problem I can see with this approach though: since PHP 7.4, the > return value of `get_declared_classes()` (and friends) does not guarantee any > particular order anymore [2]. > So, an `array_slice($symbols, $n)` may not get you the _latest_ classes > loaded, so I think this would only work if the order of classes is guaranteed > in some way. That is a super relevant insight and one I was actually not aware of. Do you happen to know if it was triggered by an explicit RFC to that effect, or if it resulted because of a side-effect of some other implementation change? From my perspective, that is definitely a BC break and one that I am surprised I never noticed before? Do you also know if it is a case of "not guaranteed, but in-fact actual fact it always works that way, or not?" I have not used my lib that needs that functionality to work in a few years, so maybe it is broken now and those who are now maintaining those apps have had to deal with it? Given that you see value in the same use-case I had which was an autoloader, maybe it would be better to achieve the functionality we need in a different, more efficient way? What if instead PHP were to implement an optional 2nd callback parameter to `include()` / `require()` / `include_once()` / `require_once()` to allow us to capture the symbols loaded and their paths? The callback function could return `void` and accept an array of `$symbols` with the following guaranteed minimum structure? $symbols = array( 'classes' => [], 'interfaces' => [], 'traits' => [], 'enums' => [], ); With that you could reimplement all that PHP code you linked — at least for versions of PHP supporting the callback — to look like this: function loadFile($path) { if (strpos(__DIR__, 'phar://') !== 0) { $path = realpath($path); if ($path === false) { return false; } } if (isset(self::$loadedClasses[$path]) === true) { return self::$loadedClasses[$path]; } include($path, function($symbols) use ($path) { foreach ($symbols['classes'] as $className) { self::$loadedClasses[$path] = $className; self::$loadedFiles[$className] = $path; } }); return self::$loadedClasses[$path]; }//end loadFile() This approach would be a less disruptive than my prior suggestion which would have added four (4) new `*_count()` functions and new optional parameters for four (4) existing function vs. this proposal which just adds new optional parameters for four (4) other existing functions. Further, and the signature(s) required for parameters would be much easier to settle on. I checked the code[2] and if I understand it correctly would (only?) require adding a Zend Observer[3] callback to be notified in `zend_bind_class_in_slot()` although I still cannot figure out where that observer would need to be added. Going this approach would also make the userland code in PHPCodeSniffer a lot more efficient than the code you currently must run for every included file simply to determine what symbols were loaded. What do you think if this alternate approach? -Mike [1] https://github.com/php/php-src/blob/18d41502da0da1bb3928e60c41f1b821974c2c01/Zend/zend_builtin_functions.c#L1371 [2] https://github.com/php/php-src/blob/php-8.3.10/Zend/zend_compile.c#L1231-L1269 [3] https://www.linkedin.com/advice/3/what-some-best-practices-common-pitfalls-using-zend