On Thu, 05 Nov 2015 15:14:47, Leigh <lei...@gmail.com> wrote:
On 5 November 2015 at 14:59, Rowan Collins <rowan.coll...@gmail.com>
wrote:
PHP uses null bytes quite a lot to produce deliberately illegal
identifiers. For instance the old eval-like create_function() [e.g.
https://3v4l.org/hqHjh] and the serialization of private members [e.g.
https://3v4l.org/R6Y6k]
In this case, I guess the "@" in "class@anonymous" makes the name
illegal
anyway, but I'm not sold on the null byte being more unacceptable here
than anywhere else.
Regards,
--
Rowan Collins
[IMSoP]
That doesn't mean it's a good approach (*cough* namespaces *cough*), and
these bits of "magic" are supposed to be hidden away from users. I'm
guessing in this particular instance, the point of the null is to make
string operations cut off after "anonymous", however string operations
that respect the zval string length aren't going to do this.
e.g. var_dump() the class name is put through sprintf and it cuts off at
the null, but get_class or ReflectionClass::getName() just returns the
original string, and exposes the implementation details.
Internal names for anonymous classes need to be unique.
The current method of ensuring uniqueness ('\0' + filename + lexer pos) was
written by Dmitry[1], following from Joe's original implementation[2]. As I
understand it, this was supposed to be entirely hidden from userland; hence
the null byte.
So, I prepared a patch for `get_class()` and `ReflectionClass::getName()`,
which in my view should behave the same way as `var_dump()` etc., but I've
now realised that ignoring the unique suffix from the class name will break
functionality that is otherwise desirable:
* Aliasing an anonymous class... (see bug70106[3])
$instance = new class {};
$class_name = get_class($instance);
class_alias($class_name, 'MyAlias');
* Creating a new anonymous class instance dynamically...
$instance = new class {};
$class_name = get_class($instance);
$new_instance = new $class_name;
...although the `get_class()` used here isn't necessary, and you could
write `= new $instance;` without ever knowing $instance's class.
Our options seem to be:
(A) Leave everything as it is. This puts us in a situation where anonymous
class names are treated inconsistently in userland; a situation that I
think could lead to considerable confusion about the "real" names of
these classes and how they should be used.
(B) Patch `get_class()` and `ReflectionClass::getName()` (any others?) to
strip the suffix (the '\0' and everything after it) from anonymous
class
names prior to display. This prevents userland code from distinguishing
between multiple anonymous classes; breaking functionality as described
above. To mitigate this, we could allow `class_alias()` to accept
either
a class name OR an instance in the first argument. The `new` operator
already supports an instance in place of a class name.
(C) Assuming that userland SHOULD have access to the internal names of
anonymous classes, remove the null byte so that the full names are
displayed everywhere. The drawback of this option is that class names
won't be as predictable. It occurs to me that a different naming scheme
could help here - wouldn't a monotonically increasing integer suffix be
enough? The current scheme seems like an awful waste of memory, too.
Option (B) seems like the best to me, because I can't really see a need to
know the unique suffix -- provided any use-cases are catered for by an
alternative (modifying `class_alias()` should be enough).
Thoughts?
Steve
[1] http://git.php.net/?p=php-src.git;a=commitdiff;h=2a1b0b1#patch1
[2] http://git.php.net/?p=php-src.git;a=commitdiff;h=49608e0#patch12
[3] https://bugs.php.net/bug.php?id=70106
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php