On Thu, Apr 26, 2012 at 3:45 AM, C.Koy <can5...@gmail.com> wrote:
> As of 5.3.0 this bug does not exist for function names. Only classes and
> interfaces.

Turns out, if you cause a function to be called dynamically (e.g. by using a variable function), the bug will surface.

    <?php
    setlocale(LC_CTYPE, 'tr_TR');
    function IJK() {}
    # succeeds
    IJK();
    $f = 'IJK';
    # causes Fatal error: Call to undefined function IJK()
    $f();

In contrast, if you set the locale for LC_CTYPE on the command line, the bug doesn't arise at all, because the compilation and execution phases both use the same locale.

> Could this be a clue for how to fix it for those as well?

Function names are generally resolved at compile time, before the call to setlocale in the script has been executed. Dynamic function names are resolved at run time, which is why the bug surfaces for them. Class name resolution is put off until execution time for autoloading and possibly other purposes.

Converting class names to lowercase at compile time may work. A quick glance at the source shows that class_name, fully_qualified_class_name and class_name_reference all depend on namespace_name, which is the rule responsible for parsing the class name.

    namespace_name:
            T_STRING { $$ = $1; }
        |   namespace_name T_NS_SEPARATOR T_STRING { zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
    ;

However, static_scalar also depends on namespace_name, and I don't believe that symbol should be made case-insensitive. Creating an additional symbol for case-insensitive names would allow a more targeted approach. The various class symbols would then rely on this new symbol, rather than on namespace_name.

    lc_namespace_name:
            T_STRING { zend_str_tolower($1); $$ = $1; }
        |   lc_namespace_name T_NS_SEPARATOR T_STRING { zend_str_tolower($3); zend_do_build_namespace_name(&$$, &$1, &$3 TSRMLS_CC); }
    ;

Converting class names to lowercase early may have additional consequences. It may affect class names in error messages, for example (I didn't dig deep enough to determine this). __CLASS__ should be unaffected (when defining a class, the class name is parsed as a T_STRING; the value for __CLASS__ comes from this symbol). It also won't resolve the bug for dynamic names.

I suspect that altering variable_class_name and dynamic_class_name_reference in the manner described previously (use a custom lowercase conversion or temporarily switch locale) to convert the name would resolve the bug in the dynamic case for class names. Changing a number of the production rules for function_call in a similar manner should resolve the bug for dynamic function calls. Again, there will likely be unintended consequences. Alternatively, updating zend_do_begin_dynamic_function_call() and zend_do_fetch_class() to use a custom conversion should resolve the bug in the dynamic case.

I like the idea of using the system default locale for name conversion (making name resolution independent of the current locale), but am concerned that it will make name lookup slow. Instead, a second set of locale-independent, Unicode-aware conversion functions (basically, iliaa's original solution, but Unicode compatible) to be used for identifiers would make name resolution independent of the current locale. Any time an identifier needs to be converted, it would use one of these functions. As a run-time optimization, non-dynamic class names could use the system locale conversion, but that would be a separate thing from resolving this bug.
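
To make the last idea a bit more concrete, here is a minimal sketch in C of what a locale-independent identifier conversion might look like. It is only an illustration under stated assumptions: the names ident_tolower_char, ident_tolower and ident_equals are hypothetical and not part of the Zend API, and the mapping is ASCII-only rather than the Unicode-aware version proposed above. Because it never calls tolower(), the result cannot change when the script calls setlocale().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Zend function): lowercase a single byte with a
 * fixed ASCII mapping. The current LC_CTYPE locale is never consulted. */
static unsigned char ident_tolower_char(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Lowercase an identifier in place. Bytes outside A-Z are left untouched,
 * so multibyte UTF-8 sequences pass through unchanged (an ASCII-only
 * simplification, not real Unicode case folding). */
static void ident_tolower(char *s, size_t len)
{
    size_t i;

    assert(s != NULL);
    for (i = 0; i < len; i++) {
        s[i] = (char)ident_tolower_char((unsigned char)s[i]);
    }
}

/* Case-insensitive identifier comparison built on the same fixed mapping.
 * Returns 1 if the names match, 0 otherwise. */
static int ident_equals(const char *a, const char *b)
{
    size_t i, len;

    assert(a != NULL && b != NULL);
    len = strlen(a);
    if (len != strlen(b)) {
        return 0;
    }
    for (i = 0; i < len; i++) {
        if (ident_tolower_char((unsigned char)a[i]) !=
            ident_tolower_char((unsigned char)b[i])) {
            return 0;
        }
    }
    return 1;
}
```

With a helper like this used both when a name is registered and when it is looked up, 'IJK' would map to the same key at compile time and at run time, whatever locale the script has selected in between. A Unicode-aware variant would need a case-folding table in place of the A-Z range check.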