2012/4/22 C.Koy <can5...@gmail.com>

> On 4/21/2012 4:37 AM, Galen Wright-Watson wrote:
>
>> What about instead creating a special-purpose Zend function to normalize
>> class names (zend_normalize_class_name, or zend_classname_tolower)? This
>> function would examine the current locale and, if it's a problematic one,
>> convert the string to lower case on its own (calling zend_tolower on
>> non-problematic characters). Alternatively, zend_normalize_class_name could
>> switch LC_CTYPE to an appropriate locale (e.g. "UTF-8"; the locale could be
>> determined at compile time), call zend_str_tolower_copy, then switch back
>> before returning. Then, any appropriate function (e.g.
>> zend_resolve_class_name, zend_lookup_class_ex, class_exists, class_alias)
>> would call zend_normalize_class_name instead of zend_str_tolower_copy/
>> zend_str_tolower_dup.
>
> In plain words/pseudo-code, adding an "if statement" at a certain step
> should suffice, like:
>
> 1. lowercase the name;
> 2. if the effective locale is tr_XY, then replace every "ı" with "i";
> 3. look up the name;
>
> For those who have nothing to do with Turkish locales, that should incur
> the overhead of an "if" condition only.
>

The fix would need to be applied to at least four functions, so adding a new function would be more maintainable. Also, there are locales that don't begin with "tr_" or have "TR" in the locale name, so the condition would need to be more complex.
Converting "I" or "ı" separately from lowercase conversion is less performant than either option I describe, as it requires an extra loop, which is why I didn't bother suggesting it. I suspect switching the locale is most performant, as it doesn't require additional tests, though I haven't examined the cost of setting the locale.

> But, I did not start this thread to discuss such bug fix, because:
>
> 1. It does not take a genius to figure it out, and should take minutes to
> implement for someone experienced in the internals. Given the 10 year span
> and dozens of comments/complaints on the bug's entry, it's hard to say this
> issue went unnoticed. So I had to conclude that such fix has quietly been
> overruled for performance and/or other undisclosed reasons.
>

Why does it matter if a solution is simple? If anything, that a fix "does not take a genius" is an argument in its favor, if it also solves the problem. If it's already been rejected privately, it's time to bring the reasons into the open (which is why I asked). If not, it should be considered publicly.

> 2. Absent bug #18556, case-sensitive PHP has merits as I stated in other
> post and several people voiced opinions in favor. Case-sensitive PHP is
> worth considering.
>

It is, but it's also a major BC break, hence perhaps better suited for PHP6. Case-sensitivity is also a much bigger issue than this bug. A custom conversion function, on the other hand, produces the minimum impact of any option I've read. As such, it's hopefully a solution for this bug that everyone can agree on.

>> Does this bug pop-up for locales other than Turkish, Azerbaijani and
>> Kurdish?
>
> Theoretically, this problem occurs for any locales sharing a letter
> lowercase of which is different from each other's, and the PHP script
> changes its locale among these locales throughout its execution.
>

The abstract property that makes a locale problematic is obvious.
I was looking for specific locales, as they need to be identified for a complete solution.
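
To make the two options I described earlier concrete, here is a rough, standalone C sketch. It is not Zend code: the names classname_tolower_ascii and classname_tolower_c_locale are hypothetical, and the second function switches to the "C" locale (my assumption, chosen because it is guaranteed to exist) rather than a locale determined at compile time. The first function lowercases only ASCII A-Z and never consults LC_CTYPE; the second saves LC_CTYPE, lowercases with tolower, and restores it.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the Zend API).
 * Option 1: lowercase only ASCII 'A'..'Z' by hand, so the result does not
 * depend on LC_CTYPE and 'I' always becomes 'i'.
 * dst must have room for len + 1 bytes. */
void classname_tolower_ascii(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        dst[i] = (c >= 'A' && c <= 'Z') ? (char)(c + ('a' - 'A')) : (char)c;
    }
    dst[len] = '\0';
}

/* Hypothetical helper (not part of the Zend API).
 * Option 2: temporarily switch LC_CTYPE to "C", lowercase with tolower(),
 * then restore the previous locale. Returns 0 on success, -1 on failure.
 * Note: setlocale() affects the whole process and is not thread-safe. */
int classname_tolower_c_locale(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    /* Query the current LC_CTYPE and copy it, because the returned string
     * may be overwritten by the next setlocale() call. */
    const char *current = setlocale(LC_CTYPE, NULL);
    if (current == NULL) {
        return -1;
    }
    size_t n = strlen(current) + 1;
    char *saved = malloc(n);
    if (saved == NULL) {
        return -1;
    }
    memcpy(saved, current, n);

    setlocale(LC_CTYPE, "C");
    for (size_t i = 0; i < len; i++) {
        dst[i] = (char)tolower((unsigned char)src[i]);
    }
    dst[len] = '\0';

    /* Restore whatever locale the script had selected. */
    setlocale(LC_CTYPE, saved);
    free(saved);
    return 0;
}
```

Either function would be called wherever a class name is currently lowercased before lookup. The first needs no knowledge of which locales are problematic. The second pays for two setlocale calls per lookup, and I haven't measured that cost.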