Re: [PHP-DEV] Complete case-sensitivity in PHP

Galen Wright-Watson Sun, 22 Apr 2012 13:34:14 -0700

2012/4/22 C.Koy <[email protected]>

> On 4/21/2012 4:37 AM, Galen Wright-Watson wrote:
>
>> What about instead creating a special-purpose Zend function to normalize
>> class names (zend_normalize_class_name, or zend_classname_tolower)? This
>> function would examine the current locale and, if it's a problematic one,
>> convert the string to lower case on its own (calling zend_tolower on
>> non-problematic characters). Alternatively, zend_normalize_class_name
>> could switch LC_CTYPE to an appropriate locale (e.g. "UTF-8"; the locale
>> could be determined at compile time), call zend_str_tolower_copy, then
>> switch back before returning. Then, any appropriate function (e.g.
>> zend_resolve_class_name, zend_lookup_class_ex, class_exists, class_alias)
>> would call zend_normalize_class_name instead of zend_str_tolower_copy/
>> zend_str_tolower_dup.
>>
>
> In plain words/pseudo-code, adding an "if statement" at a certain step
> should suffice, like:
>
> 1. lowercase the name;
> 2. if the effective locale is tr_XY, then replace every "ı" with "i";
> 3. look up the name;
>
> For those who have nothing to do with Turkish locales, that should incur
> the overhead of an "if" condition only.
>

The fix would need to be applied to at least four functions, so adding a
new function would be more maintainable. Also, there are affected locales
whose names neither begin with "tr_" nor contain "TR", so the condition
would need to be more complex.
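A minimal sketch of what such a shared helper might look like, written as
standalone C rather than against the real Zend headers. The name
normalize_class_name is hypothetical, and plain tolower() stands in for
zend_tolower. It handles the one problematic character itself and delegates
the rest, so no locale-name test is needed at all:

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not an existing Zend function).
 * Returns a newly allocated, lowercased copy of name, or NULL on
 * allocation failure. The caller frees the result. */
char *normalize_class_name(const char *name, size_t len)
{
    char *out = malloc(len + 1);
    size_t i;

    if (out == NULL) {
        return NULL;
    }
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];

        if (c == 'I') {
            /* Handle the problematic letter ourselves: a Turkish-style
             * single-byte locale may map 'I' to dotless i instead. */
            out[i] = 'i';
        } else {
            /* Everything else goes through the normal, locale-aware call. */
            out[i] = (char)tolower(c);
        }
    }
    out[len] = '\0';
    return out;
}
```

Each of the lookup functions would then call this one helper, so the special
case lives in a single place.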


Converting "I" or "ı" separately from lowercase conversion is less
performant than either option I describe, as it requires an extra loop,
which is why I didn't bother suggesting it. I suspect switching the locale
is most performant, as it doesn't require additional tests, though I
haven't examined the cost of setting the locale.
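
For comparison, a rough sketch of the locale-switching variant, again as
standalone C. The name lower_in_c_locale and the choice of the "C" locale as
the safe target are assumptions for illustration; the real code would call
zend_str_tolower_copy where the loop is. Note that setlocale() changes
process-wide state, which may matter for threaded builds:

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: lowercase name while LC_CTYPE is temporarily "C".
 * Returns a newly allocated copy, or NULL on allocation failure. */
char *lower_in_c_locale(const char *name, size_t len)
{
    const char *current = setlocale(LC_CTYPE, NULL); /* query only */
    char *saved = NULL;
    char *out;
    size_t i;

    /* Copy the locale name: later setlocale() calls may overwrite it. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved == NULL) {
            return NULL;
        }
        strcpy(saved, current);
    }

    out = malloc(len + 1);
    if (out == NULL) {
        free(saved);
        return NULL;
    }

    setlocale(LC_CTYPE, "C");               /* switch to the safe locale */
    for (i = 0; i < len; i++) {
        out[i] = (char)tolower((unsigned char)name[i]);
    }
    out[len] = '\0';

    if (saved != NULL) {
        setlocale(LC_CTYPE, saved);         /* switch back before returning */
        free(saved);
    }
    return out;
}
```

The loop itself has no extra tests, but every call pays for two setlocale()
calls plus a copy of the locale name, which is the cost that would need
measuring.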


> But, I did not start this thread to discuss such bug fix, because:
>
> 1. It does not take a genius to figure it out, and should take minutes to
> implement for someone experienced in the internals. Given the 10 year span
> and dozens of comments/complaints on the bug's entry, it's hard to say this
> issue went unnoticed. So I had to conclude that such fix has quietly been
> overruled for performance and/or other undisclosed reasons.
>

Why does it matter if a solution is simple? If anything, that a fix "does
not take a genius" is an argument in its favor, if it also solves the
problem.

If it's already been rejected privately, it's time to bring the reasons
into the open (which is why I asked). If not, it should be considered
publicly.


> 2. Absent bug #18556, case-sensitive PHP has merits as I stated in other
> post and several people voiced opinions in favor. Case-sensitive PHP is
> worth considering.
>

It is, but it's also a major BC break, hence perhaps better suited for
PHP6. Case-sensitivity is also a much bigger issue than this bug. A custom
conversion function, on the other hand, has the least impact of any
option I've read. As such, it's hopefully a solution for this bug that
everyone can agree on.


>> Does this bug pop up for locales other than Turkish, Azerbaijani and
>> Kurdish?
>>
>
> Theoretically, this problem occurs for any locales sharing a letter
> lowercase of which is different from each other's, and the PHP script
> changes its locale among these locales throughout its execution.
>

The abstract property that makes a locale problematic is obvious. I was
looking for specific locales, as they need to be identified for a complete
solution.
