On 3-12-2010 21:51, Guy Fink wrote:
In some languages, some Unicode code points have a different
uppercase/lowercase pair. For example, "i" is uppercased to "I" in
English (and most other) locales, while in Turkish it is uppercased
to "I" with a dot above, U+0130 (I cannot write it here).
Take a look at "Why Applications Fail With The Turkish
Language" at
http://www.i18nguy.com/unicode/turkish-i18n.htm
There is no information about the language in a string, not even in a
Unicode string. So it is impossible to handle this case here.
IMO there is no need to have a language encoded in the string. Strings
won't get autoconverted to upper/lowercase. It's always a user call to
Upper/Lowercase(S).
The uppercase/lowercase tables have been generated purely from the
official Unicode character descriptions. A character having "SMALL"
in its description is mapped to the one having "CAPITAL" in that
place, and vice versa (only if the counterpart exists). You can't do
more at this level. Please feel free to implement the functionality
you mention; I'm sure it will be appreciated.
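
As a rough illustration of that name-based approach, here is a minimal
sketch in Python (not the actual generator used for the tables, and
not FPC code). The function name name_based_counterpart is
hypothetical; unicodedata.name and unicodedata.lookup are standard
library calls.

```python
import unicodedata

def name_based_counterpart(ch):
    """Return the character whose Unicode name equals the name of ch
    with SMALL and CAPITAL swapped, or None if there is no such one."""
    try:
        words = unicodedata.name(ch).split(" ")
    except ValueError:
        # The character has no name in the Unicode database.
        return None
    swap = {"SMALL": "CAPITAL", "CAPITAL": "SMALL"}
    if not any(word in swap for word in words):
        return None
    swapped_name = " ".join(swap.get(word, word) for word in words)
    try:
        return unicodedata.lookup(swapped_name)
    except KeyError:
        # The counterpart name does not exist, so there is no mapping.
        return None
```

With this rule "a" maps to "A", but U+0131 (LATIN SMALL LETTER DOTLESS
I) and U+0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE) get no
counterpart at all, because no character carries the swapped name.
That is exactly where the Turkish pairs fall through.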
To take the language into account when converting, functions like
Upper/Lowercase should have a 2nd optional parameter indicating for what
language the conversion should be done.
Then the default conversion can still take place, but based on the
specified language, the exceptions can be implemented (if there aren't
many exceptions, a simple case statement will do).
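
A minimal sketch of that idea, written in Python for illustration
only (it is not the FPC/Lazarus API). The names upper, lower, the
lang parameter and the exception tables are all hypothetical, and
only the Turkish dotted/dotless i pairs are filled in.

```python
# Hypothetical per-language exception tables, applied before the
# default (language-neutral) conversion.
_UPPER_EXCEPTIONS = {
    "tr": str.maketrans({"i": "\u0130"}),  # i -> I with dot above
}
_LOWER_EXCEPTIONS = {
    "tr": str.maketrans({"I": "\u0131", "\u0130": "i"}),  # I -> dotless i
}

def upper(s, lang=None):
    """Uppercase s; lang optionally selects language exceptions."""
    table = _UPPER_EXCEPTIONS.get(lang)
    if table is not None:
        s = s.translate(table)
    return s.upper()

def lower(s, lang=None):
    """Lowercase s; lang optionally selects language exceptions."""
    table = _LOWER_EXCEPTIONS.get(lang)
    if table is not None:
        s = s.translate(table)
    return s.lower()
```

Called without a language, upper("istanbul") gives "ISTANBUL" as
before; upper("istanbul", "tr") gives "\u0130STANBUL", and
lower("ISTANBUL", "tr") gives "\u0131stanbul". Existing callers are
unaffected because the second parameter is optional.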
Marc