JoshyFun wrote:
> Hello Jonas,
>
> Sunday, November 23, 2008, 6:14:11 PM, you wrote:
>
>>> //LowerCase arrays size: 2376 bytes
>>> const UnicodeLowerCaseArraySource: array [0..593] of WORD=(
>>> //UpperCase arrays size: 2408 bytes
>>> const UnicodeUpperCaseArraySource: array [0..601] of WORD=(
>>> //TitleCase arrays size: 2424 bytes
>>> const UnicodeTitleCaseArraySource: array [0..605] of WORD=(
>
> JM> How does this work, given that upper/lower case sometimes depends on
> JM> the language? (e.g., in Turkish the upper case version of "i" is "İ"
> JM> -- LATIN CAPITAL LETTER I WITH DOT ABOVE)
>
> Only the general case; language tailoring is a completely different beast.
> Basic Unicode functions are language-agnostic and will certainly
> produce some bad results in some circumstances. It's almost impossible
> to cover all tailoring even using the database about tailorings, not
> so much for upper/lower casing as for other operations like word breaking.
>
> Libraries that cover a lot of language particularities are around
> 30-40 (or more) megabytes in runtime data, and I think this kind of
> dependency is a no-no for FPC.
>
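To make the limitation discussed above concrete, here is a minimal sketch of a language-agnostic, table-driven upper-case lookup. The array names, the three-entry excerpt and the function SimpleUpper are hypothetical and chosen only for illustration; they are not taken from the proposed unit, whose tables hold roughly 600 entries each.

```pascal
program SimpleCaseMap;
{$mode objfpc}{$H+}

const
  // Hypothetical excerpt, sorted by source code unit:
  // i (U+0069), e with acute (U+00E9), dotless i (U+0131)
  UpperSource: array [0..2] of Word = ($0069, $00E9, $0131);
  // Their simple upper-case mappings: I, E with acute, I
  UpperTarget: array [0..2] of Word = ($0049, $00C9, $0049);

// Binary search over the sorted source array. Code units without an
// entry are returned unchanged.
function SimpleUpper(C: Word): Word;
var
  LowIdx, HighIdx, MidIdx: Integer;
begin
  Result := C;
  LowIdx := Low(UpperSource);
  HighIdx := High(UpperSource);
  while LowIdx <= HighIdx do
  begin
    MidIdx := (LowIdx + HighIdx) div 2;
    if UpperSource[MidIdx] = C then
      Exit(UpperTarget[MidIdx])
    else if UpperSource[MidIdx] < C then
      LowIdx := MidIdx + 1
    else
      HighIdx := MidIdx - 1;
  end;
end;

begin
  // The table maps "i" to U+0049 regardless of language.
  // Turkish tailoring would need U+0130 here, which a single
  // language-agnostic table cannot express.
  WriteLn(HexStr(LongInt(SimpleUpper($0069)), 4));
end.
```

A table like this has exactly one answer per code unit, which is why it stays small and why it cannot handle the Turkish case.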
I think we should simply depend on the OS in this case, as the cwstring unit does, although Linux doesn't make life easy here: it always requires a conversion to UCS-4 to upper- or lower-case a string.

_______________________________________________
fpc-devel maillist
http://lists.freepascal.org/mailman/listinfo/fpc-devel
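As a rough sketch of the UCS-4 round trip mentioned above (not the actual cwstring code): the string is widened to UCS-4, each code point is passed to the C library's towupper, and the result is converted back. It assumes Linux with glibc, where wint_t is a 32-bit unsigned integer; the function name UpperViaLibc is hypothetical. No locale is set up here, so results for non-ASCII input depend on the active C locale.

```pascal
program UpperViaUCS4;
{$mode objfpc}{$H+}

// Assumption: glibc on Linux, wint_t is a 32-bit unsigned integer.
function towupper(wc: LongWord): LongWord; cdecl; external 'c' name 'towupper';

// Hypothetical helper: UTF-16 -> UCS-4 -> towupper -> UTF-16.
function UpperViaLibc(const S: WideString): WideString;
var
  U: UCS4String;
  I: Integer;
begin
  // UCS4String is zero-terminated, so the last element is skipped.
  U := WideStringToUCS4String(S);
  for I := 0 to Length(U) - 2 do
    U[I] := UCS4Char(towupper(LongWord(U[I])));
  Result := UCS4StringToWideString(U);
end;

var
  R: WideString;
  I: Integer;
begin
  R := UpperViaLibc('abc');
  // Print the resulting UTF-16 code units in hex.
  for I := 1 to Length(R) do
    Write(HexStr(LongInt(Ord(R[I])), 4), ' ');
  WriteLn;
end.
```

The two conversions on every call are the extra cost referred to above; on platforms whose native wide character is already 16 bits, the OS routine can take the string directly.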
