vmusulainen wrote:
Hi!

From the comment of the #match: method:

match: text
        "Answer whether text matches the pattern in this string.
        Matching ignores upper/lower case differences.

Now check it:
1. 'V' match: 'v'  -> true "OK, that's fine"
2. 'Ш' match: 'ш' -> false "Oops, it fails for non-English (Cyrillic) letters"

-regards
Vladimir Musulainen






If you debug ('Ш' match: 'ш') and trace through to WideString>>findSubstring:in:startingAt:matchTable:
you will find that (c1 asciiValue) --> 1096 and (c2 asciiValue) --> 1064,
but (matchTable size) --> 256.
Since the characters being compared are not in the matchTable, the comparison value defaults to (c1 asciiValue + 1), so no case folding is applied and the two values never compare equal.
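
The fallback can be reproduced in a Workspace with a stand-in table. This is only a sketch: `table` and `lookup` are made-up names, the table is a plain Array rather than the real CaseInsensitiveOrder, and the exact bounds test in WideString may differ from the one assumed here.

```smalltalk
"Sketch only: 'table' and 'lookup' are hypothetical stand-ins, not Pharo API."
| table lookup |
table := (0 to: 255) asArray.   "256 entries, like the real match table"
($a to: $z) do: [:c |
    "fold lowercase onto uppercase, as String class>>initialize does"
    table at: c asciiValue + 1 put: (table at: c asUppercase asciiValue + 1)].
lookup := [:ch |
    ch asciiValue < table size
        ifTrue: [table at: ch asciiValue + 1]
        ifFalse: [ch asciiValue + 1]].   "assumed fallback: no folding at all"
(lookup value: $v) = (lookup value: $V).   "true"
(lookup value: (Character value: 1096)) = (lookup value: (Character value: 1064)).   "false"
```

The last expression compares 1097 with 1065, which is why the Cyrillic pair is reported as different.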

So for proof of concept change this...
String class>>initialize
   CaseInsensitiveOrder := (Array new: 2000) fillFrom: AsciiOrder with: #value. "<--MODIFIED"
   ($a to: $z) do:
       [:c | CaseInsensitiveOrder at: c asciiValue + 1
           put: (CaseInsensitiveOrder at: c asUppercase asciiValue + 1)].
   CaseInsensitiveOrder at: 1096 + 1 put: 1096.    "<--ADDED"
   CaseInsensitiveOrder at: 1064 + 1 put: 1096.    "<--ADDED"

then in Workspace evaluate "String initialize"
and now ('Ш' match: 'ш') --> true.

I'm not sure what the best way to handle this long term is.

btw, you may be tempted to use a character literal such as ($ш asciiValue) in place of 1096 in the code, but maybe (I'm not sure) there is a problem saving an image containing Unicode characters.

Maybe String's class variables CaseInsensitiveOrder & CaseSensitiveOrder would be better handled as individual classes, to provide flexibility for other sort orders like CaseInsensitiveGermanPhonebook [1]. Probably String>>findString:startingAt:caseSensitive: should also double-dispatch
and be overridden by WideString.
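
As a rough illustration of a more flexible table, here is a Workspace sketch that uses a sparse Dictionary instead of growing the Array to 2000 entries. The names `folds` and `fold` are made up for this example, and only the single Ш/ш pair is mapped; a real table would need entries derived from the Unicode case folding data [2].

```smalltalk
"Sketch only: 'folds' and 'fold' are hypothetical names, not existing Pharo API."
| folds fold |
folds := Dictionary new.
folds at: 1096 put: 1064.   "map ш onto Ш"
fold := [:ch |
    "unmapped characters fold to themselves"
    folds at: ch asciiValue ifAbsent: [ch asciiValue]].
(fold value: (Character value: 1096)) = (fold value: (Character value: 1064)).   "true"
```

A Dictionary lookup is slower than indexing an Array, so this may only be worth it on the WideString path.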

cheers -ben

[1] http://userguide.icu-project.org/collation
[2] http://www.w3.org/International/wiki/Case_folding
[3] http://cldr.unicode.org/index/cldr-spec/collation-guidelines



