On Mon, Nov 12, 2018 at 3:50 PM Monte Goulding via use-livecode <use-livecode@lists.runrev.com> wrote:

> Text strings in LiveCode are native encoded (MacRoman or ISO 8859) where
> possible and where you don't explicitly tell the engine
>
> For what it's worth using `offset` is the wrong thing to do if you have
> textEncoded your strings into binary data. You want to use `byteOffset`
> otherwise the engine will convert your data to a string and assume native
> encoding. This is probably why you are getting some case insensitivity.
>

Unless I'm misunderstanding, this hasn't been my observation. Using offset on a string that has been textEncode()ed to UTF-32 returns values that are 4 * (the character offset - 1) + 1. If the data were re-encoded, wouldn't it return the actual character offsets (except when it fails)?

Also, 𐀁 (U+10001) encodes to 00010001, and routines that convert to UTF-32 and then use offset will find five instances of that character in the UTF-32 encoding of a three-character string because of improper boundaries. To see this, run this code:

    on mouseUp
       put textencode("𐀁","UTF-32") into X
       put textencode("𐀁𐀁𐀁","UTF-32") into Y
       -- skip 1 character of Y, then search for X
       put offset(X,Y,1)
    end mouseUp

That will return 2, meaning that it found the encoding for X starting at character 2 + 1 = 3 of Y. In other words, it found X using the last half of the first "𐀁" and the first half of the second "𐀁".
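
For anyone who wants to check the boundary arithmetic outside LiveCode, here is a rough Python sketch. It is not how the engine works internally. It assumes big-endian UTF-32 without a BOM, so that the character in question (U+10001) encodes to the bytes 00 01 00 01 mentioned above, and `find_all` is a hypothetical helper written only for this illustration.

```python
def find_all(needle: bytes, haystack: bytes) -> list:
    """Return every 0-based byte position where needle occurs, overlaps included."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        # Advance by one byte (not by len(needle)) so overlapping hits are kept.
        start = haystack.find(needle, start + 1)
    return positions


CHAR = "\U00010001"  # the character whose UTF-32 form is 00 01 00 01

# Assumption: big-endian, no BOM, to match the byte pattern quoted in the post.
x = CHAR.encode("utf-32-be")
y = (CHAR * 3).encode("utf-32-be")

all_hits = find_all(x, y)                            # every byte-level match
aligned_hits = [p for p in all_hits if p % 4 == 0]   # matches on 4-byte boundaries

print(x.hex())        # 00010001
print(all_hits)       # [0, 2, 4, 6, 8]
print(aligned_hits)   # [0, 4, 8]
```

Only the three aligned hits are real characters. The hits at byte positions 2 and 6 straddle two adjacent characters, which is the same false match the mouseUp handler reports. A search over encoded data would need to discard hits that are not on a 4-byte boundary.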