Thanks for this useful synthesis, Slava.

Best regards,
Pierre

On 11 June 2011 at 21:12, Slava Paperno wrote:

> The "set useUnicode to true" command is necessary only if you use the
> charToNum() or numToChar() functions. Otherwise it is not needed.
>
> The text in your fields is in UTF-16, and you should access it as the
> unicodeText of field "MyField".
>
> Word chunks can be correctly retrieved if you ask for the unicodeText of
> the chunk, rather than for a chunk of the unicodeText:
>
> unicodeText of word 2 of field "MyField"
>
> There is a tutorial by Devin Asay on the use of Unicode in LiveCode at
> http://www.runrev.com/developers/lessons-and-tutorials/tutorials/unicode-in-revolution/
> It has examples of retrieving a specific word chunk.
>
> If you start processing Russian text in your variables, you will often find
> it better to convert it to UTF-8 first: put uniDecode(unicodeText of field
> "MyField", "UTF8") into MyUTF8String. To put the result back into a field,
> convert it back to UTF-16: set unicodeText of field "MyField" to
> uniEncode(MyUTF8String, "UTF8")
>
> A sure-fire way to do any sort of string comparison is to convert everything
> to decimal code points and then work with the numbers. Some parts of LC are
> not capable of handling Unicode strings, and in those situations using the
> numbers solves the problem.
>
> If you are reading your UTF-8 text from Unicode text files (e.g. saved from
> Notepad with the UTF-8 encoding option), you may have to take into account
> the first three bytes that you read in: they are the Byte Order Mark (BOM).
> You'll want to delete them from your strings before trying to access a
> specific byte in the string.
>
> If you still get into trouble, feel free to ask me offlist (s...@cornell.edu)
> for a sample application that shows these operations. I'm still working on
> it, but when I'm done, I'll make it available online.
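
As a rough illustration of the UTF-8 route described above, the filter-as-you-type test and the BOM removal could be sketched as follows. This is only a sketch, assuming the LiveCode versions current in this thread (before 7.0), where each char of a UTF-8 string is a single byte. The handler names matchingWords and stripUTF8BOM are hypothetical; they are not part of LiveCode and are not taken from Slava's sample application.

```livecode
-- Hypothetical helper: returns a comma-separated list of the numbers of
-- the words in field pFieldName that begin with the text of field pFilterField.
function matchingWords pFilterField, pFieldName
   local tPrefix, tText, tWordNum, tMatches
   -- Compare bytes exactly; the default case-insensitive comparison
   -- could treat different UTF-8 bytes as equal.
   set the caseSensitive to true
   -- Convert the UTF-16 field contents to UTF-8 before chunking.
   put uniDecode(the unicodeText of field pFilterField, "UTF8") into tPrefix
   put uniDecode(the unicodeText of field pFieldName, "UTF8") into tText
   put 0 into tWordNum
   -- Spaces, tabs and returns never occur inside a multi-byte UTF-8
   -- sequence, so word chunks of the UTF-8 string stay intact.
   repeat for each word tWord in tText
      add 1 to tWordNum
      if tWord begins with tPrefix then
         put tWordNum & comma after tMatches
      end if
   end repeat
   delete the last char of tMatches
   return tMatches
end matchingWords

-- Hypothetical helper: removes a leading UTF-8 Byte Order Mark
-- (bytes 239, 187, 191) from data read from a file.
function stripUTF8BOM pData
   local tBOM
   set the caseSensitive to true
   put numToChar(239) & numToChar(187) & numToChar(191) into tBOM
   if char 1 to 3 of pData is tBOM then
      delete char 1 to 3 of pData
   end if
   return pData
end stripUTF8BOM
```

For the fields in Lars's example, matchingWords("t1", "t2") is meant to report word 2 and matchingWords("t1", "t3") word 1. Any text shown to the user still has to go back through uniEncode and the unicodeText property.
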
>
> Best regards,
>
> Slava
>
>
>> -----Original Message-----
>> From: use-livecode-boun...@lists.runrev.com
>> [mailto:use-livecode-boun...@lists.runrev.com] On Behalf Of Richmond Mathewson
>> Sent: Saturday, June 11, 2011 2:24 PM
>> To: How to use LiveCode
>> Subject: Re: double byte chars?
>>
>> On 06/11/2011 09:14 PM, Lars Brehmer wrote:
>>> My project has Russian text fields (Arial,Russian). With one
>>> exception, everything works fine.
>>>
>>> Problem: a filter-as-you-type script.
>>>
>>> field "t1": зо
>>> field "t2": меня зовут Виктор --underscoring shows the matches--
>>> field "t3": зовут курить почему
>>>
>>> What I want to do is find a word in fields t2 and t3 that begins with the
>>> 2 letters in field t1. Word 2 in field t2 and word 1 in field t3 should
>>> be matches. But this only works if the matching word is the first word
>>> in the field!
>>>
>>> Some simple message box scripts:
>>
>> At the risk of insulting you, as you are using Unicode I have a funny
>> feeling you have to prefix this sort of thing with
>>
>> set the useUnicode to true
>>> put fld "t1" & cr & fld "t2" & cr & fld "t3"
>>>
>>> The result is a bunch of numbers, symbols and squares. You can
>>> clearly spot the matches.
>>>
>>> Next in the message box: --char 1 to 4 -- double byte chars--
>>>
>>> put char 1 to 4 in fld "t1" into aText
>>> put char 1 to 4 in word 2 in fld "t2" into bText
>>> put char 1 to 4 in word 1 in fld "t3" into cText
>>> put aText & cr & bText & cr & cText
>>>
>>> This should be 3 identical lines, right? But no. Line 2 is missing
>>> the final char.
>>>
>>> 7(square)>(square)
>>> 7(square)>
>>> 7(square)>(square)
>>>
>>> Next: comparing the strings
>>>
>>> if cText = aText then beep - it beeps
>>> if cText is in aText then beep - it beeps
>>> if bText = aText then beep - no beep, obviously
>>>
>>> BUT
>>>
>>> if bText is in aText then beep - also no beep!
>>>
>>> And then
>>>
>>> put char 1 to 5 in word 2 in field "t2", it returns the same as the
>>> other two:
>>>
>>> 7(square)>(square)
>>>
>>> so then
>>>
>>> put char 1 to 5 in word 2 into bText
>>>
>>> but
>>>
>>> if bText = (or is in) aText still returns nothing
>>>
>>> Why is that last double byte char always missing when the word is not
>>> word 1 in its field? If I do char 1 to 3 I get this (again!)
>>>
>>> 7(square)>
>>> 7(square) --last char missing!
>>> 7(square)>
>>>
>>> Using itemDEL = space and char 1 to x in item z behaves the same.
>>>
>>> Anyone know the answer?
>>>
>>> Cheers,
>>>
>>> Lars
>>> _______________________________________________
>>> use-livecode mailing list
>>> use-livecode@lists.runrev.com
>>> Please visit this url to subscribe, unsubscribe and manage your
>>> subscription preferences:
>>> http://lists.runrev.com/mailman/listinfo/use-livecode

--
Pierre Sahores
mobile : (33) 6 03 95 77 70
www.woooooooords.com
www.sahores-conseil.com