Briefly, here is the problem:

   put the selectedText of field "BilingualText" of this card into tCurrSelection
   -- the field may hold, for example, the two words "Боб Bob", which is the
   -- string assigned to the unicodeText property of the field
   set the unicodeText of field "YourSelection" of this card to tCurrSelection

or, alternatively:

   set the unicodeText of field "YourSelection" of this card to uniEncode(tCurrSelection, "UTF8")

Neither alternative works for bilingual text. One version works only for English, the other works only for Russian. In each case, the other language is unreadable in field "YourSelection".
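One idea I have not yet tried is to read the unicodeText of the selected chunk directly, instead of going through the selectedText, in case that hands back the raw UTF-16 for both languages. This is only a rough, untested sketch; it assumes that the unicodeText of a chunk of a field can be read this way, and that the chunk expression returned by the selectedChunk lines up with the field's Unicode characters:

   put the selectedChunk of field "BilingualText" of this card into tChunk
   if tChunk is not empty then
      -- tChunk is something like "char 5 to 10 of field 3"; evaluate
      -- "the unicodeText of char 5 to 10 of field 3" in the current context
      -- (untested assumption: unicodeText can be read from a chunk)
      put value("the unicodeText of" && tChunk) into tCurrSelection
      set the unicodeText of field "YourSelection" of this card to tCurrSelection
   end if

I don't know whether that is the right direction, though, so here is the full picture.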
In detail: I'm working with bilingual texts (Russian and English) in various LiveCode contexts. The Russian portion is in Unicode. I have found things that simply cannot be used with Unicode, and also things that do work or can be adapted to work. But now I'm stuck on the selectedText and selectedChunk properties of a text entry field. Here's the script:

   put the selectedText of field "BilingualText" of this card into tCurrSelection
   -- the field may hold, for example, "Боб Bob", the value assigned to
   -- the unicodeText property of the field
   set the unicodeText of field "YourSelection" of this card to tCurrSelection

The above snippet works fine when the selected text in field "BilingualText" is all Russian. When it is English, some Chinese characters are displayed in field "YourSelection". The reason, as far as I understand it, is that the Russian text is already uniEncoded when it is stored in the variable tCurrSelection, but the English text is not, so before it can be displayed in a Unicode field it has to be uniEncoded, like this:

   put the selectedText of field "BilingualText" of this card into tCurrSelection
   set the unicodeText of field "YourSelection" of this card to uniEncode(tCurrSelection, "UTF8")

Indeed, this second snippet works when the selected text is English, but it displays unreadable text when it is Russian (because, I think, the text is uniEncoded twice; I've learned to recognize degrees of "unreadability"). Using the selectedChunk property has the same problem.

When I try to examine the decimal code point of each character of the selection with charToNum() and determine whether it is Roman or not, I run into the same problem: if I know that the character I am testing is double-byte Cyrillic, I have to use charToNum(char N to N+1) for each character. But if I use that formula on an ANSI string, I get meaningless results (especially useless for a string with an odd number of characters, like "Bob", because char 3 to 4 of "Bob" returns empty).

I thought that Roman letters would be stored in the selectedText as a null byte followed by the ANSI code, but apparently that is not the case: "Bob" is stored as three bytes, even when it is part of the selectedText of a Unicode field, whereas a Russian three-letter word is stored in the selectedText as six bytes. If this sounds wrong, then maybe I am wrong. I'd like to know.

I also tried examining the individual bytes in the string with byteToNum(), but that doesn't help either, because, for example, decimal 66 can be an ANSI character or it can be the first byte of a double-byte Cyrillic letter. (I do know about the requirement to set the useUnicode to true for charToNum() to work on Unicode text.)

Finally, I tried examining a uniDecoded version of the selectedText, and also got nowhere.

Is there a solution to this conundrum? Am I missing something obvious?

Thanks!

Slava
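P.S. For reference, this is the kind of per-character inspection I mean. It is only a sketch, and it assumes every character in the selection is double-byte UTF-16, which is exactly the assumption that breaks down when Roman text is mixed in:

   set the useUnicode to true
   put the selectedText of field "BilingualText" of this card into tSel
   put 1 into i
   repeat while i <= the number of chars in tSel
      -- with useUnicode set, charToNum of a two-char chunk returns
      -- the UTF-16 code point of that pair
      put charToNum(char i to i + 1 of tSel) & return after msg
      add 2 to i
   end repeat

For an all-Russian selection this should print code points in the Cyrillic range (for example, 1041 for Б), but for a Roman string like "Bob" the pairing is meaningless.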