Re: lines of UTF16 text are broken?

Kee Nethery Thu, 31 Mar 2011 09:00:30 -0700

The solution to having line breaks work when using UTF16 text is to not do 
that. It doesn't work.

I tried several characters that contain a byte that could look like a return, 
and when the text is UTF16, that byte of a perfectly legitimate Unicode 
character is treated as a return. My favorite test character was 0A0A (both of 
its bytes are 10, the linefeed), which looks like a guy with a baseball cap 
sticking his tongue out, with two wiggle lines underneath to indicate motion.
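
To make the failure concrete, here is a minimal sketch in Python (not 
LiveCode; it only stands in for any line splitter that works on single bytes 
rather than 16-bit code units). The names encoded, pieces and char are mine, 
and little-endian UTF16 is assumed, matching the byte order in the original 
message below.

```python
# Assumption: a byte-oriented splitter, standing in for how "line" chunks
# appear to behave on UTF16 data. This is Python, not LiveCode.

# "123" & return & "456" as little-endian UTF16.
encoded = "123\n456".encode("utf-16-le")
print(list(encoded))  # [49, 0, 50, 0, 51, 0, 10, 0, 52, 0, 53, 0, 54, 0]

# Splitting on the single byte 10 strands the linefeed's high byte (0)
# at the start of the second piece.
pieces = encoded.split(b"\n")
print([list(p) for p in pieces])
# [[49, 0, 50, 0, 51, 0], [0, 52, 0, 53, 0, 54, 0]]

# U+0A0A is one character, but both of its UTF16 bytes are 10.
char = "\u0a0a".encode("utf-16-le")
print(list(char))               # [10, 10]
print(len(char.split(b"\n")))   # 3 (one character torn into three empty pieces)
```

The second piece starts with a stray 0, which is exactly the misaligned 
"line 2" described in the original message.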

Anyway, the solution that seems to work is to keep the data in UTF8 and then 
split it into an array where each line goes into a separate array element.

Then instead of line 5 I look at lineArray[5]
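
As a rough illustration of why the UTF8 route is safe, here is a small Python 
sketch (again not LiveCode; the sample text and the name line_array are made 
up for the example). In UTF8, every byte of a multi-byte character is 128 or 
higher, so a byte with value 10 can only ever be a real linefeed.

```python
# Assumption: sample text invented for illustration; Python stands in
# for LiveCode here.
text = "abc\n\u0a0a\u0a0a\ndef"   # middle line holds two U+0A0A characters
utf8 = text.encode("utf-8")

# No byte of a multi-byte UTF8 sequence is below 128, so the value 10
# never appears inside a character: only the two real linefeeds match.
assert utf8.count(b"\n") == 2

# Split once into a list, then index it instead of asking for "line N".
line_array = [piece.decode("utf-8") for piece in utf8.split(b"\n")]
print(len(line_array))                   # 3
print(line_array[1] == "\u0a0a\u0a0a")   # True
```

Python lists count from 0, so line_array[1] here is the second line.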

Sure wish LiveCode dealt with Unicode as a first-class citizen.

Kee

On Mar 29, 2011, at 6:04 PM, Kee Nethery wrote:

> I convert UTF8 text into UTF16 and then work through each line of text.
> 
> The problem I run into is that on my Intel Mac (little-endian), a return is 
> encoded as "10 0", so 
> uniencode("123" & return & "456") 
> encodes into UTF16 as the bytes:
> 49 0 50 0 51 0_10_0_52 0 53 0 54 0
> 
> When I look at line 1 I get:
> 49 0 50 0 51 0_
> 
> When I look at line 2 I get:
> _0_52 0 53 0 54 0
> 
> The 0 from the return (actually a linefeed) is being interpreted as part of 
> the next line. "10 0" is not treated as the line break; "10" alone is.
> 
> How do I get it to break at "10 0" instead of at "10"? My fear is that I'm 
> going to come across a unicode character that includes "10" in the right 
> location, kind of like "32 10" (no clue what that is) and the system is going 
> to see the "10" and deal with it as the divider between two lines.
> 
> How do people deal with this? Do I need to build a UTF16 version of all the 
> text parsing routines to safely get each line?
> 
> Kee Nethery
> 

