Now, you have me worried, Richard.  Maybe I missed something in what the engine 
does with text files...


On Jun 7, 2013, at 8:11 AM, Dar Scott wrote:

> Yeah, there is no need to use binfile, but it is OK to do so.  If you do, you 
> can process the line ends before or after converting to Unicode.
> 
> You were not too cautious, given what you knew.  It is a normal and right 
> approach to be aware of potential problems and make code robust against 
> them, but now you know.
> 
> Assuming a valid UTF-8 file...
> 
> Only the ASCII characters in UTF-8 have the high bit zero.  They are 
> represented as single bytes.  (ASCII files are UTF-8 files.)  All other 
> characters are represented with multiple bytes that all have the high bit 
> set, not just the first byte but every following byte as well.  (The first 
> byte in binary is 11xxxxxx and the continuing bytes are 10xxxxxx.)
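
The bit patterns described above can be checked with a few lines of code. This is only a rough sketch in Python rather than LiveCode, and it assumes valid UTF-8 input; the function name classify_utf8_bytes is made up for illustration.

```python
def classify_utf8_bytes(data):
    """Label each byte of (assumed valid) UTF-8 data by its top bits."""
    labels = []
    for b in data:
        if b & 0x80 == 0:
            # 0xxxxxxx: high bit clear, a single-byte ASCII character
            labels.append("ascii")
        elif b & 0xC0 == 0x80:
            # 10xxxxxx: continuation byte of a multi-byte character
            labels.append("continuation")
        else:
            # 11xxxxxx: first byte of a multi-byte character
            labels.append("lead")
    return labels


# U+00E9 (e with acute accent) encodes as the two bytes C3 A9; LF is 0A.
sample = "\u00e9\n".encode("utf-8")
print(classify_utf8_bytes(sample))  # ['lead', 'continuation', 'ascii']
```

Only the final LF byte is labeled "ascii"; neither byte of the accented character can be mistaken for a line ending.
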
> 
> This means there are no CR, LF, tab, or comma hidden in the non-ASCII 
> characters.  ASCII never has the high bit set.  You can use line and item 
> chunks with UTF-8.  You can use offset (with care) and replace.
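
To illustrate why line and item chunks are safe, here is a small sketch, again in Python rather than LiveCode and again assuming valid UTF-8. It splits the raw bytes on LF and comma first and decodes afterwards, then compares that with decoding first and splitting afterwards. The helper name split_utf8_records and the sample data are invented for the example.

```python
def split_utf8_records(data):
    """Split raw UTF-8 bytes into lines and comma items, then decode each item."""
    rows = []
    for line in data.split(b"\n"):
        # Splitting on the ASCII comma byte cannot cut a multi-byte
        # character, because every byte of such a character is >= 0x80.
        rows.append([item.decode("utf-8") for item in line.split(b",")])
    return rows


text = "na\u00efve,\u65e5\u672c\u8a9e\ncaf\u00e9,Z\u00fcrich"
raw = text.encode("utf-8")

bytes_first = split_utf8_records(raw)
text_first = [line.split(",") for line in text.split("\n")]

# Both orders give the same rows and items.
assert bytes_first == text_first
```

If a delimiter byte could hide inside a non-ASCII character, the byte-level split would produce fragments that fail to decode or differ from the text-level split. With valid UTF-8 that cannot happen.
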
> 
> Now, here is where I'm ignorant.  I am cautious, perhaps overly cautious.  I 
> don't use word or token with UTF-8.  I can never remember how word works, 
> much less token.  Maybe the above is enough for somebody to comment.
> 
> Dar
> 
> 
> On Jun 7, 2013, at 7:39 AM, Richard Gaskin wrote:
> 
>> Dar Scott wrote:
>> 
>>> You can use "file:" with UTF-8.  No ghost ASCII CR or LF will show
>>> up in the representation of any characters other than CR and LF.
>> 
>> Maybe I'm just superstitious, but I've always used "binfile" with Unicode 
>> because I didn't expect the engine to understand the difference between 
>> bytes used as line endings and those same bytes that may appear as part of a 
>> character byte sequence.
>> 
>> Have I been too cautious?
>> 
>> --
>> Richard Gaskin
>> Fourth World
>> LiveCode training and consulting: http://www.fourthworld.com
>> Webzine for LiveCode developers: http://www.LiveCodeJournal.com
>> Follow me on Twitter:  http://twitter.com/FourthWorldSys
>> 
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode@lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription 
>> preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode
> 
> 


