Now, you have me worried, Richard. Maybe I missed something in what the engine does with text files...
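The byte-level claim in the message quoted below (no CR, LF, tab, or comma hides inside a multi-byte UTF-8 sequence) can at least be checked outside the engine. Here is a minimal sketch in Python rather than LiveCode; the helper name delimiter_positions and the sample string are made up for illustration. It says nothing about what the engine itself does when reading through "file:".

```python
# Bytes that LiveCode-style line and item chunks would typically split on.
DELIMITERS = b"\r\n\t,"

def delimiter_positions(data: bytes) -> list[int]:
    """Return the index of every CR, LF, tab, or comma byte in data."""
    return [i for i, b in enumerate(data) if b in DELIMITERS]

# Sample text: "naive" with a diaeresis, a comma, three CJK characters,
# a tab, a Greek omega, CR LF, then "end". Escapes keep the source ASCII.
sample = "na\u00efve,\u65e5\u672c\u8a9e\t\u03a9\r\nend"
encoded = sample.encode("utf-8")

for ch in sample:
    seq = ch.encode("utf-8")
    if len(seq) > 1:
        # Lead byte looks like 11xxxxxx, continuation bytes like 10xxxxxx,
        # so every byte of a multi-byte character has the high bit set.
        assert seq[0] & 0xC0 == 0xC0
        assert all(b & 0xC0 == 0x80 for b in seq[1:])

# Splitting the raw bytes on LF and then decoding gives the same pieces
# as splitting the decoded text on LF.
assert [p.decode("utf-8") for p in encoded.split(b"\n")] == sample.split("\n")
```

Calling delimiter_positions(encoded) reports only the offsets of the four real delimiter characters in the sample; none of the bytes belonging to the non-ASCII characters show up.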

On Jun 7, 2013, at 8:11 AM, Dar Scott wrote:

> Yeah, there is no need to use binfile, but it is OK. You can process the
> line ends before or after converting to Unicode, if you do.
>
> Not too cautious for not knowing. It is a normal and right approach to be
> aware of potential problems and make code robust for those, but now you know.
>
> Assuming a valid UTF-8 file...
>
> Only the ASCII characters in UTF-8 have the high bit zero. They are
> represented as single bytes. (ASCII files are UTF-8 files.) All other
> characters are represented with multiple bytes that have the high bit set,
> not just the first but even the following. (The first byte in binary is
> 11xxxxxx and the continuing bytes are 10xxxxxx.)
>
> This means there are no CR, LF, tab, or comma hidden in the non-ASCII
> characters. ASCII never has the high bit set. You can use line and item
> chunks with UTF-8. You can use offset (with care) and replace.
>
> Now, here is where I'm ignorant. I am cautious, perhaps overly cautious. I
> don't use word or token with UTF-8. I can never remember how word works,
> much less token. Maybe the above is enough for somebody to comment.
>
> Dar
>
>
> On Jun 7, 2013, at 7:39 AM, Richard Gaskin wrote:
>
>> Dar Scott wrote:
>>
>>> You can use "file:" with UTF-8. No ghost ASCII CR or LF will show
>>> up in the representation of any characters other than CR and LF.
>>
>> Maybe I'm just superstitious, but I've always used "binfile" with Unicode
>> because I didn't expect the engine to understand the difference between
>> bytes used as line endings and those same bytes that may appear as part of a
>> character byte sequence.
>>
>> Have I been too cautious?
>>
>> --
>> Richard Gaskin
>> Fourth World
>> LiveCode training and consulting: http://www.fourthworld.com
>> Webzine for LiveCode developers: http://www.LiveCodeJournal.com
>> Follow me on Twitter: http://twitter.com/FourthWorldSys
>>
>> _______________________________________________
>> use-livecode mailing list
>> use-livecode@lists.runrev.com
>> Please visit this url to subscribe, unsubscribe and manage your subscription
>> preferences:
>> http://lists.runrev.com/mailman/listinfo/use-livecode