I may be wrong, but I thought the Mac’s ‘Plain Text’ just means it’s a ‘text/plain’ MIME type file, which could be encoded as ASCII, UTF-8, UTF-16 or UTF-32, rather than a ‘text/rtf’ rich text MIME type file with embedded markup for styling such as bold, italic, etc.
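If it helps to check which of those encodings a file actually uses, one option is to look at its first few bytes for a byte order mark (BOM). Below is a rough sketch in Python rather than LiveCode; the function name `sniff_bom` and the table name are just my own for illustration, and a file without a BOM simply comes back as unknown.

```python
import codecs

# BOM byte sequences paired with the encoding they indicate.
# UTF-32 entries come first because the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
BOM_TABLE = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if data has no BOM."""
    for bom, name in BOM_TABLE:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

To use it, read the first four bytes of the file in binary mode and pass them to `sniff_bom`. The absence of a BOM does not prove anything either way, since plain ASCII and most UTF-8 files have none.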
The ‘<U+FEFF>’ at the start of the document is the Byte Order Mark (BOM). Which encoding it indicates depends on the bytes it is stored as: FF FE would mean UTF-16 in ‘little-endian’ order, FE FF would mean UTF-16 ‘big-endian’, and EF BB BF would mean UTF-8. Since ‘more’ showed the rest of the line as readable text, I would guess UTF-8 with a BOM rather than UTF-16 - see https://en.wikipedia.org/wiki/Byte_order_mark

HTH

Best,
Keith

> On 2 Sep 2021, at 12:12, Alex Tweedly via use-livecode <use-livecode@lists.runrev.com> wrote:
>
> Sorry to drag us off the interesting topic of licensing :-) into some Livecode question.
>
> I know little or nothing about Unicode, text encodings, etc. - so my question is indeed naive.
>
> I have a text file (War & Peace from Project Gutenberg), about 3.4 MB. The Mac describes it simply as "Plain text".
>
> When I read that into a variable, and then do
>     replace tChar by SPACE in tWholeText
> it takes between 1000 and 4000 millisecs - versus the 8-10 msecs I had expected from other samples.
>
> If I put in
>     put textEncode(tWholeText, "UTF8") into tWholeText
> before the replace then it does indeed take 8-10 msecs.
>
> Q1. What (if anything) am I losing by doing that ?
>
> Q2. Is this the best alternative ?
>
> Additional info - I just discovered that according to 'more' command line, the file starts with :
>
> <U+FEFF>The Project ....
>
> if that is useful.
>
> Many thanks,
>
> Alex.
>
> _______________________________________________
> use-livecode mailing list
> use-livecode@lists.runrev.com
> Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
> http://lists.runrev.com/mailman/listinfo/use-livecode