I think that as the code changes since v7 also included some substantial optimisations, I'm no longer certain that there is *in general* a performance hit from v7 onwards... except on Windows, where Mark W has hinted he may soon fix this.

But I'm not absolutely sure. Because the only place I really do massive massive amounts of text processing is on an unattended Windows machine, that's where I see the difference, which I previously attributed to the support for Unicode; on Mac, in general I (a) continue to find the text processing so blindingly fast that it doesn't bother me and (b) don't have a good framework for comparison because this is basically on the machine where I live and work, so there's always a substantial but variable amount of other activity going on.

Ben

On 08/09/2021 16:39, Bob Sneidar via use-livecode wrote:
It sure helped me to understand it! Thanks. As I understand the performance 
issue, though, between 6.7 and later versions of LC, it revolves around having 
to process all the Unicode strings that are native now. Or so the discussion 
has gone in the past. If not, then the performance hit since v7 has yet to be 
explained sufficiently.

Bob S


On Sep 8, 2021, at 02:42 , Ben Rubinstein via use-livecode 
<use-livecode@lists.runrev.com> wrote:


On 07/09/2021 17:22, Bob Sneidar via use-livecode wrote:
This makes sense to me (I think) because if I am not mistaken, UTF16 is 
Unicode, and UTF8 is simple ASCII. The slowdown from 6.7 to 7.0 was precisely 
the support for Unicode text. Someone will correct me if I am wrong about this. 
As a hobbyist, I try and stay away from localization issues. But I am 
interested in the idea that all text incoming should be text decoded and 
outgoing the inverse. (Did I get that right??)

Cue scenes of strong men reeling back in horror, ladies fainting, etc (Bateman 
cartoons, for those of a British persuasion).

UTF16 is not Unicode, UTF8 is not simple ASCII, and I'm not even sure that the 
slowdown from 6.7 to 7.0 was precisely down to the support for Unicode text.

Unicode and ASCII are both conventions that assign character interpretations to 
numbers. ASCII assigned 95 character interpretations (space plus 94 printable 
symbols) to the numbers 32-126 (plus control interpretations to some other 
numbers). 
WindowsLatin1, MacRoman, ISO-8859-1 etc all did the same but to a wider range 
of numbers up to 255. Unicode does the same thing for a... much... larger 
number of characters and glyphs, and hence using a... much... larger range of 
numbers.
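
To make "characters are just numbers" concrete, here is a rough sketch in Python (not LiveCode, purely so it can be run anywhere). The helper name and the sample characters are made up for illustration; no bytes are involved at this stage.

```python
def code_points(text):
    """Return the Unicode number (code point) assigned to each character."""
    return [ord(ch) for ch in text]

# Both characters sit inside the ASCII range 32-126.
ascii_sample = code_points("A~")

# e-acute: outside ASCII, but within the 0-255 range used by ISO-8859-1.
latin_sample = code_points("\u00e9")

# Euro sign and an emoji: far beyond 255, so only Unicode has numbers for them.
wide_sample = code_points("\u20ac\U0001F600")

print(ascii_sample, latin_sample, wide_sample)
```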

Unicode specifies numbers, not bytes. UTF8 and UTF16 are two of several ways of 
representing Unicode strings in bytes. UTF8 is designed to do so in a way that 
makes ASCII text compatible with UTF8, i.e. a file of ASCII text is a valid 
UTF8 file; the reverse is not necessarily true.
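
A small sketch of that distinction, again in Python as a neutral illustration. The sample string and the `is_ascii` helper are assumptions of this example, not anything from LiveCode.

```python
text = "na\u00efve \u20ac"  # seven characters: "naive" with i-diaeresis, a space, a euro sign

# The same sequence of Unicode numbers, serialised two different ways.
utf8_bytes = text.encode("utf-8")       # 1 to 4 bytes per character
utf16_bytes = text.encode("utf-16-le")  # 2 or 4 bytes per character

print(len(text), len(utf8_bytes), len(utf16_bytes))

# Plain ASCII bytes are already valid UTF-8 and decode unchanged.
ascii_bytes = "plain ASCII".encode("ascii")
assert ascii_bytes.decode("utf-8") == "plain ASCII"

def is_ascii(data):
    """True if the byte string is also valid ASCII."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False

# The reverse does not hold: UTF-8 containing non-ASCII characters is not ASCII.
print(is_ascii(ascii_bytes), is_ascii(utf8_bytes))
```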

A long-running problem with Metacard, Revolution, and LC up to v6 was that they 
were surprisingly platform-centric about character sets. To this day, 
textEncode etc only support MacRoman on Mac, only support ISO-8859-1 on Linux, 
and so on; as if we were never on one platform needing to deal with character 
streams generated on another. See
https://quality.livecode.com/show_bug.cgi?id=12205
https://quality.livecode.com/show_bug.cgi?id=22391
https://quality.livecode.com/show_bug.cgi?id=21320
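
Why this matters, as a hedged sketch: the same byte means different characters under different legacy encodings, so a stream has to be decoded with the encoding it was written in, whatever platform you happen to be on. The example uses Python's built-in `mac_roman` and `latin-1` codecs only because they are available on every platform; it says nothing about how LiveCode implements this, and the byte values are picked for illustration.

```python
raw = b"\xe9"  # one byte, as it might arrive in a file from another machine

as_latin1 = raw.decode("latin-1")      # interpreted as ISO-8859-1
as_macroman = raw.decode("mac_roman")  # interpreted as MacRoman

# Same byte, two different characters.
print(repr(as_latin1), repr(as_macroman))

# Going the other way, the same character becomes a different byte per encoding.
e_acute = "\u00e9"
print(e_acute.encode("latin-1"), e_acute.encode("mac_roman"))

# Decoding with the original encoding and re-encoding to UTF-8 gives a
# platform-neutral result.
portable = raw.decode("mac_roman").encode("utf-8")
```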

LC7 brought LiveCode into the later part of the 20th century by properly 
supporting Unicode, and by breaking the assumed link between bytes and 
characters. However if I understand correctly, the internal format of strings 
does not, or at least not necessarily, correspond to any externally defined 
standard, but can take various forms in order to maximise efficiencies of 
processing and storage.

Not sure if this helps, but it helped me to write it!

Ben

_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode


