On 2024-06-29 08:53, Neville Smythe via use-livecode wrote:
> Is it not the case that the processing time for looping over the number
> of lines and getting the k-th line in each iteration, for some
> arbitrary k, is going to be
>
>     O(N^2) * C * T
>
> where
>     N = the number of lines in the text
>     C = the number of codepoints in each line
>     T = the (average) time for processing each codepoint to check for a
>         return character
>
> Now N and C are the same whether the text is ascii or unicode.
Largely, yes - although for stuff like this you need to think in terms
of bytes, not codepoints (memory throughput becomes 'a thing' when the
strings are anything longer than a few characters) - so unicode is
2x ascii in this regard.
[ It's actually more than 2x for longer strings, but how much more depends
on the CPU/memory architecture - CPUs can only read from their level 1
cache, there's a cost to a cache miss, and you get twice as many cache
misses with unicode data as with native data, assuming the data is larger
than a single level 1 cache line. ]
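To make the bytes-vs-codepoints point concrete, here is a quick sketch in Python (rather than LiveCode), assuming native text is stored at one byte per character and unicode at two bytes per codeunit (UTF-16):

```python
# The same text occupies twice as many bytes in a 2-byte-per-codeunit
# encoding such as UTF-16, so a delimiter scan touches twice as much
# memory - and suffers roughly twice as many cache misses.
text = "line one\nline two\nline three\n" * 1000

native_bytes = text.encode("latin-1")    # 1 byte per character (native text)
utf16_bytes = text.encode("utf-16-le")   # 2 bytes per codeunit (unicode)

assert len(utf16_bytes) == 2 * len(native_bytes)
```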
> Test 1
>
> If I get rid of the red herring by replacing the matchChunk call with a
> simple
> ...
> Which would appear to mean that processing unicode codepoints for the
> apparently simple task of checking for return takes 100 times as much
> time as processing ascii. That seems excessive, to put it mildly!
It's a lot slower, certainly - but then searching unicode text for a
string is (in the general case) a lot more complex than searching
native/ascii text for one.
> Test 2
>
> With the original code using matchChunk - which of course is going to
> have its own internal loop over codepoints, so multiply by another 8 (it
> only searches the first few characters), and it will not always return
> true, so more lines must be processed - but again these are the same
> multipliers whether ascii or unicode.
> ...
> Plain ascii takes 0.07 seconds.
> Unicode takes 19.9 seconds, a multiplier of nearly 300. I can easily
> believe matchChunk takes 3 times as long to process unicode as ascii;
> this is the sort of factor I would have expected in Test 1.
So 'Test 2' is slightly misleading, as it still suggests matchChunk is
causing a slowdown - which it isn't.
The difference is that Test 2 does more work because it isn't always
exiting early. If you test:
    get line k of fff
    put true into tFound
I suspect you'll find the time to process your data, if it contains
unicode, is pretty similar to the time when matchChunk is also called.
In my quick test (32 index lines, 200 lines in fff) I get about
10ms (no unicode) vs 1400ms (unicode).
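The early-exit effect can be sketched in Python (rather than LiveCode) with a hypothetical line search; the names and sizes here are illustrative, not from the tests above. The cost of a line-by-line search is dominated by how many lines are scanned before the loop exits, not by the per-line test itself:

```python
# Build a list of lines; searching stops at the first match.
lines = [f"item {i}" for i in range(10_000)]

def lines_scanned(lines, needle):
    """Scan until the first match, returning how many lines were touched."""
    scanned = 0
    for line in lines:
        scanned += 1
        if needle in line:
            break
    return scanned

# A hit near the front touches few lines; a miss touches every line -
# the same per-line test, wildly different total work.
assert lines_scanned(lines, "item 3") == 4          # matches the 4th line
assert lines_scanned(lines, "no such line") == 10_000
```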
> OK Mark, hit me again, I am a glutton for punishment - what is wrong
> with this analysis?
Nothing in particular - apart from thinking that matchChunk is actually
a relevant factor here ;)
This delimiter search operation is so much slower on unicode strings
than on native ones for two reasons:
1) We (well, I) heavily optimized the core native/ascii string
operations in 2015 to make sure they were as fast as possible
2) Searching a unicode string for another string (which is what is
going on here) is much more complex than doing the same for native/ascii
Native/ascii strings have some very pleasant properties:
- one byte (codeunit) is one character - always.
- each character has only one representation - its byte value
- casing is a simple mapping between lower and upper case characters -
and only about 25% of characters are affected
Unicode has none of these properties
- a unicode character (grapheme) can be arbitrarily many codeunits (2
byte quantities) long
- characters can have multiple representations - e.g. e-acute vs
e,combining-acute
- casing is not (in general) a simple mapping of one codeunit to
another
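Both lists can be made concrete with a small sketch in Python (rather than LiveCode) for illustration:

```python
import unicodedata

# Native/ascii casing really is a trivial mapping: upper and lower case
# ascii letters differ by a single bit (0x20).
assert ord("a") ^ 0x20 == ord("A")

# In unicode, the same grapheme can have multiple codeunit sequences:
composed = "\u00e9"       # e-acute as one precomposed codepoint
decomposed = "e\u0301"    # 'e' followed by COMBINING ACUTE ACCENT

assert composed != decomposed                        # naive comparison fails
assert unicodedata.normalize("NFC", decomposed) == composed

# And unicode casing is not a 1:1 codeunit mapping: German sharp s
# uppercases to two characters.
assert "stra\u00dfe".upper() == "STRASSE"
```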
Currently the unicode operations in the engine are largely unoptimized -
they assume the general case in all things, so even searching a string
for LF (which is all that is happening here) is done under the assumption
that it might need that (very hefty) extra processing.
Of course it would be nice to have highly optimized low-level unicode
string operations, but you have to pick your battles (particularly when
the battles are incredibly technical ones!). The reality is that this
(admittedly large!) speed difference is only really noticeable 'at
scale' - and when scale is involved, there's pretty much always an
algorithmic change which can make those low-level performance
differences irrelevant.
The case here is a really good example.
The line X based code gives (no matchChunk / with matchChunk):

    ASCII 300 lines     -    13ms /    22ms
    ASCII 3000 lines    -   986ms /  1104ms
    ASCII 10000 lines   - 10804ms / 11213ms

The array based code gives (no matchChunk / with matchChunk):

    ASCII 300 lines     -   2ms /  11ms
    ASCII 3000 lines    -  19ms / 101ms
    ASCII 10000 lines   -  69ms / 336ms
    UNICODE 300 lines   -   7ms /  12ms
    UNICODE 3000 lines  -  52ms / 108ms
    UNICODE 10000 lines - 170ms / 359ms
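The algorithmic change behind the array based numbers can be sketched in Python (rather than LiveCode); the helper below is illustrative, mimicking how fetching "line k" by value must re-scan the text for delimiters on every call:

```python
# Quadratic vs linear line access: "line k of text" re-scans the text from
# the start each time, while splitting once into a list makes each
# subsequent lookup O(1).
text = "\n".join(f"line {i}" for i in range(1000))

def line_k_by_scan(text, k):
    """Mimics 'line k of text': scans for delimiters on every call."""
    return text.split("\n")[k - 1]   # each call re-splits the whole text

# Quadratic overall: one full scan per lookup.
slow = [line_k_by_scan(text, k) for k in range(1, 1001)]

# Linear setup plus O(1) lookups: split once, then index.
lines = text.split("\n")
fast = [lines[k - 1] for k in range(1, 1001)]

assert slow == fast == [f"line {i}" for i in range(1000)]
```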
Warmest Regards,
Mark.
--
Mark Waddingham ~ m...@livecode.com ~ http://www.livecode.com/
LiveCode: Build Amazing Things