Crazy idea - totally untried .... (sorry, I don't have a Win machine)
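put 1 into tLineCount
repeat for each line tRow in tWorkTable
   -- store each row in an array element instead of appending to a string
   put tRow into tNewTable[tLineCount]
   add 1 to tLineCount
end repeat
-- join the elements back into return-delimited text in one step
combine tNewTable using CR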
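
To see whether this actually helps on Windows, the two approaches could be timed side by side. This is only a rough sketch, as untested as the idea itself: the handler, the generated test data and the variable names (tArray, tReport and so on) are made up for illustration, and it also checks that combine gives back the lines in the original order, which is worth confirming on your version rather than assuming.

```livecode
on mouseUp
   local tWorkTable, tNewTable, tArray, tLineCount, tStart, tReport, i

   -- hypothetical test data: 100,000 short tab-delimited rows
   repeat with i = 1 to 100000
      put "field1" & tab & "field2" & tab & i & return after tWorkTable
   end repeat

   -- approach 1: append each line to a growing string
   put the milliseconds into tStart
   repeat for each line tRow in tWorkTable
      put tRow & return after tNewTable
   end repeat
   put "append:" && (the milliseconds - tStart) && "ms" & return into tReport

   -- approach 2: collect lines in an array, then combine once
   put the milliseconds into tStart
   put 1 into tLineCount
   repeat for each line tRow in tWorkTable
      put tRow into tArray[tLineCount]
      add 1 to tLineCount
   end repeat
   combine tArray using return
   put "array + combine:" && (the milliseconds - tStart) && "ms" & return after tReport

   -- the combined text has no trailing return, so add one before comparing
   put "same result:" && (tNewTable is (tArray & return)) after tReport
   answer tReport
end mouseUp
```

If "same result" comes back false, the array version cannot be used as a drop-in replacement without sorting out the line order first.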
Alex.
On 25/08/2021 18:15, Ben Rubinstein via use-livecode wrote:
Some 20 months ago, I reported that I was in a situation where an app
written in 6.7 needed to be updated to access 64bit drivers, which
meant updating to 9.5 - which displayed a horrifying increase in
processing time.
In fact I was able to put off the evil day - but now it has returned,
and can be put off no longer. A process that normally takes 2 hours is
currently taking 9. The core processing stage has gone from around ten
minutes to over six hours.
After way too long, I've finally got down to at least one smoking gun;
which is as simple as can be.
Part of what took me so long is a confusion; in production the process
runs on Windows, but I develop on Mac. Although on Mac the overall
process does take about a third longer in LC9 than LC6, the simple
tests I've finally isolated actually run much _quicker_ in LC9 than
LC6. So switching between LC6 and LC9 on Mac as I tried to isolate the
issue was giving confusing signals. But unmistakeably it's *much*
slower on Windows.
A simple routine which loops over a load of tab and return formatted
data loaded from a TSV file, to truncate a particular field, had the
following results processing a 70MB file of approximately 257,000 rows:
    6.7.11   MacOS     9 seconds
    6.7.11   Win32    10 seconds
    9.6.3    MacOS     2 seconds
    9.6.3    Win32   498 seconds
I simplified it down to this (pointless) loop which just rebuilds a
table one line at a time:
    local tNewTable
    repeat for each line tRow in tWorkTable
       put tRow & return after tNewTable
    end repeat
with these results:
    6.7.11   MacOS     8 seconds
    6.7.11   Win32     7 seconds
    9.6.3    MacOS     0 seconds
    9.6.3    Win32   591 seconds
(there's obviously a lot of variability in these - both were running
in IDE, on a logged-in computer, so stuff was probably going on in the
background; but I know the overall effect is similar when built as
standalone and running by schedule on an unattended machine. But the
key thing is: for this task, LC9 is dramatically slower on Windows!)
Have others seen something like this?
When I posted about this before (thread: "OMG text processing
performance 6.7 - 9.5") Mark Waddingham suggested that it might be to
do with a hidden cost of binary<->text transforms. That makes some
sense; but given that the text already exists, I'm wondering whether
taking a line out of text would cause it to be transformed, only to be
transformed again when appending? And in particular, why this would
affect Windows only.
I have also added tests using "is strictly a binary string" in the
code above, and this was true for neither input 'tWorkTable', nor the
output 'tNewTable', nor any of the 257,000 extracted lines.
However it is definitely the accumulating of text that is the issue -
simply looping over the lines - even with testing each one to see if
it is "strictly a binary string" - is a second or less on Windows in LC9.
Has anyone had similar experiences? Suggestions for how this could be
avoided?
Many thanks in advance,
Ben
_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your
subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode