I'm very much hoping that Mark W might magically fix this in 9.6.5.
But in the meantime, FWIW: the place where this was really hurting was a
single sort command, on 70 MB of data in approx 223,000 lines. (The script
took 8 minutes under LC6 and was taking 8 hours under LC9; I've gradually
tamed it down to under an hour by buffering the large accumulations.)
I've replaced this line:
    sort lines of tNewTable by item iSortCol of each
which took 1 second on Mac but 2063 seconds (i.e. 34 minutes) on Windows, with
a call to this command:
command sortLinesByTabbedColumn @tTable, iSortCol
   local aTable, tSortTable, iARcounter, tARbuffer, tRow, k
   -- load table into an array for fast access by line number
   put tTable into aTable
   split aTable using return
   -- compile index of just the column to sort on, and line number
   set the itemDelimiter to tab
   repeat for each key k in aTable
      get (item iSortCol of aTable[k]) && k
      appendRow it, iARcounter, tARbuffer, tSortTable
   end repeat
   -- flush whatever is still in the buffer
   put tARbuffer after tSortTable
   -- sort it
   sort lines of tSortTable
   -- rebuild table out of array, in sorted order
   put empty into tARbuffer
   put empty into tTable
   repeat for each line tRow in tSortTable
      -- the line number is the last word of each index line
      put last word of tRow into k
      appendRow aTable[k], iARcounter, tARbuffer, tTable
   end repeat
   -- flush whatever is still in the buffer
   put tARbuffer after tTable
end sortLinesByTabbedColumn
which takes 25 seconds on Windows (to my surprise, most of that time was in
the final 'rebuild' loop).
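The appendRow helper called above isn't shown in this message. For anyone wanting to try the command, here is a minimal sketch of what a buffered append of that shape might look like. The parameter names and the 1000-row flush threshold are assumptions for illustration, not the original helper:

```livecode
-- Hypothetical sketch only: parameter names and the flush threshold
-- are assumptions, not the helper used in the script above.
command appendRow pRow, @xCounter, @xBuffer, @xTarget
   -- accumulate rows in a small buffer, which is cheap to append to
   put pRow & return after xBuffer
   add 1 to xCounter
   -- every 1000 rows, move the buffer onto the large variable in one step
   if xCounter >= 1000 then
      put xBuffer after xTarget
      put empty into xBuffer
      put 0 into xCounter
   end if
end appendRow
```

With this version the caller still has to flush the remainder after the loop (the "put tARbuffer after ..." lines above), and the finished table ends with a trailing return.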
On 02/09/2021 23:53, Bob Sneidar via use-livecode wrote:
I am going to say no, because you still have to traverse the file once to get
it into SQLite, then do the sort, then write out the file when done. I might be
mistaken; the subsequent SQL sort may make up for lost time. Using an in-memory
SQL database really shines when you need to make multiple passes at the data
using different queries. One pass may not impress you much.
For instance, I have a File Management module built into my application. A file
can belong to a customer, and also to a site, and also to a device. Like so:
custid  siteid  deviceid  filepath
123                       disk/folder/file1
456     098               disk/folder/file2
789     765     432       disk/folder/file3
Note all have a custid, some have a siteid as well, and some also have a
deviceid.
So rather than query MySQL for the files for each site or device as I select
them, upon selecting a customer I query MySQL for ALL the file records for
that customer (which of course include the file records for all of that
customer's sites and devices), then store the result in a memory database.
Then when a different site or device belonging to that customer is selected,
I query the memory database for the files belonging to that site or that
device, in those modules respectively.
The performance enhancement is significant.
Another way I apply this is to get the objects on a card, passing a list of
the properties I'm interested in, then store the data in a memory database. I
can then query for objects with certain properties without having to iterate
through all the objects on the card in a repeat loop. For instance, I can get
the farthest left, top, right and bottom object whose visible is true in 4
memory db queries, giving me the total rect of all the visible objects without
grouping/ungrouping and the hell that can ensue.
Bob S
On Sep 2, 2021, at 11:22 , Bernard Devlin via use-livecode
<use-livecode@lists.runrev.com> wrote:
Whilst waiting for a fix, would a temporary solution be to use sqlite to
create an in-memory database and let sqlite do the sorting for you?
Regards, Bernard.
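A rough sketch of what that suggestion might look like in script, for anyone who wants to time it. It is untested against the 70 MB data set; the function name, table name and column names are made up for illustration, and it assumes the SQLite driver accepts the special path ":memory:" for an in-memory database:

```livecode
-- Hypothetical sketch: sortViaSQLite, the table "t" and its column
-- names are illustrative assumptions, not code from this thread.
function sortViaSQLite pTable, pSortCol
   local tConn, tRow, tKey, tSorted
   -- assumes ":memory:" is accepted as the database path
   put revOpenDatabase("sqlite", ":memory:") into tConn
   if tConn is not an integer then return empty
   revExecuteSQL tConn, "CREATE TABLE t (sortkey TEXT, fullrow TEXT)"
   -- one transaction around all the inserts, rather than one per row
   revExecuteSQL tConn, "BEGIN TRANSACTION"
   set the itemDelimiter to tab
   repeat for each line tRow in pTable
      put item pSortCol of tRow into tKey
      -- :1 and :2 are bound to the named variables
      revExecuteSQL tConn, "INSERT INTO t (sortkey, fullrow) VALUES (:1, :2)", "tKey", "tRow"
   end repeat
   revExecuteSQL tConn, "COMMIT"
   -- rowid as a tie-breaker keeps equal keys in their original order
   put revDataFromQuery(tab, return, tConn, "SELECT fullrow FROM t ORDER BY sortkey, rowid") into tSorted
   revCloseDatabase tConn
   return tSorted
end sortViaSQLite
```

Note that SQLite compares TEXT case-sensitively by default, so the resulting order may differ from a plain LiveCode sort where keys differ only in case.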
On Mon, Aug 30, 2021 at 8:23 PM Ben Rubinstein via use-livecode <
use-livecode@lists.runrev.com> wrote:
Thanks to Mark Waddingham's advice about using a buffer var when accumulating
a large text variable in stages, I've now got a script that took 8 hours under
LC9 (and 8 minutes under LC6) down by stages to just under 1 hour under LC9.
However I have some remaining issues not amenable to this approach, of which
the most significant relates to the sort command.
In all cases it seems to take much longer under LC9 than it did under LC6,
although the factor is quite variable. The most dramatic is one instance, in
which this statement:
    sort lines of tNewTable by item iSortCol of each
takes 35 minutes to execute. `tNewTable` is a variable consisting of some
223,000 lines of text; approx 70MB. The exact same statement with the same
data on the same computer in LC6 takes just 1 second.
Has anyone else noticed something of this sort? As I said, the effect varies:
e.g. 54 seconds versus 1 second; 22 seconds versus 1 second. So it may not be
so noticeable in all cases.
TIA,
Ben
_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode