please do goeff.... this subject is very interesting to me. i have some problems where i need to optimize in a similar way. which kind of repeat look have u found works fastest? have u tried ? repeat for each key this_key in array? is that slower?
i love saving milliseconds. :) makes a big diff at scale. On Mon, Jun 11, 2018 at 8:21 PM, Geoff Canyon via use-livecode < use-livecode@lists.runrev.com> wrote: > I have a routine that takes about a minute to run for the test case I > created, and who knows how long for the real use case. Given that I want to > run it several thousand times for actual work, I need it to run (much) > faster. > > Roughly, the routine gathers together a bunch of data from arrays, sorts > the data based on its relationship to other arrays, and then returns a > subset of the result. > > My first pass at the routine looked roughly like this: > > repeat for each line T in the keys of interestArray[uID] > repeat for each line S in storyArray[T] > if abs(item 2 of S - item 1 of interestArray[uID][T]) < 20 \ > and userSeenArray[uID][item 1 of S] < 4 > then put (101 + userSeenArray[uID][item 1 of S] * 30 + 5 * \ > abs(item 2 of S - item 1 of interestArray[uID][T]) - \ > item 2 of interestArray[uID][T]),T,S & cr after > candidateList > end repeat > end repeat > sort lines of candidateList numeric by random(item 1 of each) > > In simple terms: parse through the individual lines of all the entries that > possibly work, calculating a relevant value for each and appending that > value and the line to an interim list, which then gets sorted, randomly > favoring lower values. > > I assumed the problem was all the line-by-line parsing, and I thought of a > clever way to accomplish the same thing. That led to this: > > put storyArray into R > intersect R with interestArray[uID] > combine R using cr and comma > sort lines of R by (101 + userSeenArray[uID][item 2 of each] * 30 + 5 * > \ > abs(item 3 of each - item 1 of interestArray[uID][item 1 of each]) > \ > - item 2 of interestArray[uID][item 1 of each]) > > Much simpler, albeit that's a heck of a "sort by" -- more complex by far > than any I had previously created, and a testament to the power and > flexibility of "each". Alas, despite condensing my code and removing > parsing and loops, that version took ten seconds more than the previous > version, I'm guessing because the previous version has that "if" statement > that weeds out undesirable entries before the sort has to deal with them. > > (I'm writing this email as I parse through this figuring it out) > > So it turns out that the crazy use of "each" is only part of the problem -- > changing that line to: > > sort lines of R by random(10000) > > still takes over 20 seconds -- 3x as fast, but still far too slow. It turns > out that the source data numbers anywhere from 1,000 to 2,000 lines, so > sorting it in any way to randomly select the several lines I need is a > really bad choice. removing the sort step but keeping everything else cuts > the execution time down to under a second. > > Hmm. How to select several lines at weighted random from among a couple > thousand? I'll think on this and post a follow-up. > _______________________________________________ > use-livecode mailing list > use-livecode@lists.runrev.com > Please visit this url to subscribe, unsubscribe and manage your > subscription preferences: > http://lists.runrev.com/mailman/listinfo/use-livecode _______________________________________________ use-livecode mailing list use-livecode@lists.runrev.com Please visit this url to subscribe, unsubscribe and manage your subscription preferences: http://lists.runrev.com/mailman/listinfo/use-livecode