I have been trying a few variations of the code. It would be nice to have a test branch that uses only the data in the repository; I used some fake data instead.
For the tests, I used the function *get-mean-max-bounds*
https://github.com/alex-hhh/ActivityLog2/blob/master/rkt/data-frame/meanmax.rkt#L409
with this data:

    (define fake-data2
      (for/list ([_ (in-range 10000000)])
        (if (< (random) .01)
            (vector #f #f)
            (vector (- (random) .5) (- (random) .5)))))

and timed it with:

    (time (get-mean-max-bounds fake-data2))

***

The main time improvement came from changing

    (for ([b bavg] #:when (vector-ref b 1)) ...)

to

    (for ([b (in-list bavg)] #:when (vector-ref b 1)) ...)

(note the added in-list). This at least doubles the speed: in the microbenchmark, the new run time is 40%-50% of the original.

IIUC, in all functions you know the type of sequence of the arguments, so my advice is to add in-list or in-vector to each and every for in the whole file (or project). This is a good general recommendation. With in-list, in-vector or in-range, the generated code is very efficient. Without them, the code has to create a generic object to track the iteration, and it is much slower.

***

I tried eliminating the set! and using for/fold instead. The problem is that the code is slower :(. In general it's better to avoid mutable variables, but in this case removing them makes the program slower. We should take a look at the internal code of Racket and try to fix it, because in a perfect world the version without set! should be faster. Meanwhile, keep the current version...

***

I tried replacing the for and set! with an explicit loop. Something like

    (let loop ([bavg bavg] [min-x #f] [max-x #f] [min-y #f] [max-y #f]) ...)

With this change there is an additional 5% speed improvement, but readability suffers too much. So this is faster than the version with for and in-list, but I recommend keeping the readable version.

***

I tried replacing the initial value of min-x and friends with +inf.0, and removing the if in the updates. I'm convinced this is a good idea, but the change in speed is negligible.
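To make the in-list comparison easy to try without the rest of the project, here is a minimal sketch. The functions y-bounds/generic and y-bounds/in-list are hypothetical stand-ins that I made up for illustration; they are not the real get-mean-max-bounds, they only track the y bounds, and the data set is smaller than the one I used above. Timings will vary with the machine and the Racket version.

```racket
#lang racket/base

;; Fake data in the same shape as fake-data2, but with fewer rows so the
;; example runs quickly. About 1% of the rows are (vector #f #f).
(define fake-data
  (for/list ([_ (in-range 1000000)])
    (if (< (random) .01)
        (vector #f #f)
        (vector (- (random) .5) (- (random) .5)))))

;; Hypothetical stand-in for the original loop: no in-list, so `for` has to
;; treat bavg as a generic sequence at run time.
(define (y-bounds/generic bavg)
  (define min-y +inf.0)
  (define max-y -inf.0)
  (for ([b bavg] #:when (vector-ref b 1))
    (define y (vector-ref b 1))
    (set! min-y (min min-y y))
    (set! max-y (max max-y y)))
  (list min-y max-y))

;; Same loop with in-list, so `for` can expand into a direct list traversal.
(define (y-bounds/in-list bavg)
  (define min-y +inf.0)
  (define max-y -inf.0)
  (for ([b (in-list bavg)] #:when (vector-ref b 1))
    (define y (vector-ref b 1))
    (set! min-y (min min-y y))
    (set! max-y (max max-y y)))
  (list min-y max-y))

;; `time` prints cpu, real and gc times for each call.
(define generic-result (time (y-bounds/generic fake-data)))
(define in-list-result (time (y-bounds/in-list fake-data)))

;; Both versions should compute the same bounds.
(equal? generic-result in-list-result)
```

Both functions start from +inf.0 and -inf.0, so the updates need no if; that choice is only there to keep the sketch short.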
In conclusion, try adding as many in-list, in-vector and in-range as you can.

Gustavo

On Thu, Jan 31, 2019 at 9:58 AM Alex Harsanyi <alexharsa...@gmail.com> wrote:
>
> On Thursday, January 31, 2019 at 9:23:39 AM UTC+8, Matthew Flatt wrote:
>>
>> > I would be happy to help you identify where the performance degradation
>> > between Racket 7.1 and CS is when running these tests.
>>
>> Small examples that illustrate slowness in a specific subsystem are
>> always helpful. I can't always make the subsystem go faster right away,
>> but sometimes.
>>
>>
> I timed some key functions in my application to understand which parts of
> Racket CS are slow. I did a write-up in the Gist listed below, but the
> result seems to be that even functions that run Racket only code with no IO
> or calls into C libraries run slower in Racket CS. Code that calls into
> the database library to run SQL insert queries runs significantly slower.
> The only things which were faster in Racket CS were one "Racket only"
> function, `df-histogram` and a function which retrieved data from an SQL
> query, `df-read/sql`
>
> https://gist.github.com/alex-hhh/1ebc1c83b68ee4620a70fc30d2caa6a3
>
> Alex.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Racket Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to racket-users+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>