If you're calling from Racket into TR, then you're paying for contract checking at the boundary, and the floats flowing through there probably need boxing.
Can you put the loop itself into TR?

Robby

On Sat, Nov 10, 2012 at 9:22 AM, John Clements <cleme...@brinckerhoff.org> wrote:
> I'm trying to implement some simple comb filters for a reverb, using racket
> and/or typed racket. I have six of these running in parallel; each one has a
> vector, and each time a sample arrives, each comb needs to perform two
> floating-point multiplies and two floating point adds, increment a counter
> with possible reset, and store/mutate two locations in memory to prepare for
> next time.
>
> The problem for code like this isn't runtime, directly; it's all the GC.
> Adding this filter to a simple playback was observed to generate an
> additional 1.6 GB of garbage for a 60-second session[*], which sounds like a
> lot until you divide by 60 seconds and the 44.1K sample rate, to get 606
> bytes/sample frame. Regardless, you could definitely do it with zero garbage
> in C, so I set out to try to reduce this.
>
> I guessed that most of the garbage in this case was related to boxing of
> floats, so I decided to use TR to try to eliminate this. I hauled my code
> over to TR, and it worked completely without modification, which was a joy.
> Also, the optimization coach tells me that everything is green, and staying
> in the Float realm. Unfortunately, it didn't improve the memory use much, and
> after some experiments, it looks like it reduces the memory overhead per comb
> filter by roughly half, to 278 bytes/sample frame, *but* imposes its own
> fixed overhead of 240 bytes/sample frame, which pretty much negates the
> benefit of the reduction.
>
> So, my question is this: should making a call from racket to this TR code
>
> (: dummy2 (Float -> Float))
> (define (dummy2 in)
>   (* 0.1 in))
>
> ... generate about 240 bytes in garbage?
>
> FWIW, here's what a comb filter function looks like:
>
> (: comb1 (Float -> Float))
> (define (comb1 in)
>   (define delayed1 (flvector-ref v1 c1))
>   (define midnode1 (fl+ delayed1 (fl* g11 m1)))
>   (define out1 (fl+ (fl* g21 midnode1) in))
>   (flvector-set! v1 c1 out1)
>   (define next-c1 (add1 c1))
>   (set! c1 (cond [(<= d1 next-c1) 0]
>                  [else next-c1]))
>   (set! m1 midnode1)
>   out1)
>
> I can't see anything in this that would cause allocation.
>
> Maybe the next step is to take a look at the compiled bytecode....
>
> John
>
> [*] FWIW, I'm observing this by running at the command line with -W debug and
> then parsing the GC output that appears on the console.
>
> ____________________
> Racket Users list:
> http://lists.racket-lang.org/users
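
A minimal sketch of what "putting the loop itself into TR" might look like, assuming the untyped caller can hand over a whole buffer of samples at once instead of one float per call. The name process-block! and the FlVector buffer are hypothetical (they do not appear in the thread), and the 0.1 gain only stands in for the comb filter body, as in dummy2:

```racket
#lang typed/racket
(require racket/flonum)
(provide process-block!)

;; Hypothetical helper, not from the original post: process a whole
;; buffer of samples in place, so the per-sample work stays in typed code.
(: process-block! (FlVector -> Void))
(define (process-block! buf)
  (for ([i (in-range (flvector-length buf))])
    ;; Stand-in for the real filter: scale each sample by 0.1.
    (flvector-set! buf i (fl* 0.1 (flvector-ref buf i)))))
```

With this shape the untyped side crosses the typed boundary once per buffer rather than once per sample, so any per-call contract and boxing cost should be spread over the whole block. Whether that actually removes the 240 bytes/sample frame would still need to be measured.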