On Nov 10, 2012, at 7:29 AM, Robby Findler wrote:

> If you're calling from Racket to TR then you have the contract
> checking and probably the floats flowing thru there need boxing.

If I understand you correctly, the contract check would just be "is this a 
float?", which I imagine wouldn't require additional memory. The return value 
would certainly have to be boxed, though, and I'm guessing that would be eight 
bytes.
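
One way to check that guess rather than argue about it is to measure allocation across the boundary directly. Here is a minimal sketch, assuming a Racket recent enough that `current-memory-use` accepts the `'cumulative` mode; the `bytes-per-call` helper and the `typed` submodule name are made up for illustration, and the measuring loop may itself allocate a little:

```racket
#lang racket/base

;; Typed submodule: exporting dummy2 to untyped code wraps it in a
;; contract, which is the boundary under discussion.
(module typed typed/racket/base
  (provide dummy2)
  (: dummy2 (Float -> Float))
  (define (dummy2 in)
    (* 0.1 in)))

(require 'typed)

;; Untyped baseline for comparison (hypothetical name).
(define (dummy2-untyped in)
  (* 0.1 in))

;; bytes-per-call: average number of bytes allocated per call of f,
;; taken from the cumulative allocation counter before and after n calls.
(define (bytes-per-call f n)
  (define before (current-memory-use 'cumulative))
  (for ([i (in-range n)])
    (f 1.0))
  (define after (current-memory-use 'cumulative))
  (/ (- after before) (exact->inexact n)))

(printf "typed:   ~a bytes/call\n" (bytes-per-call dummy2 1000000))
(printf "untyped: ~a bytes/call\n" (bytes-per-call dummy2-untyped 1000000))
```

The absolute numbers will depend on the Racket version and VM, so only the difference between the two lines is meaningful here.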
> 
> Can you put the loop itself into TR?

Sadly, no; the loops themselves are written by students. Or, more specifically, 
written by students as a "network" form that expands into a function that's 
called in a loop. I think that expanding into TR would be incredibly hard to 
get right.

John

> 
> Robby
> 
> On Sat, Nov 10, 2012 at 9:22 AM, John Clements
> <cleme...@brinckerhoff.org> wrote:
>> I'm trying to implement some simple comb filters for a reverb, using racket 
>> and/or typed racket.  I have six of these running in parallel; each one has 
>> a vector, and each time a sample arrives, each comb needs to perform two 
>> floating-point multiplies and two floating-point adds, increment a counter 
>> with possible reset, and store/mutate two locations in memory to prepare for 
>> next time.
>> 
>> The problem for code like this isn't runtime, directly; it's all the GC. 
>> Adding this filter to a simple playback was observed to generate an 
>> additional 1.6 GB of garbage for a 60-second session[*], which sounds like a 
>> lot until you divide by 60 seconds and the 44.1K sample rate, to get 606 
>> bytes/sample frame. Regardless, you could definitely do it with zero garbage 
>> in C, so I set out to try to reduce this.
>> 
>> I guessed that most of the garbage in this case was related to boxing of 
>> floats, so I decided to use TR to try to eliminate this. I hauled my code 
>> over to TR, and it worked completely without modification, which was a joy. 
>> Also, the optimization coach tells me that everything is green, and staying 
>> in the Float realm. Unfortunately, it didn't improve the memory use much, 
>> and after some experiments, it looks like it reduces the memory overhead per 
>> comb filter by roughly half, to 278 bytes/sample frame, *but* imposes its 
>> own fixed overhead of 240 bytes/sample frame, which pretty much negates the 
>> benefit of the reduction.
>> 
>> So, my question is this: should making a call from racket to this TR code
>> 
>> (: dummy2 (Float -> Float))
>> (define (dummy2 in)
>>  (* 0.1 in))
>> 
>> ... generate about 240 bytes in garbage?
>> 
>> 
>> 
>> 
>> FWIW, here's what a comb filter function looks like:
>> 
>> (: comb1 (Float -> Float))
>> (define (comb1 in)
>>  (define delayed1 (flvector-ref v1 c1))
>>  (define midnode1 (fl+ delayed1 (fl* g11 m1)))
>>  (define out1 (fl+ (fl* g21 midnode1) in))
>>  (flvector-set! v1 c1 out1)
>>  (define next-c1 (add1 c1))
>>  (set! c1 (cond [(<= d1 next-c1) 0]
>>                 [else next-c1]))
>>  (set! m1 midnode1)
>>  out1)
>> 
>> I can't see anything in this that would cause allocation.
>> 
>> 
>> Maybe the next step is to take a look at the compiled bytecode....
>> 
>> John
>> 
>> 
>> 
>> 
>> 
>> [*] FWIW, I'm observing this by running at the command line with -W debug 
>> and then parsing the GC output that appears on the console.
>> 
>> 
>> ____________________
>>  Racket Users list:
>>  http://lists.racket-lang.org/users
>> 
