> On May 31, 2017, at 6:14 PM, Jon Zeppieri <zeppi...@gmail.com> wrote:
> 
> On Wed, May 31, 2017 at 5:54 PM, Steve Byan's Lists
> <steve-l...@byan-roper.org> wrote:
>> Hi Matthias,
>> 
>> Thanks for taking a look.
>> 
>>> On May 31, 2017, at 4:13 PM, Matthias Felleisen <matth...@ccs.neu.edu> 
>>> wrote:
>>> 
>>> 
>>> Can you explain why you create a lazy stream instead of a plain list?
>> 
>> The current size of a short binary trace file is about 10 GB, and I want to 
>> scale to traces many hundreds of gigabytes in size. The expanded 
>> s-expression form is about 10 times larger, so keeping the whole list in 
>> memory could require up to many terabytes of memory.
>> 
>> Aside from just handling large traces, I also parallelize the problem by 
>> running analysis processes on different trace files concurrently. So the 
>> amount of memory required for the parallel computation would be about 32 
>> times the memory needed for a single trace analysis process.
>> 
>> So, I don't want to try to fit all the records in memory at once. I thought 
>> that the lazy stream would accomplish this --- am I wrong?
> 
> You're right, but using a lazy stream will still consume more memory
> than just calling `read` within the loop that actually processes the
> data. So, for example:
> 
> (define (map-trace stat%-set in)
>   (for/fold ([sexp-count 0])
>             ([trace-record (in-port read in)])
>     (+ sexp-count 1)))
> 
> (I didn't try this, but I think it's right.)
> 
> This way, you don't build up a list or a lazy stream; you just process
> each datum as it's read.


Yes, that’s what I would have proposed next. Just fuse the two functions and 
rely on port reading to do the ‘right thing’. 

I understand that this violates John Hughes's ‘modularity’ argument in support 
of lazy programming but, in return, you will need much less space and get a 
faster reader. 
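
A minimal sketch of the fused version, assuming each trace record is a single 
s-expression. The names `count-records` and `handle-record` are hypothetical 
stand-ins for your analysis code, and a string port stands in for a trace file:

```racket
#lang racket/base

;; Hypothetical helper: read one datum at a time from `in`, pass it to
;; `handle-record`, and keep only a running count. No list or stream of
;; records is ever built, so each record can be collected after its step.
(define (count-records in handle-record)
  (for/fold ([count 0])
            ([record (in-port read in)])
    (handle-record record)
    (add1 count)))

;; Usage: an in-memory port with three made-up records.
(define sample
  (open-input-string "(read 0 4096) (write 4096 512) (read 8192 64)"))

(define total (count-records sample void))
```

For a real trace you would presumably wrap the call in `call-with-input-file` 
so that the port is closed once the loop finishes.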

