ekmett:
> Hi Anakim,
>  
> Nice to see someone else working in this space.
>  
> I have also been working on a set of parallel parsing techniques, which can
> use small Parsec parsers for local context sensitivity.
>  
> See the second set of slides in
> http://comonad.com/reader/2009/iteratees-parsec-and-monoid/ for an overview
> of how I'm doing something similar to feed Parsec independent chunks. Note
> that this approach bypasses the need for a separate sequential scan, which
> otherwise floods your cache, and lets you get closer to the performance
> limit imposed by Amdahl's law.
>  
> The code in the second set of slides can be adapted to your case: load
> everything into a lazy bytestring or a fingertree of strict bytestrings,
> then, for each strict bytestring chunk in parallel, scan it for the first
> newline and start an iteratee-based Parsec parser from that point. I use the
> iteratee-based Parsec parsers so that when I glue the partial parses
> together, I can feed the unparsed data on the left side of the first newline
> in each chunk to the parser I'm joining on the left. I provide a monoid for
> the purpose of gluing together these partial parses, which encapsulates this
> behavior.
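The gluing monoid Edward describes can be sketched in simplified form, with a
plain newline split standing in for the iteratee-based Parsec parsers (all
names below are mine, not from his slides): each chunk parses to an unparsed
left edge, complete middle lines, and an unterminated right edge, and the
monoid joins the seams.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.ByteString.Char8 as B

-- A chunk's partial parse: either no newline was seen, or we have the
-- unparsed left edge (the tail of the previous chunk's last line), the
-- complete lines in the middle, and an unterminated right edge.
data Partial
  = Unbroken B.ByteString
  | Broken B.ByteString [B.ByteString] B.ByteString
  deriving (Eq, Show)

-- Gluing: the right edge of the left parse and the left edge of the
-- right parse join into one complete line at the seam.
instance Semigroup Partial where
  Unbroken a      <> Unbroken b       = Unbroken (a <> b)
  Unbroken a      <> Broken l m r     = Broken (a <> l) m r
  Broken l m r    <> Unbroken b       = Broken l m (r <> b)
  Broken l1 m1 r1 <> Broken l2 m2 r2  = Broken l1 (m1 ++ [r1 <> l2] ++ m2) r2

instance Monoid Partial where
  mempty = Unbroken B.empty

-- Parse one chunk independently; a real iteratee-based Parsec parser
-- would run here instead of a newline split.
parseChunk :: B.ByteString -> Partial
parseChunk bs = case B.split '\n' bs of
  []       -> mempty
  [x]      -> Unbroken x
  (x:rest) -> Broken x (init rest) (last rest)

main :: IO ()
main = do
  let whole  = "one\ntwo\nthree\nfour"
      chunks = ["one\ntw", "o\nthree", "\nfour"]
  -- Parsing the chunks independently and gluing the results agrees
  -- with a sequential parse of the whole input.
  print (foldMap parseChunk chunks == parseChunk whole)  -- True
```

Because `(<>)` is associative, the per-chunk parses can run in any order or in
parallel and be reduced in a tree, which is where the speedup comes from.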

You can get quite a long way by using bytestring-mmap and strict
bytestrings. The first ensures your IO overhead will be low, and the
second ensures you don't migrate work needlessly.
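A minimal sketch of that combination, assuming the bytestring-mmap and
parallel packages (`unsafeMMapFile` and `parMap` are their entry points; the
1 MB chunk size and the newline counting are placeholder choices of mine, not
a prescribed design):

```haskell
import qualified Data.ByteString.Char8 as B
import System.IO.Posix.MMap (unsafeMMapFile)           -- bytestring-mmap
import Control.Parallel.Strategies (parMap, rdeepseq)  -- parallel

-- Cut a strict bytestring into fixed-size chunks without copying:
-- splitAt on a strict bytestring only adjusts offsets into the
-- shared underlying buffer.
chunksOf :: Int -> B.ByteString -> [B.ByteString]
chunksOf n bs
  | B.null bs = []
  | otherwise = let (h, t) = B.splitAt n bs in h : chunksOf n t

-- mmap the whole file, then run a cheap per-chunk pass in parallel.
-- Counting newlines stands in for real per-chunk work.
parallelLineCount :: FilePath -> IO Int
parallelLineCount path = do
  bs <- unsafeMMapFile path
  return . sum $ parMap rdeepseq (B.count '\n') (chunksOf (1024 * 1024) bs)
```

Since the chunks are slices of one mmapped buffer, the sparks only migrate
offsets and lengths between capabilities, not the data itself.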
_______________________________________________
Haskell-Cafe mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/haskell-cafe
