akborder:
>
> The threaded version running on 2 cores is moderately faster than the
> serial one:
>
> $ ./Parser +RTS -s -N2
>    2,377,165,256 bytes allocated in the heap
>       36,320,944 bytes copied during GC
>        6,020,720 bytes maximum residency (6 sample(s))
>        6,933,928 bytes maximum slop
>               21 MB total memory in use (0 MB lost due to fragmentation)
>
>   Generation 0:  2410 collections,     0 parallel,  0.33s,  0.34s elapsed
>   Generation 1:     6 collections,     4 parallel,  0.06s,  0.05s elapsed
>
>   Parallel GC work balance: 1.83 (2314641 / 1265968, ideal 2)
>
>   Task  0 (worker) :  MUT time:   2.43s  (  1.19s elapsed)
>                       GC  time:   0.02s  (  0.02s elapsed)
>
>   Task  1 (worker) :  MUT time:   2.15s  (  1.19s elapsed)
>                       GC  time:   0.29s  (  0.30s elapsed)
>
>   Task  2 (worker) :  MUT time:   2.37s  (  1.19s elapsed)
>                       GC  time:   0.07s  (  0.08s elapsed)
>
>   Task  3 (worker) :  MUT time:   2.45s  (  1.19s elapsed)
>                       GC  time:   0.00s  (  0.00s elapsed)
>
>   INIT  time    0.00s  (  0.00s elapsed)
>   MUT   time    2.06s  (  1.19s elapsed)
>   GC    time    0.39s  (  0.39s elapsed)
>   EXIT  time    0.00s  (  0.00s elapsed)
>   Total time    2.45s  (  1.58s elapsed)
>
>   %GC time      15.7%  (24.9% elapsed)
>
>   Alloc rate    1,151,990,234 bytes per MUT second
>
>   Productivity  84.2% of total user, 130.2% of total elapsed
>
>
> The speedup is smaller than what I was expecting given that each unit
> of work (250 input lines) is completely independent from the others.
> Changing the size of each work unit did not help; garbage collection
> times are small enough that increasing the minimum heap size did not
> produce any speedup either.
>
> Is there anything else I can do to understand why the parallel map
> does not provide a significant speedup?

Very interesting idea! I think the big thing would be to measure it with
GHC HEAD so you can see how effectively the sparks are being converted
into threads.

Is there a package and test case somewhere we can try out?

-- Don
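
For anyone who wants something self-contained to experiment with, here is a minimal sketch of the kind of chunked parallel map discussed in this thread. It is not the original Parser program: `parseLine` is a hypothetical stand-in for the real parser, `parMapWHNF` and `chunksOf` are illustrative helpers, and the input is synthetic. It uses only `par` and `pseq` from `GHC.Conc` in base, and it relies on each chunk reducing to a single strict `Int` so that evaluating a spark to weak head normal form does all of the chunk's work.

```haskell
module Main where

import Data.List (foldl')
import GHC.Conc (numCapabilities, par, pseq)

-- Split a list into chunks of n elements (assumes n > 0).
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Hypothetical stand-in for the real parser: count the words on a line.
parseLine :: String -> Int
parseLine = length . words

-- Process one work unit. The strict left fold means that forcing the
-- result to weak head normal form parses every line in the chunk.
parseChunk :: [String] -> Int
parseChunk = foldl' (\acc l -> acc + parseLine l) 0

-- Parallel map: spark the evaluation of each element to WHNF, walk the
-- rest of the list to create the remaining sparks, then return the list.
parMapWHNF :: (a -> b) -> [a] -> [b]
parMapWHNF _ [] = []
parMapWHNF f (x : xs) = y `par` (ys `pseq` (y : ys))
  where
    y = f x
    ys = parMapWHNF f xs

main :: IO ()
main = do
  -- Synthetic input: 1000 lines, split into work units of 250 lines.
  let input = ["line " ++ show i ++ " of input" | i <- [1 .. 1000 :: Int]]
      results = parMapWHNF parseChunk (chunksOf 250 input)
  putStrLn ("capabilities: " ++ show numCapabilities)
  print (sum results)
```

Built with `ghc -O2 -threaded -rtsopts` and run with `+RTS -N2 -s`, this gives the same style of statistics as quoted above; more recent GHC releases also print a SPARKS line that shows how many sparks were actually converted. If the real per-chunk result is a lazy structure rather than a strict number, sparking to WHNF would leave most of the work to the main thread, so the result would need to be forced more deeply inside the spark.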
