Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-05 Thread Manlio Perillo
Don Stewart wrote: manlio_perillo: Hi. After some work I have managed to implement two simple programs that parse the Netflix Prize data set. For details about the Netflix Prize, there was a post by Kenneth Hoste some time ago. I have cabalized the program, and made it available here:

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-04 Thread Don Stewart
manlio_perillo: > Hi. > > After some work I have managed to implement two simple programs that > parse the Netflix Prize data set. > > For details about the Netflix Prize, there was a post by Kenneth Hoste > some time ago. > > I have cabalized the program, and made it available here: > http://hask

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Manlio Perillo
Manlio Perillo wrote: Manlio Perillo wrote: [...] I have executed the program, using the same RTS flags as yours: real 6m13.523s, user 0m53.931s, sys 0m7.812s, 815 MB usage. This is a huge improvement! Using UArray and empty + insert: real 5m40.732s user 0m5

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Manlio Perillo
Manlio Perillo wrote: [...] I have executed the program, using the same RTS flags as yours: real 6m13.523s, user 0m53.931s, sys 0m7.812s, 815 MB usage. This is a huge improvement! Now I have to check if using insert will further improve memory usage. And ... surprise!

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Manlio Perillo
Kenneth Hoste wrote: [...] The 26m/700MB I mentioned on my blog was on my ancient PowerBook G4 (1.5GHz PowerPC G4, 1.25G). I redid the same experiment on our iMac (Core2 Duo, 2.0 GHz, 3.0G), i.e.: - read in all the data - count the number of keys in the IntMap (which should be 17,770, i.e

Re[2]: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Bulat Ziganshin
Hello Austin, Monday, March 2, 2009, 11:51:52 PM, you wrote: >> let's calculate. if at GC moment your program has allocated 100 mb of >> memory and only 50 mb was not a garbage, then memory usage will be 150 >> mb > ? A copying collector allocates a piece of memory (say 10mb) which is > used as

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Austin Seipp
Excerpts from Bulat Ziganshin's message of Mon Mar 02 10:14:35 -0600 2009: > let's calculate. if at GC moment your program has allocated 100 mb of > memory and only 50 mb was not a garbage, then memory usage will be 150 > mb ? A copying collector allocates a piece of memory (say 10mb) which is use
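The arithmetic in the quoted message can be written out as a small sketch. It only restates the quoted figures (100 MB allocated, 50 MB live) under the assumption that a copying collection needs room for a copy of the live data while the original heap still exists; the reply above goes on to question this, so treat it as an illustration of the claim, not as a description of what GHC actually does. The function name is hypothetical.

```haskell
-- Sizes are in MB. Assumption (from the quoted message): during a copying
-- collection the old heap and the copy of the live data coexist.
peakDuringCopyingGC :: Int -> Int -> Int
peakDuringCopyingGC allocated live = allocated + live

main :: IO ()
main = print (peakDuringCopyingGC 100 50)  -- prints 150
```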

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Andy Georges
Hi Kenneth, I've thrown my current code online at http://boegel.kejo.be/files/Netflix_read-and-parse_24-02-2009.hs , let me know if it's helpful in any way... Maybe you could set up a darcs repo for this, such that we can submit patches against your code? -- Andy

Re[2]: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Bulat Ziganshin
Hello Kenneth, Monday, March 2, 2009, 11:14:27 PM, you wrote: > I think my approach is turning out better because I'm: > - building up the IntMap using 'empty' and 'insert', instead of > combining 17,770 'singleton' IntMaps > (which probably results in better GC behavior) i don't read into de
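For readers unfamiliar with the two construction styles being compared, here is a minimal sketch. It is not the code from either program in the thread; the sample data and the names viaInsert and viaSingletons are made up for illustration. Both functions produce the same map when the keys are distinct; the thread's point is only that they may behave differently with respect to allocation and GC.

```haskell
import qualified Data.IntMap as IntMap
import Data.List (foldl')

-- Hypothetical sample data: (movie id, number of ratings).
sample :: [(Int, Int)]
sample = [(1, 547), (2, 145), (3, 2012)]

-- Style A: start from 'empty' and 'insert' one entry at a time.
viaInsert :: [(Int, a)] -> IntMap.IntMap a
viaInsert = foldl' (\m (k, v) -> IntMap.insert k v m) IntMap.empty

-- Style B: build one 'singleton' map per entry, then combine them.
-- (With duplicate keys the two styles differ: 'insert' keeps the last
-- value, 'unions' keeps the first. The sample keys are distinct.)
viaSingletons :: [(Int, a)] -> IntMap.IntMap a
viaSingletons = IntMap.unions . map (uncurry IntMap.singleton)

main :: IO ()
main = do
  print (IntMap.size (viaInsert sample))            -- prints 3
  print (viaInsert sample == viaSingletons sample)  -- prints True
```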

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Kenneth Hoste
On Mar 2, 2009, at 19:13 , Manlio Perillo wrote: Manlio Perillo wrote: [...] > moreover, you may set up a "growing factor". with a g.f. of 1.5, for example, memory will be collected once the heap becomes 1.5x larger than the real memory usage after the last GC. this effectively guarantees that me

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Manlio Perillo
Manlio Perillo wrote: [...] > moreover, you may set up a "growing factor". with a g.f. of 1.5, for example, memory will be collected once the heap becomes 1.5x larger than the real memory usage after the last GC. this effectively guarantees that memory overhead will never be over this factor Thank
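The rule in the quoted text can be sketched as a one-line calculation. This is only a model of the description above, not of the runtime's internals; the function name and the 400 MB figure are hypothetical. In GHC the factor appears to correspond to the -F RTS option (default 2), though the message itself does not name the flag.

```haskell
-- Heap size (in MB) at which the next collection would be triggered,
-- according to the rule quoted above: factor * live data after the last GC.
nextGcThreshold :: Double -> Double -> Double
nextGcThreshold factor liveAfterGc = factor * liveAfterGc

main :: IO ()
main = print (nextGcThreshold 1.5 400)  -- prints 600.0
```

With a factor of 1.5 and 400 MB live after the last collection, the heap would be allowed to grow to 600 MB before the next collection under this model.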

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Manlio Perillo
Bulat Ziganshin wrote: Hello Manlio, Monday, March 2, 2009, 8:16:10 PM, you wrote: By the way: I have written the first version of the program to parse Netflix training data set in D. I also used ncpu * 1.5 threads, to parse files concurrently. However execution was *really* slow, due

Re[2]: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Bulat Ziganshin
Hello Manlio, Monday, March 2, 2009, 8:16:10 PM, you wrote: > By the way: I have written the first version of the program to parse > Netflix training data set in D. > I also used ncpu * 1.5 threads, to parse files concurrently. > However execution was *really* slow, due to garbage collection. >

Re[2]: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Bulat Ziganshin
Hello Manlio, Monday, March 2, 2009, 8:16:10 PM, you wrote: > 1) With default collection algorithm, I have: > 2) With -c option: > So, nothing changed. you should look into +RTS -s stats. those 409 vs 418 mb are just somewhat random values, since GCs in those 2 invocations are not synchronized

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Manlio Perillo
Bulat Ziganshin wrote: Hello Manlio, Monday, March 2, 2009, 6:30:51 PM, you wrote: The process-data-1 program parses the entire dataset using about 1.4 GB of memory (3x increment). This is strange. The memory required is proportional to the number of ratings. It may be IntMap the culpri

Re: [Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Bulat Ziganshin
Hello Manlio, Monday, March 2, 2009, 6:30:51 PM, you wrote: > The process-data-1 program parses the entire dataset using about 1.4 GB > of memory (3x increment). > This is strange. > The memory required is proportional to the number of ratings. > It may be IntMap the culprit, or the garbage colle

[Haskell-cafe] help optimizing memory usage for a program

2009-03-02 Thread Manlio Perillo
Hi. After some work I have managed to implement two simple programs that parse the Netflix Prize data set. For details about the Netflix Prize, there was a post by Kenneth Hoste some time ago. I have cabalized the program, and made it available here: http://haskell.mperillo.ath.cx/netflix-0.0.