On Sat, Feb 20, 2010 at 6:44 PM, Jonathan Gardner <jgard...@jonathangardner.net> wrote:
> On Sat, Feb 20, 2010 at 5:07 PM, Vincent Davis <vinc...@vincentdavis.net> wrote:
>> Code is below. The files are about 5 MB and 230,000 rows each. When I
>> have 43 files of them and get to the 35th (reading it in), my system
>> gets so slow that it is nearly functionless. I am on a Mac, and Activity
>> Monitor shows that Python is using 2.99 GB of memory (of 4 GB)
>> (Python 2.6, 64-bit). getsizeof() returns 6424 bytes for alldata, so
>> I am not sure what is happening.
>
> With this kind of data set, you should start looking at BDBs or
> PostgreSQL to hold your data. While processing files this large is
> possible, it isn't easy. Your time is better spent letting the DB
> figure out how to arrange your data for you.
>
> --
> Jonathan Gardner
> jgard...@jonathangardner.net

I really do need all of it at once; it is DNA microarray data. Sure,
there are 230,000 rows, but only 4 columns of small numbers. Would it
help to convert them with float()? I need to at some point anyway. I
know numpy has a way to set the type for the whole array, astype() I
think. What I don't get is that getsizeof() shows the dict with all
the data as only 6424 bytes. What is using up all the memory?
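
A rough sketch of what I mean with astype() and of the getsizeof()
numbers; the row contents below are made up, and I am assuming the
values parse cleanly as floats:

import sys
import numpy as np

# sys.getsizeof() counts only the dict object itself (its table of
# pointers), not the row lists and float objects it points to. That is
# why the dict "measures" a few KB while Python is really holding
# gigabytes of small objects.
row = [1.0, 2.0, 3.0, 4.0]
print(sys.getsizeof({}))    # an empty dict is already a few hundred bytes
print(sys.getsizeof(row))   # one 4-element list, roughly 100 bytes...
print(sys.getsizeof(1.0))   # ...plus a couple dozen bytes per float object
# 230,000 rows per file, 43 files: that per-object overhead adds up fast.

# A float32 numpy array stores just the numbers, 4 bytes each:
# 230,000 rows * 4 columns * 4 bytes is about 3.7 MB per file,
# so roughly 160 MB for all 43 files.
data = np.array([row] * 230000)     # defaults to float64
small = data.astype(np.float32)     # astype() converts the whole array
print(data.nbytes)    # about 7.4 MB
print(small.nbytes)   # about 3.7 MB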
Vincent Davis
720-301-3003
vinc...@vincentdavis.net
my blog <http://vincentdavis.net> | LinkedIn <http://www.linkedin.com/in/vincentdavis>
--
http://mail.python.org/mailman/listinfo/python-list