Re: Memory usage per top 10x usage per heapy

Junkshops Tue, 25 Sep 2012 11:11:50 -0700

Can you give an example of how these data structures look after reading only the first 5 lines?

Sure, here you go:


In [38]: mpef._ustore._store

Out[38]: defaultdict(<type 'dict'>, {'Measurement': {'8991c2dc67a49b909918477ee4efd767': <micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0fe90>, '7b38b429230f00fe4731e60419e92346': <micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0fad0>, 'b53531471b261c44d52f651add647544': <micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0f4d0>, '44ea6d949f7c8c8ac3bb4c0bf4943f82': <micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0f910>, '0de96f928dc471b297f8a305e71ae3e1': <micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0f550>}})

In [39]: mpef._ustore._store['Measurement']['b53531471b261c44d52f651add647544'].typeStr

Out[39]: 'Measurement'

In [40]: mpef._ustore._store['Measurement']['b53531471b261c44d52f651add647544'].lineNumber

Out[40]: 5

In [41]: mpef._ustore._idstore

Out[41]: defaultdict(<class 'micropheno.exchangeformat.KBaseID.IDStore'>, {'Measurement': <micropheno.exchangeformat.KBaseID.IDStore object at 0x2f0f950>})


In [43]: mpef._ustore._idstore['Measurement']._SIDstore

Out[43]: defaultdict(<function <lambda> at 0x2ece7d0>, {'emailRemoved': defaultdict(<function <lambda> at 0x2c4caa0>, {'microPhenoShew2011': defaultdict(<type 'dict'>, {0: {'MLR_124572462': '8991c2dc67a49b909918477ee4efd767', 'MLR_124572161': '7b38b429230f00fe4731e60419e92346', 'SMMLR_12551352': 'b53531471b261c44d52f651add647544', 'SMMLR_12551051': '0de96f928dc471b297f8a305e71ae3e1', 'SMMLR_12550750': '44ea6d949f7c8c8ac3bb4c0bf4943f82'}})})})
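
For anyone trying to picture the shape of that last structure: the repr is consistent with three nested defaultdicts (two lambda factories, then a plain dict factory) wrapped around an ordinary dict of source id to md5. The sketch below is only a reconstruction from the output above, not the actual micropheno code; make_sid_store is a made-up name, and what the three key levels mean is not stated here.

```python
from collections import defaultdict


def make_sid_store():
    """Hypothetical reconstruction: three nested levels, then {source id: md5}."""
    # Outer and middle factories are lambdas, the innermost factory is dict,
    # which matches the <lambda>, <lambda>, <type 'dict'> pattern in the repr.
    return defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))


store = make_sid_store()

# Keys copied from the transcript; their meaning is an assumption left open.
store["emailRemoved"]["microPhenoShew2011"][0]["MLR_124572462"] = (
    "8991c2dc67a49b909918477ee4efd767"
)

lookup = store["emailRemoved"]["microPhenoShew2011"][0]["MLR_124572462"]

# Reading a missing key with [] silently creates an empty nested entry,
# so membership tests or .get() are safer for lookups that may miss.
missing_before = "nobody" in store
_ = store["nobody"]
missing_after = "nobody" in store
```

One thing to watch when memory is the concern: indexing a defaultdict with a key that is not there allocates new containers, so read-only lookups are better done with `in` or `.get()`.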


-MrsE

On 9/25/2012 4:33 AM, Oscar Benjamin wrote:

On 25 September 2012 00:58, Junkshops <junksh...@gmail.com> wrote:


    Hi Tim, thanks for the response.


        - check how you're reading the data:  are you iterating over
           the lines a row at a time, or are you using
           .read()/.readlines() to pull in the whole file and then
           operate on that?

    I'm using enumerate() on an iterable input (which in this case is
    the filehandle).


        - check how you're storing them:  are you holding onto more
           than you think you are?

    I've used ipython to look through my data structures (without
    going into ungainly detail, 2 dicts with X numbers of key/value
    pairs, where X = number of lines in the file), and everything
    seems to be working correctly. Like I say, heapy output looks
    reasonable - I don't see anything surprising there. In one dict
    I'm storing an id string (the first token in each line of the file)
    with values as (again, without going into massive detail) the md5
    of the contents of the line. The second dict has the md5 as the
    key and an object with __slots__ set that stores the line number
    of the file and the type of object that line represents.
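
The layout described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the real code: the FileContext class here is a stand-in that only borrows the typeStr and lineNumber attribute names from the transcript, index_lines and the sample ids are invented, and 1-based line numbering and whitespace-separated tokens are guesses.

```python
import hashlib
from io import StringIO


class FileContext:
    """Hypothetical stand-in for the per-line record with __slots__ set."""

    # __slots__ avoids a per-instance __dict__, which keeps each record small.
    __slots__ = ("typeStr", "lineNumber")

    def __init__(self, type_str, line_number):
        self.typeStr = type_str
        self.lineNumber = line_number


def index_lines(lines, type_str="Measurement"):
    """Build the two dicts: id -> md5 of the line, and md5 -> FileContext."""
    id_to_md5 = {}
    md5_to_context = {}
    # enumerate() over the iterable reads one line at a time;
    # starting the count at 1 is an assumption.
    for line_number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        source_id = line.split()[0]  # first token of the line
        digest = hashlib.md5(line.encode("utf-8")).hexdigest()
        id_to_md5[source_id] = digest
        md5_to_context[digest] = FileContext(type_str, line_number)
    return id_to_md5, md5_to_context


# Made-up sample data standing in for the real file handle.
sample = StringIO("MLR_1 0.5 0.7\nMLR_2 0.1 0.2\n")
id_to_md5, md5_to_context = index_lines(sample)
context = md5_to_context[id_to_md5["MLR_2"]]
```

With this shape, each line of the file costs two dict entries, one 32-character hex digest string, one id string, and one small slotted object.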

Can you give an example of how these data structures look after reading only the first 5 lines?


Oscar

-- 
http://mail.python.org/mailman/listinfo/python-list
