Can you give an example of how these data structures look after
reading only the first 5 lines?
Sure, here you go:
In [38]: mpef._ustore._store
Out[38]: defaultdict(<type 'dict'>, {'Measurement':
{'8991c2dc67a49b909918477ee4efd767':
<micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0fe90>,
'7b38b429230f00fe4731e60419e92346':
<micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0fad0>,
'b53531471b261c44d52f651add647544':
<micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0f4d0>,
'44ea6d949f7c8c8ac3bb4c0bf4943f82':
<micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0f910>,
'0de96f928dc471b297f8a305e71ae3e1':
<micropheno.exchangeformat.Exceptions.FileContext object at 0x2f0f550>}})
In [39]: mpef._ustore._store['Measurement']['b53531471b261c44d52f651add647544'].typeStr
Out[39]: 'Measurement'
In [40]: mpef._ustore._store['Measurement']['b53531471b261c44d52f651add647544'].lineNumber
Out[40]: 5
In [41]: mpef._ustore._idstore
Out[41]: defaultdict(<class
'micropheno.exchangeformat.KBaseID.IDStore'>, {'Measurement':
<micropheno.exchangeformat.KBaseID.IDStore object at 0x2f0f950>})
In [43]: mpef._ustore._idstore['Measurement']._SIDstore
Out[43]: defaultdict(<function <lambda> at 0x2ece7d0>, {'emailRemoved':
defaultdict(<function <lambda> at 0x2c4caa0>, {'microPhenoShew2011':
defaultdict(<type 'dict'>, {0: {'MLR_124572462':
'8991c2dc67a49b909918477ee4efd767', 'MLR_124572161':
'7b38b429230f00fe4731e60419e92346', 'SMMLR_12551352':
'b53531471b261c44d52f651add647544', 'SMMLR_12551051':
'0de96f928dc471b297f8a305e71ae3e1', 'SMMLR_12550750':
'44ea6d949f7c8c8ac3bb4c0bf4943f82'}})})})
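If it helps to poke at this without the real package, here is a rough standalone sketch of the same shape. Everything in it is a stand-in rather than the actual micropheno code: the FileContext class, the load() helper and the sample IDs are made up for illustration, the id store is flattened (the real one nests user / source / version before the id), and line numbers are assumed to start at 1.

```python
import hashlib
from collections import defaultdict


class FileContext(object):
    """Hypothetical stand-in: records where a line came from."""
    # __slots__ avoids a per-instance __dict__, which matters
    # when there is one object per line of the file.
    __slots__ = ('typeStr', 'lineNumber')

    def __init__(self, typeStr, lineNumber):
        self.typeStr = typeStr
        self.lineNumber = lineNumber


def load(lines, typeStr='Measurement'):
    """Build the two stores from any iterable of lines (e.g. a filehandle)."""
    # type -> {md5 of line contents -> FileContext}
    store = defaultdict(dict)
    # type -> {id string (first token of the line) -> md5}
    idstore = defaultdict(dict)
    # Assumption: line numbers are counted from 1.
    for lineNumber, line in enumerate(lines, 1):
        line = line.rstrip('\n')
        if not line:
            continue
        digest = hashlib.md5(line.encode('utf-8')).hexdigest()
        store[typeStr][digest] = FileContext(typeStr, lineNumber)
        idstore[typeStr][line.split()[0]] = digest
    return store, idstore


# Made-up sample lines; the first token plays the role of the id string.
sample = ['MLR_1\ta\tb', 'MLR_2\tc\td', 'SMMLR_3\te\tf']
store, idstore = load(sample)
digest = idstore['Measurement']['SMMLR_3']
print(store['Measurement'][digest].lineNumber)
```

Each line costs one md5 hex string, one id string, one slotted object and two dict entries, so that is the per-line overhead to compare against what heapy reports.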
-MrsE
On 9/25/2012 4:33 AM, Oscar Benjamin wrote:
On 25 September 2012 00:58, Junkshops <junksh...@gmail.com> wrote:
Hi Tim, thanks for the response.
- check how you're reading the data: are you iterating over
the lines a row at a time, or are you using
.read()/.readlines() to pull in the whole file and then
operate on that?
I'm using enumerate() on an iterable input (which in this case is
the filehandle).
- check how you're storing them: are you holding onto more
than you think you are?
I've used ipython to look through my data structures (without
going into ungainly detail, 2 dicts with X key/value
pairs, where X = number of lines in the file), and everything
seems to be working correctly. Like I say, heapy output looks
reasonable - I don't see anything surprising there. In one dict
I'm storing an id string (the first token in each line of the file)
with values as (again, without going into massive detail) the md5
of the contents of the line. The second dict has the md5 as the
key and an object with __slots__ set that stores the line number
of the file and the type of object that line represents.
Can you give an example of how these data structures look after
reading only the first 5 lines?
Oscar