On Tue, 30 Nov 2010 18:29:35 +0800, OW Ghim Siong wrote:
>
> Does anyone know why there is such a big difference in memory usage
> between storing the matrix as a list of lists and storing it as a list
> of strings?

That's because any Python object carries a fixed overhead (related to
metadata and allocation). A row stored as a list of field strings pays
that overhead once for the list and again for every field string, while
a row kept as a single string pays it only once.
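
The difference is easy to measure with sys.getsizeof (a minimal sketch
for CPython 2.6+; the sample row here is invented):

import sys

row = "6.5\t7.2\t8.1\t9.3"    # one line kept as a single string
fields = row.split("\t")      # the same line as a list of field strings

# getsizeof() counts only the object itself, so the list's total cost is
# the list object plus each string it references.
as_string = sys.getsizeof(row)
as_list = sys.getsizeof(fields) + sum(sys.getsizeof(f) for f in fields)

print "single string:", as_string, "bytes"
print "list of fields:", as_list, "bytes"

Multiplied by 5.5 million rows, that per-row difference accounts for the
gap the poster is seeing.
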
On 11/30/2010 04:29 AM, OW Ghim Siong wrote:
a = open("bigfile")
matrix = []
while True:
    lines = a.readlines(1)  # size hint in bytes; a hint this small yields roughly one line per call
    if not lines:           # readlines() returns an empty list at end of file
        break
    for line in lines:
        data = line.split("\t")
        if several_conditions_are_satisfied:  # placeholder for the poster's filter
            matrix.append(data)
    print "Number of lines read:", len(lines)
a.close()
OW Ghim Siong wrote:
> I have a big file 1.5GB in size, with about 6 million lines of
> tab-delimited data.

How many fields are there on each line?

> I have to perform some filtration on the data and keep the good data.
> [...]

Hi all,

I have a big file 1.5GB in size, with about 6 million lines of
tab-delimited data. I have to perform some filtration on the data and
keep the good data. After filtration, about 5.5 million rows remain. As
you might have already guessed, I have to read the file in batches, and
I did that using readlines().
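
Since the data is tab-delimited, the stdlib csv module can also do the
splitting; a sketch in Python 2, with keep() as a hypothetical stand-in
for the filter:

import csv

def keep(row):
    # Hypothetical stand-in for the filtering conditions.
    return len(row) > 0

matrix = []
with open("bigfile", "rb") as f:               # csv wants binary mode on Python 2
    for row in csv.reader(f, delimiter="\t"):  # rows arrive already split on tabs
        if keep(row):
            matrix.append(row)
print "Rows kept:", len(matrix)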