Re: MemoryError on reading mbox file

2007-09-13 Thread Christoph Krammer
On 12 Sep., 16:39, Istvan Albert <[EMAIL PROTECTED]> wrote:
> This line reads an entire message into memory as a string. Is it
> possible that you have a huge email in there (hundreds of MB) with
> some attachment encoded as text?

No, the largest single message in the mbox is about 100 KB.

Re: MemoryError on reading mbox file

2007-09-12 Thread Gabriel Genellina
On Wed, 12 Sep 2007 11:39:46 -0300, Istvan Albert <[EMAIL PROTECTED]> wrote:
> On Sep 12, 5:27 am, Christoph Krammer <[EMAIL PROTECTED]>
> wrote:
>
>> string = self._file.read(stop - self._file.tell())
>> MemoryError
>
> This line reads an entire message into memory as a string. Is it
> p

Re: MemoryError on reading mbox file

2007-09-12 Thread Hrvoje Niksic
Christoph Krammer <[EMAIL PROTECTED]> writes:
> I have to convert a huge mbox file (~1.5G) to MySQL.

Have you tried commenting out the MySQL portion of the code? Does the code then manage to finish processing the mailbox?

-- 
http://mail.python.org/mailman/listinfo/python-list
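One way to run the test suggested here is a read-only pass over the mailbox with no database involved. The following is only a sketch for a current Python 3 (not the 2007 code): the function name `read_only_pass` is hypothetical, and `tracemalloc` is used as one possible way to see how much memory the reading step alone needs.

```python
import mailbox
import tracemalloc


def read_only_pass(mbox_path):
    """Read every message without touching a database (hypothetical helper).

    Returns (message_count, peak_bytes), where peak_bytes is the peak
    memory traced by tracemalloc while the mailbox was being read.
    """
    tracemalloc.start()
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        for key in box.iterkeys():
            box.get_bytes(key)  # read the raw message and discard it
            count += 1
        _, peak = tracemalloc.get_traced_memory()
    finally:
        box.close()
        tracemalloc.stop()
    return count, peak
```

If this pass finishes with a modest peak, the reading step is probably not the problem and the database portion is the next thing to look at.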

Re: MemoryError on reading mbox file

2007-09-12 Thread Istvan Albert
On Sep 12, 5:27 am, Christoph Krammer <[EMAIL PROTECTED]> wrote:
> string = self._file.read(stop - self._file.tell())
> MemoryError

This line reads an entire message into memory as a string. Is it possible that you have a huge email in there (hundreds of MB) with some attachment encoded as te
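To check for an oversized message without loading anything large, the file can be scanned line by line. This is a rough sketch in Python 3, assuming every line that starts with "From " begins a new message (the usual mbox convention); the function name `largest_message_size` is made up for illustration.

```python
def largest_message_size(mbox_path):
    """Return the size in bytes of the largest message in an mbox file.

    Assumes each line starting with b"From " opens a new message. The file
    is read one line at a time, so it is never held in memory as a whole.
    The size counted for a message includes its "From " line.
    """
    largest = 0
    current = 0
    with open(mbox_path, "rb") as f:
        for line in f:
            if line.startswith(b"From "):
                # A new message begins: close out the previous one.
                largest = max(largest, current)
                current = 0
            current += len(line)
    return max(largest, current)
```

A result in the hundreds of MB would support the huge-attachment theory; a small result would point elsewhere.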

Re: MemoryError on reading mbox file

2007-09-12 Thread Christoph Krammer
On 12 Sep., 12:20, David <[EMAIL PROTECTED]> wrote:
> It may be that Python's garbage collection isn't keeping up with your app.
>
> You could try periodically forcing it to run. eg:
>
> import gc
> gc.collect()

I tried this, but the problem is not solved. When invoking the garbage collection afte
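For reference, forcing a collection periodically inside a processing loop could look like the sketch below. The helper `process_with_gc` and its `every` parameter are hypothetical. Note that in CPython most objects are freed by reference counting as soon as they become unreachable, so `gc.collect()` only helps when reference cycles are involved.

```python
import gc


def process_with_gc(items, handle, every=1000):
    """Call handle(item) for each item, forcing a GC run every `every` items.

    Hypothetical helper; returns (items_processed, collections_forced).
    """
    processed = 0
    collections = 0
    for item in items:
        handle(item)
        processed += 1
        if processed % every == 0:
            gc.collect()  # only reclaims objects caught in reference cycles
            collections += 1
    return processed, collections
```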

Re: MemoryError on reading mbox file

2007-09-12 Thread David
> My system has 512M RAM and 768M swap, which seems to run out at an
> early stage of this. Is there a way to clean up memory for messages
> already processed?

It may be that Python's garbage collection isn't keeping up with your app.

You could try periodically forcing it to run. eg:

import gc

MemoryError on reading mbox file

2007-09-12 Thread Christoph Krammer
Hello everybody,

I have to convert a huge mbox file (~1.5G) to MySQL. I tried with the following simple code:

    for m in mailbox.mbox(fileName):
        msg = m.as_string(True)
        hash = md5.new(msg).hexdigest()
        try:
            dbcurs.execute("""INSERT INTO archive (hash, msg) VALUES (%s, %s)""", (hash, ms
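A self-contained variant of this loop for a current Python 3 might look like the sketch below. It is not the original code: `sqlite3` stands in for MySQL so the example runs without a server (its placeholder is `?` rather than `%s`), `hashlib.md5` replaces the old `md5` module, and raw bytes are fetched per key via `get_bytes` instead of building `Message` objects. The function name `archive_mbox` and the table layout are assumptions.

```python
import hashlib
import mailbox
import sqlite3


def archive_mbox(mbox_path, conn):
    """Store each message of an mbox file together with its MD5 hex digest.

    Hypothetical helper. `conn` is a sqlite3 connection used here as a
    stand-in for the MySQL connection in the original post.
    Returns the number of rows actually inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS archive (hash TEXT PRIMARY KEY, msg BLOB)"
    )
    box = mailbox.mbox(mbox_path, create=False)
    stored = 0
    try:
        for key in box.iterkeys():
            raw = box.get_bytes(key)  # one message at a time, as bytes
            digest = hashlib.md5(raw).hexdigest()
            cur = conn.execute(
                "INSERT OR IGNORE INTO archive (hash, msg) VALUES (?, ?)",
                (digest, raw),
            )
            stored += cur.rowcount
        conn.commit()
    finally:
        box.close()
    return stored
```

Usage would be along the lines of `archive_mbox("archive.mbox", sqlite3.connect("archive.db"))`, where both file names are placeholders.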