"Fredrik Lundh" <[EMAIL PROTECTED]> schrieb im Newsbeitrag news:[EMAIL PROTECTED] > Claudio Grondi wrote: > > > What started as a simple test if it is better to load uncompressed data > > directly from the harddisk or > > load compressed data and uncompress it (Windows XP SP 2, Pentium4 3.0 GHz > > system with 3 GByte RAM) > > seems to show that none of the in Python available compression libraries > > really works for large sized > > (i.e. 500 MByte) strings. > > > > Test the provided code and see yourself. > > > > At least on my system: > > zlib fails to decompress raising a memory error > > pylzma fails to decompress running endlessly consuming 99% of CPU time > > bz2 fails to compress running endlessly consuming 99% of CPU time > > > > The same works with a 10 MByte string without any problem. > > > > So what? Is there no compression support for large sized strings in Python? > > you're probably measuring windows' memory managment rather than the com- > pression libraries themselves (Python delegates all memory allocations >256 bytes > to the system). > > I suggest using incremental (streaming) processing instead; from what I can tell, > all three libraries support that. > > </F>
Have solved the problem with bz2 compression the way Fredrik suggested:

import bz2

fObj = file(r'd:\strSize500MBCompressed.bz2', 'wb')
objBZ2Compressor = bz2.BZ2Compressor()
lstCompressBz2 = []
# feed the compressor with 1 MByte slices of the 500 MByte test string
# instead of passing the entire string to bz2.compress() in one call
for indx in range(0, len(strSize500MB), 1048576):
    lowerIndx = indx
    upperIndx = indx + 1048576
    if upperIndx > len(strSize500MB):
        upperIndx = len(strSize500MB)
    lstCompressBz2.append(
        objBZ2Compressor.compress(strSize500MB[lowerIndx:upperIndx]))
#:for
lstCompressBz2.append(objBZ2Compressor.flush())
strSize500MBCompressed = ''.join(lstCompressBz2)
fObj.write(strSize500MBCompressed)
fObj.close()

:-)

so I suppose that the decompression problem can be solved the same way
(a sketch follows at the end of this message), but:

This still doesn't answer for me what the core of the problem was, how
to avoid it, and what memory limits should be considered when working
with large strings.

Is it actually the case that on systems other than Windows 2000/XP the
original code I provided causes no problem? Maybe a good reason to go
for Linux instead of Windows? Do e.g. SuSE or Mandriva Linux also have
a limit on the memory a single Python process can use?

Please let me know about your experience.
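
Here is the decompression sketch I mean (again only a guess at this
point, not tested on the full 500 MByte file; note that
bz2.BZ2Decompressor has no flush() method, one simply stops feeding it
when the file is exhausted):

import bz2

fObj = open(r'd:\strSize500MBCompressed.bz2', 'rb')
objBZ2Decompressor = bz2.BZ2Decompressor()
lstDecompressed = []
while True:
    strChunk = fObj.read(1048576)  # read the compressed file in 1 MByte pieces
    if not strChunk:
        break
    lstDecompressed.append(objBZ2Decompressor.decompress(strChunk))
#:while
fObj.close()
strSize500MB = ''.join(lstDecompressed)

Of course the final ''.join() still has to build the whole 500 MByte
string in one piece, so it may run into the very memory behaviour I am
asking about; writing each decompressed chunk directly to another file
would avoid that completely.

Claudio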