Paul Rubin wrote:
> Claudio Grondi <[EMAIL PROTECTED]> writes:
>
>> Is there a ready to use (free, best Open Source) tool able to sort
>> lines (each line appr. 20 bytes long) of a XXX GByte large text file
>> (i.e. in place) taking full advantage of available memory to speed up
>> the process as much as possible?
>
> Try the standard Unix/Linux sort utility. Use the --buffer-size=SIZE
> to tell it how much memory to use.

I am on Windows and it seems that Windows XP SP2 'sort' can work with the file, but not without a temporary file plus space for the resulting file, so triple the space of the file to be sorted must be provided.

Windows XP 'sort' constantly uses appr. 300 MByte of memory and can't use 100% of the CPU all the time, probably due to I/O operations via USB (25 MByte/s experienced top data transfer speed). I can't tell yet whether it succeeded, as the sorting of the appr. 80 GByte file with fixed-length records of 20 bytes is still in progress (eleven CPU hours / 18 daytime hours so far).

I am not sure whether my own programming would be much faster than the system's own sort in my case (I haven't yet tried setting the amount of memory to use in the options to e.g. 1.5 GByte, as the sort help says it is better not to specify it). My machine is a Pentium 4, 2.8 GHz, with 2.0 GByte RAM.

I would be glad to hear whether the sorting time I am currently experiencing is as expected for this kind of task, or whether there is still much room for improvement.
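Since this is python-list: a hand-rolled approach would essentially be an external merge sort, i.e. sort memory-sized chunks, write each sorted run to a temporary file, then merge the runs. Below is a minimal sketch of that idea for fixed-length binary records; `RECORD_SIZE` and `CHUNK_RECORDS` are my own assumed parameters (tune `CHUNK_RECORDS` to the available RAM), and this is not benchmarked against the system sort.

```python
import heapq
import tempfile

RECORD_SIZE = 20        # fixed-length records, as in the file described above
CHUNK_RECORDS = 1000000 # records sorted in memory per run (tune to your RAM)

def external_sort(src_path, dst_path, record_size=RECORD_SIZE,
                  chunk_records=CHUNK_RECORDS):
    """Sort a file of fixed-length binary records using bounded memory."""
    run_files = []
    with open(src_path, 'rb') as src:
        while True:
            chunk = src.read(record_size * chunk_records)
            if not chunk:
                break
            # Split the chunk into records and sort them in memory.
            records = [chunk[i:i + record_size]
                       for i in range(0, len(chunk), record_size)]
            records.sort()
            # Write the sorted run to an anonymous temporary file.
            run = tempfile.TemporaryFile()
            run.writelines(records)
            run.seek(0)
            run_files.append(run)

    def read_records(f):
        while True:
            rec = f.read(record_size)
            if not rec:
                break
            yield rec

    # heapq.merge keeps only one record per run in memory at a time,
    # so the merge phase needs almost no RAM regardless of file size.
    with open(dst_path, 'wb') as dst:
        dst.writelines(heapq.merge(*map(read_records, run_files)))
    for f in run_files:
        f.close()
```

Note that this still needs temporary disk space roughly equal to the input (for the runs) plus space for the output file, so it doesn't beat the system sort on disk usage, and unbuffered one-record reads during the merge would want an `io.BufferedReader` wrapper in practice.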
Claudio Grondi
--
http://mail.python.org/mailman/listinfo/python-list