Hello,
I am trying to sort a 50GB text file whose individual lines are of the form
int (x 7) double int double (x 2)
i.e. seven integers, one double, one integer, and two doubles (11 fields);
for example,
-16 -2 -14 -5 1 1 0 0.3080808080808057 0 0.1540404040404028 0.3904338415207971
with a single space between fields and each line terminated by a newline
('\n'). The longest line in the file is 107 chars plus the newline (verified by
wc and by my own counter).
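For reference, the longest-line check can be reproduced with something like the
following (assuming GNU wc's --max-line-length extension; in.txt is just the
input file):

  wc -L in.txt    # length of the longest line, not counting the newline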
I have a dual-processor machine: each processor is an Intel Core 2 Duo E6850
(3GHz, 4096kB cache), with 3.8GB of total physical memory, 4GB of swap space,
and two partitions on the hdd with 200GB and 140GB of available space.
I am using sort v. 5.2.1, v. 6.1, and v. 6.9. The first is installed as part of
the RHEL OS; the latter two were compiled from the source at
http://ftp.gnu.org/gnu/coreutils/ with the gcc v. 3.4.6 compiler.
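For completeness, the newer versions were built in the usual way (the version
number and paths below are only illustrative):

  tar xzf coreutils-6.9.tar.gz
  cd coreutils-6.9
  ./configure CC=gcc
  make
  # the freshly built binary is then src/sort, invoked below as ./sort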
When I attempt to sort the file with a command like

  ./sort -S 250M -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n \
         -T /data -T /data2 -o out.sort in.txt

sort rapidly chews up about 40-50% of total physical memory (1.5-1.9GB), at
which point the error message 'sort: memory exhausted' appears. This behaviour
appears to be independent of the value passed via the -S option.
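One workaround I have been considering, in case it is relevant, is to split the
file, sort the pieces separately, and merge them with sort -m; a rough sketch
only (the chunk size and file names are illustrative):

  # split into ~5,000,000-line pieces (well under 1GB each, given the line lengths)
  split -l 5000000 in.txt chunk.
  # sort each piece on the same keys
  for f in chunk.*; do
    ./sort -S 250M -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n \
           -T /data -T /data2 -o "$f.sorted" "$f"
  done
  # merge the already-sorted pieces
  ./sort -m -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n \
         -o out.sort chunk.*.sorted

But I would prefer to understand why a single invocation fails.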
Is this an idiosyncratic problem? I have read backlogs of the list, and people
report sorting 100GB files. Do you have any ideas?
Leo Butler