I have pasted in the requested information below.

Leo

On Fri, 25 Jan 2008, Bob Proulx wrote:
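For reference, an extraction like that can be reproduced with something
like the following (a sketch, not my exact command; in.txt stands for
the actual 50GB file), where od -c makes any unprintable characters
visible:

  # bytes 2 through 1000: skip the first byte, then read 999 bytes
  dd if=in.txt bs=1 skip=1 count=999 2>/dev/null | od -c | head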
> Leo Butler wrote:
> > -16 -2 -14 -5 1 1 0 0.3080808080808057 0 0.1540404040404028 0.3904338415207971
>
> That should be fine.
>
> > I have a dual-processor machine, with each processor being an Intel
> > Core 2 Duo E6850, rated at 3GHz with a 4096 kB cache, with 3.8GB of
> > total physical memory, 4GB of swap space, and two partitions on the
> > hdd with 200GB and 140GB of available space.
>
> Sounds like a very nice machine.

Yes, very. It pays to be nice to the sysadmin ;-).

> > I am using sort v. 5.2.1 and v. 6.1 & v. 6.9. The former is
> > installed as part of the RHEL OS and the latter two were compiled
> > from the source at http://ftp.gnu.org/gnu/coreutils/ with the
> > gcc v. 3.4.6 compiler.
>
> All good so far. To nail down two more details, could you provide the
> output of these commands?
>
>   uname -a

$ uname -a
Linux erdelyi.maths.ed.ac.uk 2.6.9-67.0.1.ELsmp #1 SMP Fri Nov 30 11:51:05 EST 2007 i686 i686 i386 GNU/Linux

>   ldd --version | head -n1

$ ldd --version | head -n1
ldd (GNU libc) 2.3.4

>   file /usr/bin/sort ./sort

$ file /bin/sort
/bin/sort: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), stripped
$ file ~/c/coreutils/coreutils-6.1/src/sort
/home/lbutler/c/coreutils/coreutils-6.1/src/sort: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped
$ file ~/c/coreutils/coreutils-6.9/src/sort
/home/lbutler/c/coreutils/coreutils-6.9/src/sort: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped

> That will give us the kernel and libc versions. The last command will
> report whether the binary programs are 32-bit or 64-bit.
>
> > When I attempt to sort the file, with a command like
> >
> >   ./sort -S 250M -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n -T /data -T /data2 -o out.sort in.txt
> >
> > sort rapidly chews up about 40-50% of total physical memory
> > (= 1.5-1.9GB), at which point the error message 'sort: memory
> > exhausted' appears. This appears to be independent of the parameter
> > passed through the -S option.
> > ...
> > Is this an idiosyncratic problem?
>
> That is very strange. If by idiosyncratic you mean, is this particular
> to your system? Probably, because I have routinely sorted large files
> without problem. But that doesn't mean it isn't a bug.

I don't know if this is relevant, but I have extracted the 2nd through
1000th characters of the 50GB file, and there appears to be garbage
(unprintable characters) in the first line. The remainder of the
extract looks fine. Moreover, I split the file into 500MB chunks,
sorted these, and then merge-sorted the pairs. It appears that the
500MB chunks produced by split have been stripped of '\n' and are
garbage, as are the sorted files. I can email a sample if need be.
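For what it is worth, the manual split-and-sort I described above looks
roughly like this (a sketch, not my exact commands, and merging all
chunks in one pass rather than pairwise; split -C, unlike -b, avoids
breaking a line across two chunks):

  # first pass: split into line-aligned ~500MB chunks, sort each one
  split -C 500m in.txt chunk.
  for f in chunk.*; do
      LC_ALL=C sort -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n -o "$f.sorted" "$f"
  done
  # second pass: merge the already-sorted chunks using the same keys
  LC_ALL=C sort -m -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n -o out.sort chunk.*.sorted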
> At 50G the data file is very large compared to your 4G of physical
> memory. This means that sort cannot sort it in memory. It will open
> temporary files and, as a first pass, sort one large chunk after
> another, splitting the input file into many sorted chunks. As a
> second pass it will merge-sort the sorted chunks together into the
> output file.

Yes, I have successfully sorted a 7GB file of a similar type on an
older machine. I noticed that sort was employing several clever tricks.

> What is the output of this command on your system?
>
>   sysctl vm.overcommit_memory

$ /sbin/sysctl vm.overcommit_memory
vm.overcommit_memory = 0

> I am asking because, by default, the linux kernel overcommits memory
> and does not return out-of-memory conditions. Instead the process (or
> some other one) is killed by the linux out-of-memory killer. But
> enterprise systems will be configured with overcommit disabled for
> reliability reasons, and that appears to be how your system is
> configured, because otherwise you wouldn't see a message about being
> out of memory from sort. (I always disable overcommit so as to avoid
> the out-of-memory killer.)
>
> Do you have user process limits active? What is the output of this
> command?
>
>   ulimit -a

$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
pending signals                 (-i) 1024
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 75776
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

> What does free say on your system?
>
>   free

$ free
             total       used       free     shared    buffers     cached
Mem:       4008144    3124208     883936          0     140216    2433740
-/+ buffers/cache:     550252    3457892
Swap:      4192956     101976    4090980

> > I have read backlogs of the list and people report sort-ing 100GB
> > files. Do you have any ideas?
>
> Without doing a lot of debugging, I am wondering if your choice of
> locale setting is affecting this. I doubt it, because all of the sort
> fields are numeric. But since it is easy enough to check, could you
> try sorting using LC_ALL=C and see if that makes a difference?
>
>   LC_ALL=C sort -k 6,6n -k 7,7n -k 8,8n -k 9,9n -k 10,10n -k 11,11n -T /data -T /data2 -o out.sort in.txt

There is no change in behaviour.

> Also, could you determine how large the process is at the time that
> sort reports running out of memory? I am wondering if it is at a
> magic-number size such as 2G or 4G, which could provide more insight
> into the problem.

There isn't a magic number, but I have noticed that between 40-50% of
memory (with a low of 38%) is allocated to sort when it runs out of
memory.

Thanks,
Leo
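P.S. For the record, overcommit can be disabled, as described above,
with

  /sbin/sysctl -w vm.overcommit_memory=2

(run as root; with this setting vm.overcommit_ratio controls how much
memory may be committed), and the size of a running sort can be watched
with a loop along these lines (a rough sketch, assuming a single sort
process):

  # print sort's virtual and resident sizes every 5s until it exits
  while pgrep -x sort >/dev/null; do
      ps -C sort -o vsz=,rss=
      sleep 5
  done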