This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky Lin, TaeSung Roh, and Paul Eggert. It adds support for parallelism within an internal sort. On our simple tests on a 2-core desktop x86, overall performance improved by roughly a factor of 1.6. * THANKS: Add Benjamin Nuernberger, Sky Lin, TaeSung Roh. Sort. * TODO: Add note about how to make this algorithm better. * bootstrap.conf: Add nproc, pthread. * doc/coreutils.texi (sort invocation): Document new --threads option. * src/Makefile.am (sort_LDADD): Add $(LIB_PTHREAD). * src/sort.c: Include pthread.h, nproc.h. (SUBTHREAD_LINES_HEURISTIC, THREADS_OPTION): New constants. (sortlines_temp): Remove decl. (usage, long_options, main): New option --threads. (specify_nthreads): New function. (mergelines): New signature, to emphasize the fact that the HI area must be part of the destination. All callers changed. (sortlines): Rewrite to support and use parallelism, with a new signature. All uses changed. Merge in the functionality of sortlines_temp. (struct thread_args): New struct. (sortlines_thread): New function. (sortlines_temp): Remove. (sort): New arg NTHREADS. All uses changed. (main): Disable threading if we are sorting at random. * tests/Makefile.am (TESTS): Add misc/sort-benchmark-random. * tests/misc/sort-benchmark-random: New file. --- THANKS | 23 +++-- TODO | 8 ++ bootstrap.conf | 9 ++- doc/coreutils.texi | 8 ++ src/Makefile.am | 2 +- src/sort.c | 220 +++++++++++++++++++++++++++----------- tests/Makefile.am | 1 + tests/misc/sort-benchmark-random | 67 ++++++++++++ 8 files changed, 263 insertions(+), 75 deletions(-) create mode 100644 tests/misc/sort-benchmark-random
diff --git a/THANKS b/THANKS index 665a9ef..e37fdb3 100644 --- a/THANKS +++ b/THANKS @@ -24,12 +24,11 @@ aldomel aldo...@ix.netcom.com Alen Muzinic zv...@fly.cc.fer.hr Alexander Nguyen v...@seas.ucla.edu Alexander V. Lukyanov l...@netis.ru -Allen Hewes al...@decisiv.net -Axel Dörfler ax...@pinc-software.de Alexandre Duret-Lutz dure...@epita.fr Alexey Solovyov ale...@math.uu.se Alexey Vyskubov ale...@pippuri.mawhrin.net Alfred M. Szmidt a...@kemisten.nu +Allen Hewes al...@decisiv.net Andi Kleen frei...@alancoxonachip.com Andre Novaes Cunha andre.cu...@br.global-one.net Andreas Frische andreasfris...@gmail.com @@ -61,13 +60,15 @@ Arvind Autar autar...@planet.nl Augey Mikus mi...@dqc.org Aurelien Jarno aure...@debian.org Austin Donnelly austin.donne...@cl.cam.ac.uk +Axel Dörfler ax...@pinc-software.de Axel Kittenberger ans...@gmx.net Barry Kelly http://barrkel.blogspot.com/ Bauke Jan Douma bjdo...@xs4all.nl Ben Elliston b...@air.net.au Ben Harris bj...@netbsd.org -Benjamin Cutler cutle...@simla.colostate.edu Bengt Martensson be...@mathematik.uni-bremen.de +Benjamin Cutler cutle...@simla.colostate.edu +Benjamin Nuernberger benjamin...@gmail.com Bernard Giroud bernard.gir...@creditlyonnais.ch Bernd Eckenfels e...@debian.org Bernd Leibing bernd.leib...@rz.uni-ulm.de @@ -169,12 +170,12 @@ Elbert Pol elbert....@gmail.com Eli Zaretskii e...@is.elta.co.il Elias Pipping pipp...@gentoo.org Emile LeBlanc lebl...@math.toronto.edu -Erik Auerswald auers...@unix-ag.uni-kl.de Eric Backus er...@lsid.hp.com Eric Blake e...@byu.net Eric G. Miller e...@jps.net Eric Pemente peme...@northpark.edu Eric S. Raymond e...@snark.thyrsus.com +Erik Auerswald auers...@unix-ag.uni-kl.de Erik Bennett benn...@cvo.oneworld.com Erik Corry e...@kroete2.freinet.de Evan Hunt etha...@armory.com @@ -207,7 +208,6 @@ Gerhard Poul gp...@gnu.org Germano Leichsenring germ...@jedi.cs.kobe-u.ac.jp Glen Lenker glen.len...@gmail.com Göran Uddeborg goe...@uddeborg.se -Guochun Shi g...@ncsa.uiuc.edu GOTO Masanori go...@debian.or.jp Greg Louis glo...@dynamicro.on.ca Greg McGary g...@gnu.org @@ -218,6 +218,7 @@ Greg Wooledge gawoole...@sherwin.com Gregory Leblanc glebl...@cu-portland.edu Guido Leenders guido.leend...@invantive.com Guntram Blohm extern.guntram.bl...@audi.de +Guochun Shi g...@ncsa.uiuc.edu H. J. Lu h...@valinux.com Hans Ginzel h...@matfyz.cz Hans Lermen ler...@fgan.de @@ -232,8 +233,8 @@ Herbert Xu herb...@gondor.apana.org.au Holger Berger hber...@ess.nec.de Hon-Yin Kok h...@yoda.unl.edu Hugh Daniel h...@xanadu.com -Ian Bruce ian.br...@myrealbox.com Iain Calder i...@rogers.com +Ian Bruce ian.br...@myrealbox.com Ian Jackson ijack...@chiark.greenend.org.uk Ian Lance Taylor i...@cygnus.com Ian Turner vec...@pipeline.com @@ -243,8 +244,8 @@ Ingo Saitz i...@debian.org Ivo Timmermans i...@debian.org James ja...@albion.glarp.com James Antill jmanti%essex.ac...@seralph21.essex.ac.uk -James Lemley james.lem...@acxiom.com James Hunt jamesodh...@hotmail.com +James Lemley james.lem...@acxiom.com James Ralston rals...@pobox.com James Sneeringer j...@ocslink.com James Tanis j...@soscorp.com @@ -318,8 +319,8 @@ Keith Thompson k...@cts.com Ken Pizzini k...@halcyon.com Kevin Mudrick kmudr...@healthmarketscience.com Kirk Kelsey kirk.kel...@0x4b.net -Kristin E Thomas krist...@us.ibm.com Kjetil Torgrim Homme kjeti...@ifi.uio.no +Kristin E Thomas krist...@us.ibm.com Kristoffer Rose k...@diku.dk Larry McVoy l...@sgi.com Lars Hecking lheck...@nmrc.ucc.ie @@ -408,8 +409,8 @@ Michiel Bacchiani bacch...@raven.bu.edu Mikael Magnusson mika...@gmail.com Mike Castle dalg...@ix.netcom.com Mike Coleman m...@mathdogs.com -Mike Jetzer mjet...@mke.catalystwms.com Mike Frysinger vap...@gentoo.org +Mike Jetzer mjet...@mke.catalystwms.com Mikko Tuumanen m...@sorvankyla.yok.utu.fi Mikulas Patocka miku...@artax.karlin.mff.cuni.cz Miles Bader mi...@gnu.ai.mit.edu @@ -510,6 +511,7 @@ Savochkin Andrey Vladimirovich s...@msu.ru Scott Lurndal sl...@griffin.engr.sgi.com Sébastien Maret sma...@umich.edu Shing-Shong Shei s...@cs.indiana.edu +Sky Lin chip...@gmail.com Soeren Sonnenburg sonnenb...@informatik.hu-berlin.de Solar Designer so...@owl.openwall.com Stanislav Ievlev in...@altlinux.ru @@ -530,9 +532,10 @@ Stuart Shelton stu...@shelton.me Sven Joachim svenj...@gmx.de Szakacsits Szabolcs sz...@sienet.hu Tadayoshi Funaba t...@kt.rim.or.jp +TaeSung Roh abraham....@gmail.com TAKAI Kousuke ta...@vlsi.kuee.kyoto-u.ac.jp -Theodore Ts'o ty...@rsts-11.mit.edu The Wanderer inversepara...@comcast.net +Theodore Ts'o ty...@rsts-11.mit.edu Theodoros V. Kalamatianos n...@users.sourceforge.net Thomas Bushnell tho...@gnu.ai.mit.edu Thomas Goerlich tho...@schnappmatik.de diff --git a/TODO b/TODO index 7288285..ee7e850 100644 --- a/TODO +++ b/TODO @@ -102,6 +102,14 @@ sort: Investigate better sorting algorithms; see Knuth vol. 3. 5.3.1, who credits Lester Ford, Jr. and Selmer Johnson, American Mathematical Monthly 66 (1959), 387-389. + The current method of parallelization can be improved. Parent + threads wait for children to finish, which can be a bottleneck. It + might be better, at least when computation nears the root of the + merge tree, for parent threads to accept outputs from their children + one by one, and thus merge while their children are merging. This + will require some inter-thread communication (e.g., via shared + objects that are accessed safely via mutual exclusion). + shred: Update shred as described here to conform to DoD 5220 rules: http://lists.gnu.org/archive/html/bug-coreutils/2007-05/msg00075.html diff --git a/bootstrap.conf b/bootstrap.conf index 0747bb8..7358e10 100644 --- a/bootstrap.conf +++ b/bootstrap.conf @@ -74,12 +74,19 @@ gnulib_modules=" memcasecmp memcmp2 mempcpy memrchr mgetgroups mkancesdirs mkdir mkdir-p mkstemp mktime modechange - mountlist mpsort obstack pathmax perl physmem + mountlist + mpsort + nproc + obstack + pathmax + perl + physmem posix-shell posixtm posixver progname propername + pthread putenv quote quotearg raise readlink areadlink-with-size randint diff --git a/doc/coreutils.texi b/doc/coreutils.texi index 70effa1..a9ad20e 100644 --- a/doc/coreutils.texi +++ b/doc/coreutils.texi @@ -4043,6 +4043,14 @@ have a large sort or merge that is I/O-bound, you can often improve performance by using this option to specify directories on different disks and controllers. +...@item --threa...@var{n} +Use no more than @var{n} threads to improve parallelism and thus +reduce the wall-clock time needed for the sort. The default @var{n} +is the number of processors currently online, rounded down to the +nearest power of two. This default may change in the future. If +...@var{n} is a power of two, increasing it to a value less than 2...@var{n} +does not typically help performance. + @item -u @itemx --unique @opindex -u diff --git a/src/Makefile.am b/src/Makefile.am index 2313ed3..6113e5f 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -118,7 +118,7 @@ tac_LDADD = $(LDADD) $(LIB_CLOCK_GETTIME) vdir_LDADD = $(LDADD) $(LIB_CLOCK_GETTIME) $(LIB_SELINUX) $(LIB_CAP) ## If necessary, add -lm to resolve use of pow in lib/strtod.c. -sort_LDADD = $(LDADD) $(POW_LIB) $(LIB_GETHRXTIME) +sort_LDADD = $(LDADD) $(POW_LIB) $(LIB_GETHRXTIME) $(LIB_PTHREAD) # for get_date and gettime date_LDADD = $(LDADD) $(LIB_CLOCK_GETTIME) diff --git a/src/sort.c b/src/sort.c index ced0f2d..b4f0dfb 100644 --- a/src/sort.c +++ b/src/sort.c @@ -23,6 +23,7 @@ #include <config.h> #include <getopt.h> +#include <pthread.h> #include <sys/types.h> #include <sys/wait.h> #include <signal.h> @@ -32,6 +33,7 @@ #include "filevercmp.h" #include "hash.h" #include "md5.h" +#include "nproc.h" #include "physmem.h" #include "posixver.h" #include "quote.h" @@ -83,6 +85,13 @@ struct rlimit { size_t rlim_cur; }; # define OPEN_MAX 20 #endif +/* Heuristic value for the number of lines for which it is worth + creating a subthread, during an internal merge sort, on a machine + that has processors galore. Currently this number is just a guess. + This value must be at least 4. We don't know of any machine where + this number has any practical effect. */ +enum { SUBTHREAD_LINES_HEURISTIC = 4 }; + #define UCHAR_LIM (UCHAR_MAX + 1) #ifndef DEFAULT_TMPDIR @@ -289,8 +298,6 @@ static char const *compress_program; number are present, temp files will be used. */ static unsigned int nmerge = NMERGE_DEFAULT; -static void sortlines_temp (struct line *, size_t, struct line *); - /* Report MESSAGE for FILE, then clean up and exit. If FILE is null, it represents standard output. */ @@ -380,6 +387,7 @@ Other options:\n\ -t, --field-separator=SEP use SEP instead of non-blank to blank transition\n\ -T, --temporary-directory=DIR use DIR for temporaries, not $TMPDIR or %s;\n\ multiple options specify multiple directories\n\ + --threads=N use no more than N threads to improve parallelism\n\ -u, --unique with -c, check for strict ordering;\n\ without -c, output only the first of an equal run\n\ "), DEFAULT_TMPDIR); @@ -423,7 +431,8 @@ enum FILES0_FROM_OPTION, NMERGE_OPTION, RANDOM_SOURCE_OPTION, - SORT_OPTION + SORT_OPTION, + THREADS_OPTION }; static char const short_options[] = "-bcCdfgik:mMno:rRsS:t:T:uVy:z"; @@ -455,6 +464,7 @@ static struct option const long_options[] = {"temporary-directory", required_argument, NULL, 'T'}, {"unique", no_argument, NULL, 'u'}, {"zero-terminated", no_argument, NULL, 'z'}, + {"threads", required_argument, NULL, THREADS_OPTION}, {GETOPT_HELP_OPTION_DECL}, {GETOPT_VERSION_OPTION_DECL}, {NULL, 0, NULL, 0}, @@ -1253,6 +1263,21 @@ specify_sort_size (int oi, char c, char const *s) xstrtol_fatal (e, oi, c, long_options, s); } +/* Specify the number of threads to spawn during internal sort. */ +static unsigned long int +specify_nthreads (int oi, char c, char const *s) +{ + unsigned long int nthreads; + enum strtol_error e = xstrtoul (s, NULL, 10, &nthreads, ""); + if (e == LONGINT_OVERFLOW) + return ULONG_MAX; + if (e != LONGINT_OK) + xstrtol_fatal (e, oi, c, long_options, s); + if (nthreads == 0) + error (SORT_FAILURE, 0, _("number of threads must be nonzero")); + return nthreads; +} + /* Return the default sort size. */ static size_t default_sort_size (void) @@ -2435,25 +2460,28 @@ mergefiles (struct sortfile *files, size_t ntemps, size_t nfiles, return nopened; } -/* Merge into T the two sorted arrays of lines LO (with NLO members) - and HI (with NHI members). T, LO, and HI point just past their - respective arrays, and the arrays are in reverse order. NLO and - NHI must be positive, and HI - NHI must equal T - (NLO + NHI). */ +/* Merge into T (of size NLINES) the two sorted arrays of lines + LO (with NLINES / 2 members), and + T - (NLINES / 2) (with NLINES - NLINES / 2 members). + T and LO point just past their respective arrays, and the arrays + are in reverse order. NLINES must be at least 2. */ static inline void -mergelines (struct line *t, - struct line const *lo, size_t nlo, - struct line const *hi, size_t nhi) +mergelines (struct line *restrict t, size_t nlines, + struct line const *restrict lo) { + size_t nlo = nlines / 2; + size_t nhi = nlines - nlo; + struct line const *hi = t - nlo; + for (;;) if (compare (lo - 1, hi - 1) <= 0) { *--t = *--lo; if (! --nlo) { - /* HI - NHI equalled T - (NLO + NHI) when this function - began. Therefore HI must equal T now, and there is no - need to copy from HI to T. */ + /* HI must equal T now, and there is no need to copy from + HI to T. */ return; } } @@ -2471,53 +2499,54 @@ mergelines (struct line *t, } } -/* Sort the array LINES with NLINES members, using TEMP for temporary space. - NLINES must be at least 2. - The input and output arrays are in reverse order, and LINES and - TEMP point just past the end of their respective arrays. - Use a recursive divide-and-conquer algorithm, in the style - suggested by Knuth volume 3 (2nd edition), exercise 5.2.4-23. Use - the optimization suggested by exercise 5.2.4-10; this requires room - for only 1.5*N lines, rather than the usual 2*N lines. Knuth - writes that this memory optimization was originally published by - D. A. Bell, Comp J. 1 (1958), 75. */ +static void sortlines (struct line *restrict, size_t, struct line *restrict, + unsigned long int, bool); -static void -sortlines (struct line *lines, size_t nlines, struct line *temp) +/* Thread arguments for sortlines_thread. */ +struct thread_args { - if (nlines == 2) - { - if (0 < compare (&lines[-1], &lines[-2])) - { - struct line tmp = lines[-1]; - lines[-1] = lines[-2]; - lines[-2] = tmp; - } - } - else - { - size_t nlo = nlines / 2; - size_t nhi = nlines - nlo; - struct line *lo = lines; - struct line *hi = lines - nlo; - struct line *sorted_lo = temp; - - sortlines (hi, nhi, temp); - if (1 < nlo) - sortlines_temp (lo, nlo, sorted_lo); - else - sorted_lo[-1] = lo[-1]; + struct line *lines; + size_t nlines; + struct line *temp; + unsigned long int nthreads; + bool to_temp; +}; - mergelines (lines, sorted_lo, nlo, hi, nhi); - } +/* Like sortlines, except with a signature acceptable to pthread_create. */ +static void * +sortlines_thread (void *data) +{ + struct thread_args const *args = data; + sortlines (args->lines, args->nlines, args->temp, args->nthreads, + args->to_temp); + return NULL; } -/* Like sortlines (LINES, NLINES, TEMP), except output into TEMP - rather than sorting in place. */ +/* Sort the array LINES with NLINES members, using TEMP for temporary space, + spawning NTHREADS threads. NLINES must be at least 2. + The input and output arrays are in reverse order, and LINES and + TEMP point just past the end of their respective arrays. + + If TO_TEMP, place the result in TEMP (trashing LINES in the + process); otherwise, place the result back into LINES so that it is + an in-place sort (trashing TEMP in the process). + + Use a recursive divide-and-conquer algorithm, in the style + suggested by Knuth volume 3 (2nd edition), exercise 5.2.4-23. If + multithreaded, this requires that TEMP contain NLINES entries; if + singlethreaded, use the optimization suggested by Knuth exercise + 5.2.4-10, which requires room for only 1.5*N lines, rather than the + usual 2*N lines. Knuth writes that this memory optimization was + originally published by D. A. Bell, Comp J. 1 (1958), 75. + + This function is inline so that its tests for multthreadedness and + inplacedness can be optimized away in common cases. */ static void -sortlines_temp (struct line *lines, size_t nlines, struct line *temp) +sortlines (struct line *restrict lines, size_t nlines, + struct line *restrict temp, + unsigned long int nthreads, bool to_temp) { if (nlines == 2) { @@ -2525,8 +2554,17 @@ sortlines_temp (struct line *lines, size_t nlines, struct line *temp) <http://lists.gnu.org/archive/html/bug-coreutils/2005-10/msg00086.html> in the IBM xlc 6.0.0.0 compiler in 64-bit mode. */ int swap = (0 < compare (&lines[-1], &lines[-2])); - temp[-1] = lines[-1 - swap]; - temp[-2] = lines[-2 + swap]; + if (to_temp) + { + temp[-1] = lines[-1 - swap]; + temp[-2] = lines[-2 + swap]; + } + else if (swap) + { + temp[-1] = lines[-1]; + lines[-1] = lines[-2]; + lines[-2] = temp[-1]; + } } else { @@ -2534,13 +2572,43 @@ sortlines_temp (struct line *lines, size_t nlines, struct line *temp) size_t nhi = nlines - nlo; struct line *lo = lines; struct line *hi = lines - nlo; - struct line *sorted_hi = temp - nlo; - sortlines_temp (hi, nhi, sorted_hi); - if (1 < nlo) - sortlines (lo, nlo, temp); + unsigned long int child_subthreads = nthreads / 2; + unsigned long int my_subthreads = nthreads - child_subthreads; + pthread_t thread; + struct thread_args args = {hi, nhi, temp - nlo, child_subthreads, to_temp}; + + if (child_subthreads != 0 && SUBTHREAD_LINES_HEURISTIC <= nlines + && pthread_create (&thread, NULL, sortlines_thread, &args) == 0) + { + /* Guarantee that nlo and nhi are each at least 2. */ + verify (4 <= SUBTHREAD_LINES_HEURISTIC); + + sortlines (lo, nlo, temp, my_subthreads, !to_temp); + pthread_join (thread, NULL); + } + else + { + sortlines (hi, nhi, temp - (to_temp ? nlo : 0), 1, to_temp); + if (1 < nlo) + sortlines (lo, nlo, temp, 1, !to_temp); + else if (!to_temp) + temp[-1] = lo[-1]; + } - mergelines (temp, lo, nlo, sorted_hi, nhi); + struct line *dest; + struct line const *sorted_lo; + if (to_temp) + { + dest = temp; + sorted_lo = lines; + } + else + { + dest = lines; + sorted_lo = temp; + } + mergelines (dest, nlines, sorted_lo); } } @@ -2744,7 +2812,8 @@ merge (struct sortfile *files, size_t ntemps, size_t nfiles, /* Sort NFILES FILES onto OUTPUT_FILE. */ static void -sort (char * const *files, size_t nfiles, char const *output_file) +sort (char * const *files, size_t nfiles, char const *output_file, + unsigned long int nthreads) { struct buffer buf; size_t ntemps = 0; @@ -2758,8 +2827,11 @@ sort (char * const *files, size_t nfiles, char const *output_file) char const *file = *files; FILE *fp = xfopen (file, "r"); FILE *tfp; + + /* If singlethreaded, the merge uses the memory optimization + suggested in Knuth exercise 5.2.4-10; see sortlines. */ size_t bytes_per_line = (2 * sizeof (struct line) - - sizeof (struct line) / 2); + - (1 < nthreads ? 0 : sizeof (struct line) / 2)); if (! buf.alloc) initbuf (&buf, bytes_per_line, @@ -2787,7 +2859,7 @@ sort (char * const *files, size_t nfiles, char const *output_file) line = buffer_linelim (&buf); linebase = line - buf.nlines; if (1 < buf.nlines) - sortlines (line, buf.nlines, linebase); + sortlines (line, buf.nlines, linebase, nthreads, false); if (buf.eof && !nfiles && !ntemps && !buf.left) { xfclose (fp, file); @@ -3039,6 +3111,7 @@ main (int argc, char **argv) bool mergeonly = false; char *random_source = NULL; bool need_random = false; + unsigned long int nthreads = 0; size_t nfiles = 0; bool posixly_correct = (getenv ("POSIXLY_CORRECT") != NULL); bool obsolete_usage = (posix2_version () < 200112); @@ -3361,6 +3434,10 @@ main (int argc, char **argv) add_temp_dir (optarg); break; + case THREADS_OPTION: + nthreads = specify_nthreads (oi, c, optarg); + break; + case 'u': unique = true; break; @@ -3506,6 +3583,9 @@ main (int argc, char **argv) if (need_random) { + /* Threading does not lock the randread_source structure, so + downgrade to one thread to avoid race conditions. */ + nthreads = 1; randread_source = randread_new (random_source, MD5_DIGEST_SIZE); if (! randread_source) die (_("open failed"), random_source); @@ -3560,7 +3640,21 @@ main (int argc, char **argv) IF_LINT (free (sortfiles)); } else - sort (files, nfiles, outfile); + { + if (!nthreads) + { + /* The default NTHREADS is 2 ** floor (log2 (number of processors)). + If comparisons do not vary in cost and threads are + scheduled evenly, with the current algorithm there is no + performance advantage to using a number of threads that + is not a power of 2. */ + unsigned long int np2 = num_processors () / 2; + for (nthreads = 1; nthreads <= np2; nthreads *= 2) + continue; + } + + sort (files, nfiles, outfile, nthreads); + } if (have_read_stdin && fclose (stdin) == EOF) die (_("close failed"), "-"); diff --git a/tests/Makefile.am b/tests/Makefile.am index 5f150ad..2e2bfec 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -201,6 +201,7 @@ TESTS = \ misc/shred-remove \ misc/shuf \ misc/sort \ + misc/sort-benchmark-random \ misc/sort-compress \ misc/sort-continue \ misc/sort-files0-from \ diff --git a/tests/misc/sort-benchmark-random b/tests/misc/sort-benchmark-random new file mode 100644 index 0000000..2bace4f --- /dev/null +++ b/tests/misc/sort-benchmark-random @@ -0,0 +1,67 @@ +#!/bin/sh +# Benchmark sort on randomly generated data. + +# Copyright (C) 2009 Free Software Foundation, Inc. + +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. + +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. + +# You should have received a copy of the GNU General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. + +# Written by Glen Lenker. + +if test "$VERBOSE" = yes; then + set -x + sort --version +fi + +. $srcdir/test-lib.sh + +very_expensive_ + +# Use the 'C' locale to avoid problems in case Perl's sort isn't stable. +LC_ALL=C +export LC_ALL + +fail=0 + +perl -e ' +my $num_lines = 500000; +my $length = 100; + +for (my $i=0; $i < $num_lines; $i++) +{ + for (my $j=0; $j < $length; $j++) + { + printf "%c", 32 + rand(94); + } + print "\n"; +}' > in || framework_failure + +# We need to generate a lot of data for sort to show a noticeable +# improvement in performance. Sorting it in PERL may take awhile. + +perl -e ' +open (FILE, "<in"); +my @list = <FILE>; +print sort(@list); +close (FILE); +' > exp || framework_failure + +#start=$(date +%s) +time sort in > out || fail=1 +#duration=$(expr $(date +%s) - $start) + +#echo sorting the random data took $duration seconds + +compare out exp || fail=1 + +Exit $fail -- 1.6.2.1 _______________________________________________ Bug-coreutils mailing list Bug-coreutils@gnu.org http://lists.gnu.org/mailman/listinfo/bug-coreutils