Samuel Thibault wrote on Tue 17 Jan 2012 12:15:16 +0100:
> Lars Wirzenius wrote on Tue 17 Jan 2012 10:45:20 +0000:
> > On Tue, Jan 17, 2012 at 10:30:20AM +0100, Samuel Thibault wrote:
> > > Lars Wirzenius wrote on Tue 17 Jan 2012 09:12:58 +0000:
> > > > real  user  system  max RSS  elapsed  cmd
> > > > (s)   (s)   (s)     (KiB)    (s)
> > > > 3.2   2.4   5.8     62784    5.8      hardlink --dry-run files > /dev/null
> > > > 1.1   0.4   1.6     15424    1.6      rdfind files > /dev/null
> > > > 1.9   0.2   2.2      9904    2.2      duff-0.5/src/duff -r files > /dev/null
> > >
> > > And fdupes on the same set of files?
> >
> > real  user  system  max RSS  elapsed  cmd
> > (s)   (s)   (s)     (KiB)    (s)
> > 3.1   2.4   5.5     62784    5.5      hardlink --dry-run files > /dev/null
> > 1.1   0.4   1.6     15392    1.6      rdfind files > /dev/null
> > 1.3   0.9   2.2     13936    2.2      fdupes -r -q files > /dev/null
> > 1.9   0.2   2.1      9904    2.1      duff-0.5/src/duff -r files > /dev/null
> >
> > Someone should run the benchmark on a large set of data, preferably
> > on various kinds of real data, rather than my small synthetic data set.

On my PhD work directory, which holds a mix of content (500 MiB, 18000
files, both big and small files, svn/git checkouts, etc.), with
everything already in cache (no disk I/O):

hardlink -t --dry-run . > /dev/null        1.06s user 0.46s system 99% cpu 1.538 total
rdfind . > /dev/null                       0.68s user 0.19s system 99% cpu 0.877 total
fdupes -q -r . > /dev/null 2> /dev/null    0.80s user 0.90s system 99% cpu 1.708 total
~/src/duff-0.5/src/duff -r . > /dev/null   1.53s user 0.08s system 99% cpu 1.610 total
And with nothing in cache, on an SSD:

hardlink -t --dry-run . > /dev/null        1.86s user 1.23s system 12% cpu 24.260 total
rdfind . > /dev/null                       1.18s user 1.31s system  8% cpu 27.837 total
fdupes -q -r . > /dev/null 2> /dev/null    1.30s user 2.13s system 11% cpu 29.820 total
~/src/duff-0.5/src/duff -r . > /dev/null   1.88s user 0.47s system 16% cpu 13.949 total

(Yes, the user times differ from the warm-cache run, and the
measurements are stable. Also note that I added -t to hardlink,
otherwise it takes file timestamps into account.)

I guess duff gets a clear win because it does not systematically compute
the checksum of files with the same size: for big files, it first reads
and compares only a few bytes.

Samuel

--
Archive: http://lists.debian.org/20120117130245.gn4...@type.bordeaux.inria.fr