[ CC ++ [EMAIL PROTECTED] ]
On Tue, Nov 11, 2008 at 2:58 PM, Andrew McGill <[EMAIL PROTECTED]> wrote:
> What would you expect this to do --:
>
>   find -type f -print0 |
>       xargs -0 -n 8 --max-procs=16 md5sum >& ~/md5sums

Produce a race condition :)  It generates 16 parallel processes, each
writing to the md5sums file.  Unfortunately, sometimes the writes occur
at the same offset in the output file.

To illustrate:

~$ strace -f -e open,fork,execve sh -c "echo hello > foo"
execve("/bin/sh", ["sh", "-c", "echo hello > foo"], [/* 39 vars */]) = 0
[...]
open("foo", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3

~$ strace -f -e open,fork,execve sh -c "echo hello >> foo"
execve("/bin/sh", ["sh", "-c", "echo hello >> foo"], [/* 39 vars */]) = 0
[...]
open("foo", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3

This version should be race-free, since O_APPEND makes each write land
at the current end of the file:

find -type f -print0 |
    xargs -0 -n 8 --max-procs=16 md5sum >> ~/md5sums 2>&1

I think that writing into a pipe should be OK, since pipes are
non-seekable.  However, with pipes in this situation you still have a
problem if processes try to write more than PIPE_BUF bytes.

> Is there a correct way to do md5sums in parallel without having a shared
> output buffer which eats output (I presume) -- or is losing output when
> haphazardly combining output streams actually strange and unusual?

I hope the solution above solved your problem - and please follow up
if so.  This example is probably worthy of being mentioned in the
xargs documentation, too.

Thanks for your comment!

James.

_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
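
A minimal sketch for checking the append-mode behaviour described in the message, not a definitive test. It assumes an xargs that supports -P (the short form of --max-procs), plus seq and mktemp, and a local filesystem (O_APPEND is not reliable on NFS). It substitutes echo for md5sum so that the expected output size is known in advance; the variable names out, lines and words are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch file for the demonstration.
out=$(mktemp)

# 400 arguments, 8 per invocation => 50 echo processes, up to 16 at once.
# The shell opens "$out" once with O_APPEND (because of >>) and every
# child inherits that descriptor, so each one-line write is appended
# at the current end of the file.
seq 1 400 | xargs -n 8 -P 16 echo >> "$out"

# Count what actually arrived. The order of the lines is not
# deterministic, but none of them should be lost or overwritten.
lines=$(wc -l < "$out" | tr -d ' ')
words=$(wc -w < "$out" | tr -d ' ')
echo "lines=$lines words=$words"

rm -f "$out"
```

If no output was lost, the script reports 50 lines and 400 words. Replacing >> with > in the xargs line gives the truncating variant discussed in the message; whether that variant actually loses data depends on the kernel and filesystem, so a clean run there proves nothing.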