Paolo Bonzini wrote:
>>> I was thinking of an additional option that would automatically decrease
>>> -n so that the requested number of processes is started (then of course
>>> the load may not be well balanced).
>>
>> So you mean, rather than the current situation of:
>>
>> $ yes . | head -n13 | xargs -n4 -P2
>> . . . .
>> . . . .
>> . . . .
>> .
>>
>> xargs could try to distribute like:
>>
>> $ yes . | head -n13 | xargs -n4 -P2
>> . . . .
>> . . . .
>> . . .
>> . .
>
> No, more like
>
> seq 1 13 | xargs --parallel -P4
> 1 5 9 13
> 2 6 10
> 3 7 11
> 4 8 12
>
> (Note there's no -n). Same for
>
> seq 1 13 | xargs --parallel
>
> on a 4-core machine. This is _by design_ rearranging files, so it
> requires an option.

Right, you're not auto decreasing -n, but when we read all args and
pass them round robin, the args will be distributed evenly to each
parallel process.

Does this really require a new option though?
When -P is used, the arguments could be processed in any order anyway.
Passing args round robin means each process would get
MAX(max_args, num_args/nproc).

The downside to this is that a bit more latency would be introduced,
as max_args*nproc args would need to be read before starting a process,
rather than just max_args. Also, interleaving arguments like this
might be undesirable for other reasons?
Both of these are minor issues I think.

We could of course reduce max_args to max_args/nproc
to address the minor latency issue. Note that currently `find` sets
a limit of 128KiB of args to each process, which could be
about 2000 files for example:

  $ find /usr/share/ | head -n2000 | wc -c
  131337

If we did a more invasive change we could help latency a lot I think.
We could set O_NONBLOCK on stdin, and on EWOULDBLOCK, share
what we have out to the available processes and then exec.
I.e. auto reduce -n to num_args/nproc when we block.
This would both result in less interleaving of args and
mean xargs would exec the processes without delay.
This would be beneficial even without -P, like in the following
example, where we wouldn't wait for all input before displaying output:

  (seq 10; sleep 3; seq 10) | xargs

cheers,
Pádraig.
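
For illustration, the round-robin distribution discussed above can be
emulated with plain shell and awk. This is only a rough sketch:
`round_robin` is a hypothetical helper, not an xargs option, and it
buffers all of its input before printing anything, so it shows the
resulting grouping rather than the latency behaviour.

```shell
# Hypothetical helper (not part of xargs): deal the lines read from
# stdin round robin into NPROC groups, then print one group per line.
round_robin() {
    awk -v nproc="$1" '
        {
            i = (NR - 1) % nproc                  # group for this arg
            sep = (group[i] == "") ? "" : " "     # no leading space
            group[i] = group[i] sep $0
        }
        END {
            # Skip groups that never received an argument.
            for (i = 0; i < nproc; i++)
                if (group[i] != "") print group[i]
        }
    '
}

seq 1 13 | round_robin 4
```

With 13 arguments and 4 groups this prints the same grouping as the
quoted `xargs --parallel -P4` example: "1 5 9 13", "2 6 10", "3 7 11"
and "4 8 12", one group per line.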