On Tue, Nov 27, 2018 at 07:19:43PM +0100, Bruno Haible wrote:
> Hi,
>
> > The workaround is to split argument list into chunks that operating
> > system can process. "getconf ARG_MAX" is used to determine size of the
> > chunk.
>
> Two questions on this:
>
> 1) People say that 'getconf ARG_MAX' returns the approximate number
>    of bytes in a command line. [1]
>    But you use it with 'xargs -n', which gives a limit on the number of
>    arguments. Shouldn't the patch use 'xargs -s' instead?
>
Hi Bruno,

You're right about "-s", the patch should probably use it. I started
with "-n" and a reasonably low number of arguments (IIRC it was 8000),
then tried to look up whether there is a system limit on the number of
arguments. I haven't found one, so I used ARG_MAX instead.

> 2) The really available values are slightly smaller.
>
> On Linux:
> $ getconf ARG_MAX
> 2097152
> $ LC_ALL=C xargs --show-limits
> Your environment variables take up 4744 bytes
> POSIX upper limit on argument length (this system): 2090360
> POSIX smallest allowable upper limit on argument length (all systems): 4096
> Maximum length of command we could actually use: 2085616
> Size of command buffer we are actually using: 131072
>
> On FreeBSD/x86_64:
> $ getconf ARG_MAX
> 262144
> $ LC_ALL=C xargs --show-limits
> Your environment variables take up 353 bytes
> POSIX upper limit on argument length (this system): 259743
> POSIX smallest allowable upper limit on argument length (all systems): 4096
> Maximum length of command we could actually use: 259390
> Size of command buffer we are actually using: 131072
>
> On macOS:
> $ getconf ARG_MAX
> 262144
> $ LC_ALL=C xargs --show-limits
> Your environment variables take up 1262 bytes
> POSIX upper limit on argument length (this system): 258834
> POSIX smallest allowable upper limit on argument length (all systems): 4096
> Maximum length of command we could actually use: 257572
> Size of command buffer we are actually using: 131072
>
> How about being conservative and dividing the limit by 2, to avoid
> this margin error?
>
> Could it be that your patch works only because xargs uses a command buffer
> of length 131072, regardless of the value you pass to '-n'?
>
> Bruno
>
> [1]
> https://www.cyberciti.biz/faq/linux-unix-arg_max-maximum-length-of-arguments/
>

This is an interesting coincidence. If we pick a value for "-n" that is
big enough to drain the command buffer, the number of arguments ends up
being limited by the default value of the "-s" flag.
And ARG_MAX arguments will always drain the command buffer up to its
limit.

Here are the related excerpts from the macOS xargs man page:

  -n _number_
      Set the maximum number of arguments taken from standard input
      for each invocation of utility. An invocation of utility will
      use less than _number_ standard input arguments if the number
      of bytes accumulated (see the -s option) exceeds the specified
      _size_ or there are fewer than _number_ arguments remaining for
      the last invocation of utility. The current default value for
      _number_ is 5000.

  -s _size_
      Set the maximum number of bytes for the command line length
      provided to utility. The sum of the length of the utility name,
      the arguments passed to utility (including NULL terminators)
      and the current environment will be less than or equal to this
      number. The current default value for _size_ is ARG_MAX - 4096.

And from GNU xargs:

  -n _max-args_, --max-args=_max-args_
      Use at most _max-args_ arguments per command line. Fewer than
      _max-args_ arguments will be used if the size (see the -s
      option) is exceeded, unless the -x option is given, in which
      case xargs will exit.

  -s _max-chars_, --max-chars=_max-chars_
      Use at most _max-chars_ characters per command line, including
      the command and initial-arguments and the terminating nulls at
      the ends of the argument strings. The largest allowed value is
      system-dependent, and is calculated as the argument length
      limit for exec, less the size of your environment, less 2048
      bytes of headroom. If this value is more than 128KiB, 128Kib is
      used as the default value; otherwise, the default value is the
      maximum. 1KiB is 1024 bytes. xargs automatically adapts to
      tighter constraints.

Given that, even if the patch is kept as is, it should work properly on
macOS, FreeBSD (whose docs are very close to the macOS xargs man page)
and all systems with GNU xargs. I can correct the commit message to note
this observation.

Alternatively, we can replace "-n" with "-s", as you pointed out in 1).
But then we will need to correct the calculation of VC_ARG_MAX. We can
take the formula from [2]:

  expr `getconf ARG_MAX` - `env|wc -c` - `env|egrep '^[^ ]+='|wc -l` \* 4 - 2048

But the result is a bit higher than the effective limit of the command
line buffer on macOS/FreeBSD. To fix that we need to replace 2048 with
4096 in the formula (according to the man page above), so I think the
final VC_ARG_MAX should be:

  expr `getconf ARG_MAX` - `env|wc -c` - `env|egrep '^[^ ]+='|wc -l` \* 4 - 4096

[2] https://www.in-ulm.de/~mascheck/various/argmax/

Thank you,
Roman
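
P.S. To make the "-s" alternative concrete, here is a minimal sketch of
how the computed value could be passed to xargs. It is only an
illustration, not the patch itself: vc_arg_max is a placeholder name,
the a/b/c input stands in for the real argument list, and "grep -E" is
used in place of egrep because some newer grep versions print a warning
for egrep.

```shell
#!/bin/sh
# Sketch only: byte budget for "xargs -s", following the formula above.
# vc_arg_max is a placeholder variable name, not taken from the patch.

arg_max=`getconf ARG_MAX`                     # value reported by the system
env_bytes=`env | wc -c`                       # bytes taken by the environment
env_vars=`env | grep -E '^[^ ]+=' | wc -l`    # number of environment variables

# Same terms as the formula: 4 bytes per variable, 4096 bytes of headroom.
# The variables are left unquoted on purpose so that any padding that
# wc adds around the number is dropped by word splitting.
vc_arg_max=`expr $arg_max - $env_bytes - $env_vars \* 4 - 4096`

# Placeholder input; xargs limits each command line to vc_arg_max bytes.
out=`printf '%s\n' a b c | xargs -s "$vc_arg_max" echo`
echo "$out"
```

With an input this small xargs runs echo once. With a long list it would
split the input into several invocations so that each command line stays
within the given number of bytes.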