Hello! I have devised a patch.

Notable findings, details, and concerns regarding the implementation
are provided below.

* Shebangs and length limits

Substituting /two/ commands (i.e. "env" and the interpreter it calls),
in addition to appending other arguments, is likely to exceed 127
characters.  This was previously the limit on Linux (the
BINPRM_BUF_SIZE constant), but it appears to have been raised to 255
characters in version 5.1, according to execve(2) [1]:

--8<---------------cut here---------------start------------->8---
The kernel imposes a maximum length on the text that follows the "#!"
characters at the start of a script; characters beyond the limit are
ignored.  Before Linux 5.1, the limit is 127 characters.  Since Linux
5.1, the limit is 255 characters.
--8<---------------cut here---------------end--------------->8---

The new limit should permit patching "env -S CMD ..." shebangs,
although I've noted the old limit in case backwards compatibility is an
issue.

Tangentially related: fuzzy-searching "shebang length" or "shebang
limit" in the repository turns up some workarounds, hacks, and
references to the old limit.  Perhaps this is worth opening a separate
issue to address?

* Handling binary-not-found cases with a catch-throw

Since these changes add new cases where binaries may not be found in
$PATH, I opted to use an exception handler to catch all of them.
Testing seemed to indicate that the bootstrap process didn't support
the #:unwind? keyword argument for with-exception-handler (possibly
because it runs on an older Guile version?), so I used a catch-throw
instead.

* Testing

I've included a patch to add tests for patch-shebang.

The last two test cases ("patch-shebang: fail with \"env -S CMD\" form
when {CMD,\"env\"} not found") have alternative expectations that I'd
like to hear thoughts about.  The current expectation is that a shebang
of this form is only patched when /both/ "env" and CMD are found, but
we could also patch binaries as long as either is found.  I decided not
to do this since I wasn't sure how to encode the partial success in the
return value (e.g. do we consider "partially successful" as #t or #f?
Or return a third value?), but maybe there are edge cases that could be
fixed by doing this?

My local build of gash-and-dependencies (on glibc-headers-mesboot as of
writing) also appears to be chugging along okay so far, FWIW.

* Footnotes

[1] <https://www.man7.org/linux/man-pages/man2/execve.2.html#VERSIONS>

Cheers,

aurtzy

aurtzy (2):
  utils: Handle "env -S CMD" forms in `patch-shebang'.
  tests: Add tests for `patch-shebang'.

 guix/build/utils.scm  | 84 +++++++++++++++++++++++++++-------------
 tests/build-utils.scm | 90 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 147 insertions(+), 27 deletions(-)


base-commit: c31662f7294b194663bc521358b01c3a7d7e4e27
-- 
2.49.0