Hello!

I have devised a patch.

Notable findings/details/concerns regarding the implementation are
provided below.

* Shebangs and length limits

Substituting /two/ commands (i.e. "env" and the interpreter it calls) in
addition to appending other arguments has a high likelihood to exceed
127 characters.  This was a previous default for Linux (BINPRM_BUF_SIZE
constant), but appears to have been updated to 255 characters in version
5.1, according to execve(2) [1]:

--8<---------------cut here---------------start------------->8---
The kernel imposes a maximum length on the text that follows the
"#!" characters at the start of a script; characters beyond the
limit are ignored.  Before Linux 5.1, the limit is 127 characters.
Since Linux 5.1, the limit is 255 characters.
--8<---------------cut here---------------end--------------->8---

This new limit should permit patching "env -S CMD ..." shebangs,
although I've noted the old limit in case backwards compatibility is an
issue.

Tangentially related: Fuzzy-searching "shebang length" or "shebang
limit" in the repository indicates some workarounds/hacks/references to
the old limit.  Perhaps this is worth opening a separate issue to
address?

* Handling binary-not-found cases with a catch-throw

Since these changes add new cases where binaries may not be found in
$PATH, I opted to use an exception handler to catch all the cases.
Testing seemed to indicate that the bootstrap process didn't support the
#:unwind? keyword argument for with-exception-handler (possibly due to
executing on an older Guile version?), so I used a catch-throw instead.

* Testing

I've included a patch to add tests for patch-shebang.

The last two test cases ("patch-shebang: fail with \"env -S CMD\" form
when {CMD,\"env\"} not found") have alternative expectations that I'd
like to hear thoughts about.  The current expectation is that a shebang
of this form is only patched when /both/ "env" and CMD is found, but we
could also patch binaries as long as either is found.  I decided not to
do this since I wasn't sure how to encode the partial success in the
return value (e.g. do we consider "partially successful" as #t or #f?
Or return a tertiary value?), but maybe there are edge cases that could
be fixed by doing this?

My local build of gash-and-dependencies (on glibc-headers-mesboot as of
writing) also appears to be chugging along okay so far, FWIW.

* Footnotes

[1] <https://www.man7.org/linux/man-pages/man2/execve.2.html#VERSIONS>

Cheers,

aurtzy

aurtzy (2):
  utils: Handle "env -S CMD" forms in `patch-shebang'.
  tests: Add tests for `patch-shebang'.

 guix/build/utils.scm  | 84 +++++++++++++++++++++++++++-------------
 tests/build-utils.scm | 90 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 147 insertions(+), 27 deletions(-)


base-commit: c31662f7294b194663bc521358b01c3a7d7e4e27
-- 
2.49.0




Reply via email to