Re: [PATCH] Add shopts for trailing newlines in command substitution and here strings

Kevin Pulo Sat, 30 Aug 2025 08:14:21 -0700

On Fri, 29 Aug 2025 at 01:00, Koichi Murase <myoga.mur...@gmail.com> wrote:


> On Thu, 28 Aug 2025 at 17:20, Kevin Pulo <kevin.p...@mongodb.com> wrote:
> > Similar arguments can be made about any behavior-modifying option, but they
> > continue to be added to bash (eg. patsub_replacement, localvar_inherit).
>
> Yes, you are right, and in fact, I raised discussions for
> `patsub_replacement' and `localvar_{unset,inherit}' in the past.

Thanks for all this extra context, it's very helpful.

> Nevertheless, I think the addition of the trailing-newline-preserving
> feature through shell options has a larger impact on existing codes
> than `patsub_replacement' and `localvar_inherit'.

Agreed.

> Have you gauged the impact of adding "cmdsubst_trailing_nls" on the existing
> frameworks in a similar way?

No, but I'm happy to do so.

> 2) Next, for `patsub_replacement' and `localvar_inherit', we don't
> need a complicated workaround like « local saved=$(shopt -p
> cmdsubst_trailing_nls); shopt -s cmdsubst_trailing_nls ... eval --
> "$saved" » to make the code work with any sides of the options.

I agree that the difficulty of dealing with it is important, but I still think
the frameworks could do it easily by adding | strip_trailing_nls inside the
comsub.
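A rough sketch of what such a helper might look like, in pure bash rather than sed/awk.  The name strip_trailing_nls and the sample input are only illustrative, and it assumes the stream contains no NUL bytes:

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy stdin to stdout with all trailing newlines removed.
# Assumes the stream contains no NUL bytes.
strip_trailing_nls() {
    local data
    # read returns non-zero at EOF when no NUL delimiter is seen, but data
    # is still filled in; IFS= avoids trimming leading/trailing whitespace.
    IFS= read -rd '' data || true
    # Drop trailing newlines one at a time until none are left.
    while [[ $data == *$'\n' ]]; do
        data=${data%$'\n'}
    done
    printf '%s' "$data"
}

# With the helper inside the comsub, the result should not depend on
# whether the shell itself strips trailing newlines.
out=$(printf 'line1\n\nline2\n\n\n' | strip_trailing_nls)
printf '<%s>\n' "$out"
```

The blank line in the middle of the output survives; only the newlines at the very end are removed.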

> OK, unquoted $()'s (that undergo word splitting) are unaffected, assuming the
> normal IFS.

You're right that I had forgotten about non-default IFS.

> However, the quoted command substitutions (or command substitutions in
> variable assignments) are more common than the unquoted $() and may still be
> affected by the new option.

Agreed.

> > But this also means there could be another approach for frameworks - to
> > ensure that the commands run inside substitutions emit no trailing
> > newlines.  Then the result will be the same regardless of
> > cmdsubst_strip_newlines.
>
> If it is acceptable to change the behavior at the side of the command
> inside $(), isn't it easier to modify the commands that want to
> preserve the trailing newlines so that they output an extra `.',
> allowing to simply call it as « foo=$(cmd); x=${x%.} »?

The answer is no, and ironically the reason is (I think) similar to why « local
saved=$(shopt -p cmdsubst_trailing_nls); shopt -s cmdsubst_trailing_nls ...
eval -- "$saved" » is so painful.

Specifically, because it requires extra code which is _outside_ the comsub.
Changing $(cmd) to « x=$(cmd;echo .); x=${x%.} » is much worse than changing it
to $(cmd | helper) because of the extra x=${x%.} assignment.
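To make the mechanics concrete, here is a minimal sketch of the stripping and of the dot workaround, using a made-up cmd function that ends its output with two blank lines:

```shell
#!/usr/bin/env bash
# Hypothetical command whose output ends in two blank lines.
cmd() { printf 'data\n\n\n'; }

plain=$(cmd)               # all three trailing newlines are stripped
x=$(cmd; echo .)           # the sentinel "." shields the newlines...
x=${x%.}                   # ...but has to be removed in a second statement

printf 'plain: %d chars\n' "${#plain}"   # 4: just "data"
printf 'x:     %d chars\n' "${#x}"       # 7: "data" plus three newlines
```

The second assignment is the part that cannot live inside the comsub.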

For example, consider doing something like:

    cmd "$(foo)$(bar)$(baz)"

when trailing newlines are significant.  This would not even be too bad if $()
instead consumed at most a single trailing newline (rather than all of them),
because then I could at least just do:

    cmd "$(foo)
    $(bar)
    $(baz)"

or

    cmd "$(foo)"$'\n'"$(bar)"$'\n'"$(baz)"$'\n'

and those are reasonably readable and straightforward.

But needing to add and strip the dot makes it all much worse, because now it's
necessary to do something like:

    foo="$(foo; echo .)" foo=${foo%.}
    bar="$(bar; echo .)" bar=${bar%.}
    baz="$(baz; echo .)" baz=${baz%.}
    cmd "$foo$bar$baz"
    unset -v foo bar baz

This is a mess compared to the original.  By contrast, if it were possible to
do:

    cmd "$(foo | preserve)$(bar | preserve)$(baz | preserve)"

(for some hypothetical "preserve" function that somehow preserved trailing
newlines) then that would also be a lot better.

> > This could be done by piping into an ugly sed/awk program, [...] Also note
> > that, AFAIK, the reverse is not possible.  That is, if bash doesn't have a
> > trailing newline preserving command substitution, it is not possible to
> > write a preserve_trailing_nls function that makes $(foo |
> > preserve_trailing_nls) work as intended.
>
> You can define e.g. « preserve_trailing_nls() { REPLY=$(cat;echo .)
> REPLY=${REPLY%.}; } » and turn on `lastpipe`, so ${| foo |
> preserve_trailing_nls; } works as you expect.

This is a good idea, and probably the best workaround so far, although it's not
quite that simple.  lastpipe only takes effect when job control is off (which
is fine for me, but maybe not for others), and I also need pipefail so that I
can check whether foo has failed.  Also, if the comsub has multiple statements,
then it's necessary to use {} to turn them into a compound command, and then
pipe that into preserve_trailing_nls.  In my case I would be ok with
permanently enabling lastpipe and pipefail and disabling job control, but those
who aren't would need to do all of that inside the comsub.  The least ugly in
this case would probably be something like:

    function _ {
        # pipefail and monitor are set -o options, so save them via shopt -o
        _save=$(shopt -p lastpipe; shopt -op pipefail monitor)
        shopt -s lastpipe
        set -o pipefail +o monitor
    }
    function __ {
        REPLY=${ cat;echo .;} REPLY=${REPLY%.}
        eval -- "$_save"
        unset -v _save
    }

to use like this:

    echo "${|_;{ foo; bar | baz; }|__;}"

or this if you deliberately want a subshell:

    echo "${|_;( foo; bar | baz )|__;}"

While this will work, and is better because it's fully contained within the
comsub, it's still a pretty ugly magic incantation and not what I'd consider
"idiomatic".

> If you can assume NUL doesn't appear in the stream, you can even do ${| foo |
> read -rd ''; } without introducing any functions.

Actually I would rather have the function to hide even just this simple read
-rd ''.  :)  But also read -rd '' returns failure (assuming no NUL), so in fact
it would have to be more like ${| foo | { read -rd '' || [[ $REPLY ]]; }; },
which is definitely ugly enough to be hidden inside a function.
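A small sketch of that read behavior, written as a plain pipeline with lastpipe so that it does not depend on the ${| ...; } syntax.  The foo function and its output are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical command with significant trailing newlines.
foo() { printf 'a\nb\n\n'; }

# lastpipe runs the final pipeline element in the current shell, so REPLY
# survives the pipeline.  It only takes effect when job control is off.
shopt -s lastpipe
set +o monitor

# read hits EOF without seeing a NUL and returns failure, even though
# REPLY has been filled in; the [[ $REPLY ]] fallback turns that into
# success whenever anything was read.
foo | { read -rd '' || [[ $REPLY ]]; }
status=$?
printf 'status=%d length=%d\n' "$status" "${#REPLY}"
```

Note that with this fallback, genuinely empty output still reports failure, which may or may not be what is wanted.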

> If you want to preserve the exit status, you can combine these with a process
> substitution `< <(foo)'.

Process substitution is less useful, because it causes foo's exit status to be
lost, and also executes in a subshell (whereas the above can set variables).
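The exit status difference can be seen in a short sketch.  Here foo is a made-up function that emits output and then fails with status 3, and the variable names are only for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical command that produces output and then fails.
foo() { printf 'partial\n\n'; return 3; }

# Process substitution: the data arrives intact, but $? only reflects the
# reading side, so foo's failure goes unnoticed.
{ IFS= read -rd '' via_procsub || [[ $via_procsub ]]; } < <(foo)
procsub_status=$?

# Pipeline with lastpipe + pipefail: the reader runs in the current shell
# and the pipeline status reports foo's failure.
shopt -s lastpipe
set +o monitor
set -o pipefail
foo | { IFS= read -rd '' via_pipe || [[ $via_pipe ]]; }
pipe_status=$?

printf 'procsub: %d, pipeline: %d\n' "$procsub_status" "$pipe_status"
```

Both variants capture the same text, trailing newlines included; only the pipeline variant lets the caller see that foo failed.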

> > Sorry, I wasn't clear with this.  What I mean is, suppose you have some
> > scripts [...] one day you discover that the trailing newline stripping is
> > causing incorrect behavior.  You realize that you hadn't considered this
> > case, [...] with user-selectable shell options to control the trailing
> > newline handling, instead of updating the syntax at many locations, you can
> > instead just set the shopts once at the start of each script.
>
> I don't think it is a good idea to add an option to turn someone's
> misunderstanding into a real language feature. If we start to
> implement every kind of misunderstanding as real shell features, the
> language becomes unmanageable. I cannot agree with this specific
> reasoning.

I don't agree that the feature request is based on a misunderstanding.  I've
known about the trailing newline stripping behavior for years, but I was still
caught by (basically) the exact situation I described.  It wasn't until I
happened to add test cases to my system which caused empty lines at the _end_
of the output, that I realised I needed to deal with this problem.  (Earlier
tests had blank lines but in the middle of the output, and they worked fine.)

I know that many people misunderstand this shell behavior, but that doesn't
change the fact that getting complete output into the shell is difficult, and
the main reason is the unconditional comsub trailing newline stripping
behavior.  It's not unreasonable to want a language to have a simple way of
reading plain input without modifying it.  This is why read has -r, for
example.

> > Suppose bash had a feature where, if a shell function called _comsub is
> > defined, then that function gets called whenever `` or $() is encountered.
>
> It's fine if you are talking about a hypothetical feature just for
> fun, but I'm not sure if it is related to the current discussion. If
> you seriously suggest this as a new feature, I have to repeat the same
> discussion of the impact on the existing scripts using the traditional
> command substitutions $(). This should introduce a new syntax (such as
> « ${func_name cmd;} » or « ${func_name(...)} »). Actually, it's much
> worse than the suggested shell option. I don't think we need to seek
> an even worse approach for preserving newlines than the shell-option
> approach.

I'm not seriously suggesting this feature.  The point was just to show the
difference between builtins and $(), specifically that it's possible for users
to customize builtins, but not $().

However, this is also why I care more about the comsub behavior than the here
string behavior: I have already added this echo function to my scripts, which
means I can just do 'echo "$foo" | ...' instead of '... <<<$foo':

    function echo {
        local lastarg="${@: -1}"
        # Suppress echo's own newline if -n was given, or if the last
        # argument already ends with one.
        if [[ $1 == -n || ${lastarg: -1} == $'\n' ]]; then
            command echo -n "$@"
        else
            command echo "$@"
        fi
    }
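For comparison, a quick sketch of the here string behavior that this works around.  The sample value is made up, and printf '%s' stands in for any writer that passes the value through unchanged:

```shell
#!/usr/bin/env bash
foo=$'one\ntwo\n'          # value already ends with a newline

# A here string appends one newline of its own.
herestring_bytes=$(wc -c <<<"$foo")
# printf '%s' passes the value through unchanged.
printf_bytes=$(printf '%s' "$foo" | wc -c)

printf 'here string: %d bytes, printf: %d bytes\n' \
    "$herestring_bytes" "$printf_bytes"
```

The here string delivers one byte more than the variable holds, which is the extra newline.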

> > This would still break any external code that assumes traditional behavior,
> > but if that's a concern I could add some sophistication to make it only
> > apply to my own scripts:
>
> Not all users can do this properly. This becomes a new unnecessary pitfall.

Whether or not people can do it properly is a different issue.  The point is
whether it's possible or impossible (even given an infinitely intelligent
user).

If a user creates a function which wraps a builtin and modifies the behavior
carelessly, causing other code to break - and then complains about it -
everyone would agree that it is the user who is wrong and should fix their
function.  The same would be true for anyone who wrote a broken (hypothetical)
_comsub function.  Anyway, as above, it was just an illustration, not a real
suggestion.

Kev
