Re: sh(1): POSIX "Command Search and Execution"

Robert Elz Fri, 20 Sep 2024 14:30:10 -0700

    Date:        Fri, 20 Sep 2024 18:53:11 +0200
    From:        <tlaro...@kergis.com>
    Message-ID:  <zu2od2phtsn4h...@kergis.com>


  | For some reason[*], I looked at sh(1) "Command Search and Execution"
  | in POSIX (issue 7 2018 and then issue 8 2024).

Over the past many years this has been one of the most debated
parts of the specification.   It is constantly being reworded.

  | From the specification above, I'm puzzled about two things: regular
  | built-ins and PATH search:

Yes, that aspect in particular.   There is an attitude amongst some of
the people who work on the standard that users must be able to replace
regular built-in utilities with their own versions, simply by placing
their own in a directory that is in PATH before the directory where the
standard version of the utility exists.

Almost all shell developers consider this to be nonsense, and refuse
to have anything to do with it (some version of ksh93 is reputed to
have rules something like that implemented though.)

In issue 7 and before (not sure now for how long before, but that no longer
matters) all regular built-in utilities were required to have a file
system implementation (so that, for example, xargs could run it, without
a shell being involved) - even those which make absolutely no sense
outside the shell, like "wait" and "fg" (some others which are mostly
useless outside the shell, like "cd" could at least be argued to be able
to attempt the operation, and issue an error on failure, even if the
effects would be lost).   Some systems install such things by making
links to a script like

        #! /bin/sh
        ${0##*/} "$@"

with the names of the relevant built-in utilities, solely to meet that
requirement.   NetBSD always refused to indulge in such stupidity.

  | In issue 7, built-ins are segregated in two groups: "special
  | built-ins" and "regular built-ins", the latter being the complement of
  | the former (a built-in that is not "special").

That's always been done - the two groups also differ in how they're
required to operate in areas other than this one - such as what happens
when one fails, and the effects of variable assignments as part of the
same command.  The special built-ins are mostly things that most people
almost consider to be syntax (like "break" "continue" "return" "." ...)

  | But in the spec, a regular built-in can only be invoked in e),

Not quite, the utilities listed in (d) are all regular built-in
utilities, and those simply get executed.   This is the (useful)
big change in Issue 8 - the utilities in that list are now known as
"intrinsic" utilities, which have two properties of note - first, they
aren't required to exist in the filesystem any more, and second,
they're exempt from the path search nonsense.

Fortunately, implementations are also allowed to designate any
other built-in utility as being intrinsic (though it is recommended
that they don't).   In our shell, every built-in is intrinsic.
(I believe bash is the same).

  | that is the corresponding name file has to be accessible via the PATH.
  | If it is not, one can not invoke a regular built-in?

That is the intent, yes.

  | This may have sense for an utility required by POSIX

No, it makes sense for nothing.

  | but there may be a regular built-in that POSIX doesn't speak about...

That one is actually not a problem - both because such a utility could
also be implemented as a file system command, and so meet the requirement,
but more because as soon as an application attempts to invoke any non
standard utility, all bets are off, that's outside what the standard
specifies, and so the standard specifies nothing about what should happen.

And yes, that means that if you write your own command (or add one from
pkgsrc, that is not a standard utility) then the standard doesn't require
that things like redirection (or anything else really) will work.

Of course, no real implementation would ever break things that way, what
is a standard utility, and what is not, is not distinguished anywhere
(except that to conform with the standard, all the standard utilities,
except the ones that are part of options that are not included, like for
example uucp and its friends, must be implemented, and available in
some defined PATH setting - which isn't necessarily the one that any
normal user ever uses.)

  | And what does "a successful search" mean? From the referenced
  | paragraph "XBD Environment Variables":
  |
  | ---8<---issue 7 2018
  | The list shall be searched from beginning to end, applying the
  | filename to each prefix, until an executable file with the specified
  | name and appropriate execution permissions is found. 
  | --->8---issue 7 2018
  |
  | But this contradicts the use of the shell in the paragraph I'm talking
  | about, since if the permissions can be stat'ed, the "executable" nature
  | of the file can not be ascertained without exec'ing

I think that's just a wording bug, and should be fixed (and would be
if someone pointed it out) - all they really mean is a file with 'x'
permission in PATH.   However, you're right, the term "executable file"
is defined to mean something that "exec*(2)" can execute, and that isn't
what they really mean there - no-one expects an attempt to be made to
actually execute the file located, just that the shell would try that if
there was no built-in to execute instead.
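
As a rough illustration of that weaker reading, here is a minimal sketch
of a PATH search that only looks at file type and 'x' permission, and
never attempts to execute anything.   The name find_in_path and its
optional second argument are made up for this example, real shells do
this internally, and the handling of empty PATH entries here is only
approximate (a trailing empty entry is dropped by field splitting).

```shell
# find_in_path NAME [PATHLIST]
# Hypothetical helper: print the first DIR/NAME in PATHLIST (default
# $PATH) that is a regular file with 'x' permission.  Nothing is exec'd,
# so a file that has the bit set but cannot really run still "succeeds".
find_in_path() {
    p_name=$1
    p_ifs=$IFS
    IFS=:
    set -f                          # no globbing while splitting the list
    for p_dir in ${2-$PATH}; do
        IFS=$p_ifs
        [ -n "$p_dir" ] || p_dir=.  # empty entry means current directory
        if [ -f "$p_dir/$p_name" ] && [ -x "$p_dir/$p_name" ]; then
            set +f
            echo "$p_dir/$p_name"
            return 0
        fi
    done
    IFS=$p_ifs
    set +f
    return 1
}
```

A text file that has been given mode 755 is "found" by this search just
like a real binary, which is exactly the gap between "has 'x' permission"
and the defined meaning of "executable file".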

  | ---8<---issue 7 2018
  | The term "built-in" implies that the shell can execute the utility
  | directly and does not need to search for it. 
  | --->8---issue 7 2018
  |
  | The proposition is for all built-ins. And this contradicts the
  | paragraph where the built-in has to be searched for previously...

No, it doesn't - the version searched for (and found) (if you believe
anything should actually operate like that) isn't the built-in, that's
the file system equivalent (like we have /bin/echo and the built-in
echo in sh(1), which are actually entirely different commands).

The intent is that the shell locates the file system version of the
executable, then, if there is a built-in with the same name, and that
built-in claims to be the equivalent of the version in the directory
in which the shell found the file system version, then the built-in
is executed instead of the file system version.

  | "The special built-in utilities in this section need not be provided
  | in a manner accessible via the exec family of functions defined in
  | the System Interfaces volume of POSIX.1-2017."

Yes, not even the most insane of the posix committee ever believed that
"break" or "return" would be useful in any way as a file system command.

  | i.e. not special built-ins have to be provided in a manner accessible
  | via the exec family of functions.)

Yes, but (as above) only the standard ones - anything non standard (anywhere,
including an option to a standard utility that isn't defined for that
utility) places things outside the standard, and none of the rules apply.

  | What does:
  |
  | "the built-in or function is associated with the 
  | directory that was most recently tested during the successful
  | PATH search"
  |
  | mean? How is a directory "associated" to a built-in or a function,

No-one actually knows, that isn't specified anywhere, it is up to the
implementation to make that work, but I believe that the intent is
that each (non-intrinsic) regular built-in is associated with a path
somewhere or other (compiled into the shell, in a file that the shell
reads at startup, perhaps via a sysctl like interface - whatever the
implementation prefers).   That is, for us we have "echo" "test" and
"printf" (and more) built in, so something somewhere would have

        echo    /bin
        test    /bin
        printf  /usr/bin

(and many more) defined - then if the user types "echo hello" the
system searches PATH, finds "echo" in some directory, then checks
this list - if the directory found by the search matches the one in
the list, then the built-in gets executed.   If the directory is
different, then the command from the file system gets executed, and
as you surmised, if the command isn't found by the search, then a
"command not found" error results (even though the built-in is there.)

So if you had PATH=~/bin:/usr/pkg/bin:/bin

and you had a "test" in ~/bin, "echo" in /usr/pkg/bin and no printf
in any of those three directories, then the built-in versions would
never be executed.
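
The rule described above can be simulated with a few lines of shell.
This is only a sketch of my reading of the intent, not anything a real
shell does: ASSOC and FILES are invented tables standing in for the
association list and for the filesystem, /home/u/bin stands in for ~/bin,
and resolve is a made-up name.

```shell
# Invented association table: "name directory", one pair per line.
ASSOC='echo /bin
test /bin
printf /usr/bin'

# Pretend filesystem: full paths of the commands that "exist".
FILES='/home/u/bin/test
/usr/pkg/bin/echo
/bin/echo
/bin/test
/usr/bin/printf'

# in_list LIST LINE: succeed if LINE is exactly one of the lines of LIST.
in_list() {
    printf '%s\n' "$1" | grep -Fqx -- "$2"
}

# resolve NAME PATHLIST: report what would run under the rule above.
resolve() {
    r_name=$1
    r_path=$2
    r_ifs=$IFS
    IFS=:
    set -f
    for r_dir in $r_path; do
        if in_list "$FILES" "$r_dir/$r_name"; then
            IFS=$r_ifs
            set +f
            # Search succeeded: the built-in runs only if it is
            # associated with the directory where the file was found.
            if in_list "$ASSOC" "$r_name $r_dir"; then
                echo "builtin $r_name"
            else
                echo "file $r_dir/$r_name"
            fi
            return 0
        fi
    done
    IFS=$r_ifs
    set +f
    # Search failed: error, even though a built-in of that name exists.
    echo "$r_name: command not found"
    return 127
}

resolve test   /home/u/bin:/usr/pkg/bin:/bin   # file /home/u/bin/test
resolve echo   /home/u/bin:/usr/pkg/bin:/bin   # file /usr/pkg/bin/echo
resolve printf /home/u/bin:/usr/pkg/bin:/bin   # printf: command not found
```

With /bin moved to the front of the list, "resolve echo" reports the
built-in instead, since the directory found then matches the table.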

  | Note: in the NetBSD implementation---I didn't look in the CSRG
  | archives to see if these are in fact here from long ago---there are
  | prefixes in the path: "%builtins" and "%func";
  | perhaps are these an attempt to this association?

They are from long long long ago, yes.   %func is something entirely
different, and unrelated, and not entirely useless.   "%builtins" was
an attempt to comply with the language in some much older version
of the standard (when all this was much less precisely specified than
it is now).  That's a joke, and most versions of ash (the parent of
our shell, FreeBSD's, dash, perhaps others) have long deleted it.
We haven't, but probably should, it is undocumented, and no-one uses it.

  | These are builtins or funcs if the prefix is specified as the
  | preceding "dir" in PATH?

I don't really want to document %builtins, so everyone forget you
ever read this, but the idea is that if that is specified as a suffix
of an entry in PATH, and the PATH search reaches that entry, then a
built-in command will be found and executed (if there is no %builtins
entry in PATH, then one is assumed right at the start, which means
built-ins are always executed if named ... that's what almost everyone
simply assumes will happen).   By explicitly sticking %builtins
elsewhere, it is possible for a user to override a builtin with a
file-system command located earlier in the PATH.

That's a dumb way to do it though, much better is simply to supply
a function like:

        echo()
        {
                /path/to/the/echo/I/like "$@"
        }

instead - and the usefulness of that is one of the reasons that the
NetBSD shell always reads the $ENV file (even in non-interactive shells).
This way you can selectively override built-ins (except the special ones
that you really don't want to override) with whatever versions you prefer.

The %func thing is entirely different - if a search for a command reaches
that directory (the one with %func as a suffix - just in PATH, not in the
directory name) without having yet located the command (or we would not
have gotten that far) and there is a file in the directory with the same
name as the command being sought (I think this one needs 'r' permission,
and not necessarily 'x', but I haven't checked, so might be wrong - 'r'
is needed for sure though) then the shell will read that file, as if with
the '.' command.   If after that has happened, there is now a function with
the name of the command to be executed defined (clearly there wasn't before,
or the function would have already been executed, without any PATH search)
then the shell will execute the function (and search PATH no more).  The
newly defined function (and anything else that running the script that was
found happens to accomplish - normally just defining other functions as
well) remains in the shell to be used again later if needed, with no PATH
search involved.

The idea is that you make a file containing functions you sometimes use,
place that file in some directory, say ~/myfuncs and link it to the name
of every function it defines (the directory can have other groups of
unrelated functions) and then you put ~/myfuncs%func as an entry in PATH
(usually it would go fairly early in PATH - but that depends upon what
you're attempting to achieve - perhaps last if the intent is to provide
fallback versions of commands in case the system that you're using happens
not to have them installed) - then when you happen to need to use one of
those functions (you're doing something which needs one) then that function
gets defined "by magic" along with any other related functions you're likely
to use if you're using any of them.  On the other hand, if you never need
these functions in a shell, then they never get loaded, and so save a little
memory in the shell, and a tiny bit of command search time.
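
For shells without %func, the same load-on-demand behaviour can be
approximated by hand.   What follows is a minimal sketch, assuming that
the output of "type" contains the word "function" for a shell function
(true of the shells I know of, but not something the standard promises).
The helper name try_func_dir, the temporary directory and the two demo
functions are all invented for the example.

```shell
# try_func_dir DIR NAME [ARGS...]
# Rough emulation of what happens when the search reaches "DIR%func":
# read DIR/NAME as if with '.', then run NAME if it is now a function.
try_func_dir() {
    f_dir=$1
    f_name=$2
    shift 2
    # Read permission is what matters here, 'x' is not checked.
    [ -f "$f_dir/$f_name" ] && [ -r "$f_dir/$f_name" ] || return 127
    . "$f_dir/$f_name"
    case $(type "$f_name" 2>/dev/null) in
        *function*) "$f_name" "$@" ;;   # now defined: run it
        *)          return 127 ;;       # the file did not define it
    esac
}

# Demo layout: one file defining two functions, linked to both names.
dir=$(mktemp -d)
cat > "$dir/greet" <<'EOF'
greet() { echo "hello, $1"; }
shout() { echo "HELLO, $1"; }
EOF
ln "$dir/greet" "$dir/shout"

try_func_dir "$dir" shout world     # prints: HELLO, world
```

After that one call both shout and greet are defined in the shell, so
later uses of either need no lookup at all, which is the point of
grouping related functions in one file.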

  | Could somebody explain this in an "international" english, that is
  | something a not english native speaker with an average english
  | vocabulary could parse?

I don't know, does the above count?

  | [*]: The reason why I looked at the spec is that, under Plan9, there is
  | a feature that I find quite neat and consistent: utilities can be
  | organized in subfolders and one can invoke from the shell (rc(1)) an
  | utility like this: "ip/ipconfig ...". This organized the utilities in
  | groups, instead of putting everything flat in a directory.

Yes, some people like that, and there are one or two shells which allow
it I think (not typical POSIX type shells) - you can accomplish that, more
or less, by just adding all those directories to PATH, and then you get to
avoid typing the "ip/" part of the name.

  | I thus wanted to see how I could add this (it is not POSIX compliant)

No, it isn't, POSIX requires that any command with a '/' in its name
be simply executed (from the filesystem) using the name given, without
any other processing (of the command name).
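
In shell terms that rule amounts to a single pattern match on the command
name.   A tiny sketch (uses_path_search is a made-up name, and this
ignores functions and built-ins, which are checked before any PATH
search for names without a '/'):

```shell
# uses_path_search NAME
# Succeed if NAME has no '/' and so would be looked up via PATH,
# fail if it contains a '/' and would be executed exactly as given.
uses_path_search() {
    case $1 in
        */*) return 1 ;;
        *)   return 0 ;;
    esac
}
```

So "ipconfig" goes through the search, while "ip/ipconfig" is taken as a
pathname relative to the current directory, and that is why the Plan 9
style of invocation cannot simply be layered on top.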

  | by setting an option, without disturbing much the POSIX behavior
  | or introducing security problems that the POSIX spec had tried to
  | address...

It isn't really a security issue I think - it just isn't the way that
shells have ever worked (way back to the Thompson shell) - either
there's a path search (in that shell the directories to examine, and
their order, were built into the shell, no way for users to alter it)
for simple one-segment names (no '/'), or the name is simply exec'd
as given.   That's very hard to change now (in general, an option could
allow it though) as it is so ingrained in how people work.

kre
