    Date:        Fri, 21 Feb 2025 09:08:13 -0500
    From:        Chet Ramey <chet.ra...@case.edu>
    Message-ID:  <59a1d1d0-b6eb-4652-9e77-1fc4c5992...@case.edu>

  | Given the following, which POSIX says is unspecified:
  |
  | printf '%s %3$s %s\n' A B C D
  |
  | ksh93-u+m prints "A D", which is just wrong.

It actually prints "A D \n" (the \n is obvious, and unimportant, but
the extra space makes a big difference).

  | No matter how you mix numbered
  | and unnumbered specifications, or whether you implement numbered
  | specifications at all, you can't just drop it.

It isn't dropped, and while what ksh93 does is bizarre indeed, it is
defensible.

As you say, this is all unspecified in POSIX, which means anything
is acceptable - here it looks as if the "%3$" is counting the args
that remain after those used for earlier unnumbered conversions have
been removed.   Since the A has already been consumed, the remaining
args are B C D, so the third is D.   So %3$s prints the D, which is
followed by the space which is next in the format string.   The following
unnumbered conversion then takes the arg which follows the last that was
used, which doesn't exist here, meaning the final %s prints "".

Add one more arg, making the command become

        printf '%s %3$s %s\n' A B C D E

and the output is "A D E\n" just as predicted.

All perfectly consistent and even rational - though certainly not the
way I would do it.
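
That reading can be modelled as a small shell function.   This is only a
sketch of the interpretation described above, not ksh93's actual code:
model_printf is a made-up name, it understands only %s and single-digit
%N$s, it does not process backslash escapes (so the \n is left out of the
format and a newline is simply added at the end), and what it does when N
is larger than the number of remaining args is my assumption.

```shell
# model_printf: hypothetical model of the behaviour described above,
# not ksh93's implementation.  Handles only %s and %N$s (N = 1..9).
model_printf() (
    fmt=$1
    shift
    out=
    while [ -n "$fmt" ]; do
        case $fmt in
        %s*)
            # Unnumbered: take the next remaining arg ("" if none left).
            if [ "$#" -gt 0 ]; then
                out=$out$1
                shift
            fi
            fmt=${fmt#'%s'}
            ;;
        %[1-9]\$s*)
            # Numbered: N counts within the args that still remain,
            # and everything up to and including that arg is used up.
            n=${fmt#'%'}
            n=${n%"${n#?}"}
            if [ "$n" -le "$#" ]; then
                shift "$((n - 1))"
                out=$out$1
                shift
            else
                # Assumption: an out-of-range N prints "" and uses up the rest.
                shift "$#"
            fi
            fmt=${fmt#'%'?'$s'}
            ;;
        *)
            # Anything else is copied through literally, one character at a time.
            rest=${fmt#?}
            out=$out${fmt%"$rest"}
            fmt=$rest
            ;;
        esac
    done
    printf '%s\n' "$out"
)

model_printf '%s %3$s %s' A B C D      # prints "A D " (note the trailing space)
model_printf '%s %3$s %s' A B C D E    # prints "A D E"
```

Both results match what ksh93-u+m was observed to print, which is all the
model is meant to show.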

I abandoned my plans to implement the numbered conversions when POSIX
insisted that even in the presence of numbered arg conversions, if the
format doesn't consume all the args, the format string needs to be
repeated, the same as is done when there are no numbered conversions.
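
For anyone who hasn't run into the reuse rule, this is what it looks like
with ordinary unnumbered conversions, where it is well defined: the format
is applied again until the args run out, and a %s with no arg left for it
is treated as if it had been given an empty string.   The formats here are
just illustrations.

```shell
printf '%s %s\n' A B C D
# A B
# C D

printf '<%s|%s>\n' A B C
# <A|B>
# <C|>
```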

Despite most implementations actually working that way, doing so makes
no sense, and makes it much harder to actually use the numbered arg
conversions for their intended purpose -- which isn't just so applications
can supply the args in an order different from the one the format string
expects - it is so the format string can be obtained from a message
catalog, in which different translations (different languages) might easily
need to consume different selections of the args (and in different orders,
which is the point).   Some of the translations might not use the final
args in the list, in which case a POSIX implementation is required to run
the format string again, despite that making no sense at all.

kre

