On Fri, Mar 14, 2025 at 01:34:48 +0100, Steffen Nurpmeso wrote:
>   a() {
>     echo $#,1="$1"/$1,2="$2"/$2,3="$3"/$3,4="$4"
>     echo $#,'*'="$*"/$*,
>   }
>   set -- '' 'a' ''
>   #IFS=\ ; echo "$*"$* $*; a "$*"$* $*;unset IFS
>   #IFS= ; echo "$*"$* $*; a "$*"$* $*;unset IFS
>   IFS=:; echo "$*"$* $*; a "$*"$* $*;unset IFS
> 
> outputs
> 
>   :a: a  a
>   4,1=:a:/ a ,2=a/a,3=/,4=a
>   4,*=:a::a::a/ a  a  a,
> 
> I have a problem  ^  with this space character of bash.

There's so much noise here.  All I want to see is ONE command that
produces unexpected output.  Nothing else.  Now, let me see if I can
strip down your example to reproduce your result.

hobbit:~$ cat foo
#!/bin/bash
IFS=:
a() {
    printf '%d args:' "$#"
    printf ' <%s>' "$@"
    echo
    echo $#,'*'="$*"/$*,
}
set -- '' a ''
a "$*"$* $*
hobbit:~$ ./foo
4 args: <:a:> <a> <> <a>
4,*=:a::a::a/ a  a  a,

This is the same as the output you got, yes?  A slash, a space, an "a",
two spaces, another "a", two spaces, a final "a", a comma and a newline.

I added code to show the arguments being passed into the function,
since for some reason you introduced a function and another round of
expansions into the picture.

So, let's start at the global scope.  You're permanently changing IFS
to a colon, and you're defining 3 positional parameters: the empty string,
the letter "a", and another empty string.

Next, you're calling the function "a" with 4 arguments:

    <:a:> <a> <> <a>
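
The first of those four arguments comes from the quoted "$*".  As a
minimal sketch (the variable name "first" is made up for illustration),
this shows the quoted form joining the three positional parameters into
one word, with the first character of IFS between them:

```shell
IFS=:
set -- '' a ''

# Quoted "$*": a single word, the parameters joined by ":" (first char of IFS).
first="$*"
printf '<%s>\n' "$first"    # prints <:a:>
```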

Inside the function "a", with IFS still permanently changed to colon,
you're calling echo with a series of arguments that are the result of
multiple expansions concatenated together.  This mess needs to be
dissected very carefully.

    echo $#,'*'="$*"/$*,

So, we have:

    $# unquoted; it produces strictly numeric output, and IFS is : so
      there are no surprises here
    the three literal characters ,*=
    "$*" quoted
    the literal character /
    $* unquoted
    the literal character ,

1) The $# expands to "4", which becomes the first character of the first
   argument word.

2) The three literal characters ,*= are appended to that.

3) "$*" is quoted, so it expands to the single word ":a::a::a" and this
   is appended to the first argument word.

4) The literal character / is appended.
   At this point, the first argument word is
    <4,*=:a::a::a/>

5) $* appears unquoted.  Now things get tricky.

   We generate a string by concatenating all the positional parameters
   together with : (which is the first character of IFS) between them.
   Doing this gives us the string ":a::a::a" just like in step 3.
   However, since this expansion is unquoted, it undergoes a round of
   word splitting, and a round of pathname expansion.  There are no
   globbing characters in this string, so I'll omit the pathname
   expansion step.

   Each instance of : in the string causes a word split to occur, so
   let's proceed left to right.

   We start with one word which is the empty string.
   The first character of the input string is ":" so we close off the
   first word (empty), and begin a second word (also empty for now).
   The next character is "a", so we append that to the second word.
   The third character is ":", so we close off the second word and
   begin a third word (empty for now).
   The fourth character is ":", so we close off the third word and
   begin a fourth word (empty for now).
   The fifth character is "a", so we append that to the fourth word.
   The sixth character is ":", so we close off the fourth word and
   begin a fifth word.
   The seventh character is ":", so we close off the fifth word and
   begin a sixth word.
   The eighth and final character is "a", so we append that to the sixth
   word.

   All together, then, the unquoted expansion gives us six new words:
      <> <a> <> <a> <> <a>

   But remember, we started out with a growing argument word:
      <4,*=:a::a::a/>

   The six new words are added to this, by appending the first new
   word to the growing argument word, and then adding each additional
   word as a new argument word.  The first new word is empty, so there
   is no change made to the first argument word.  The new set of argument
   words after step 5 is therefore:
      <4,*=:a::a::a/> <a> <> <a> <> <a>

6) After the unquoted $* you have a "," character, which is appended to
   the last argument word.  After this step, we have the final set of
   argument words:
      <4,*=:a::a::a/> <a> <> <a> <> <a,>

7) These argument words are passed to the echo command.  echo will
   potentially evaluate these as options, or process backslash
   combinations within each argument.  However, none of our words
   begin with "-" or contain backslash, so we can set aside the
   concern of echo mangling the arguments.

   echo will print each argument word, with a space character between
   each pair of argument words, and with a newline added to the end.
   There are 6 argument words, so echo will add a total of 5 spaces
   and one newline.

   We get the following string from echo (plus a newline):
      <4,*=:a::a::a/ a  a  a,>

   There are two spaces after the fourth "a" because there's one space
   added between <a> and <>, and a second space added between <> and
   the next <a>.  The same reasoning applies to the two spaces between
   the fifth and sixth "a"s.  echo prints the "a" argument (arg 4),
   then a space, then the empty string argument (arg 5), then another
   space, then the final "a," argument (arg 6).
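
Steps 3 through 7 can be replayed with a short sketch.  It assumes the
joined-string model from step 5: the hypothetical variable "joined"
stands in for the unquoted $*, and show_args is a made-up helper that
prints each argument word in angle brackets, the same way the printf
lines in the stripped-down script do.

```shell
# show_args is a hypothetical helper: print each argument word inside <>.
show_args() {
    printf '%d args:' "$#"
    printf ' <%s>' "$@"
    echo
}

IFS=:
set -- ':a:' a '' a     # the four arguments the function received
joined="$*"             # step 3: joined with ":" into one string

show_args "$joined"     # quoted: 1 args: <:a::a::a>
show_args $joined       # step 5: 6 args: <> <a> <> <a> <> <a>

# Steps 1-6: the words echo receives.  The first split word (empty) is
# glued onto the growing word, and the trailing "," onto the last one.
show_args 4,'*'="$joined"/$joined,
# 6 args: <4,*=:a::a::a/> <a> <> <a> <> <a,>

# Step 7: echo puts one space between each pair of words, so each empty
# word shows up as two adjacent spaces.
echo 4,'*'="$joined"/$joined,
# 4,*=:a::a::a/ a  a  a,
```

Swapping echo for something like show_args is usually the quickest way
to see where the word boundaries really are, since echo's output hides
them.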

Do you see how horrifyingly *messy* this is?  This is why we don't
play with unquoted parameter expansions resulting in word splitting.

Just stop doing it.  PLEASE.
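
For contrast, here is a minimal sketch of the quoted form.  With "$@",
each argument is handed on as exactly one word, empty strings included,
no matter what IFS is set to.  The names show_args and forward are
hypothetical helpers, not part of the original script.

```shell
# show_args is a hypothetical helper: print each argument word inside <>.
show_args() {
    printf '%d args:' "$#"
    printf ' <%s>' "$@"
    echo
}

# forward is a hypothetical wrapper that passes its arguments on unchanged.
forward() {
    show_args "$@"
}

IFS=:
forward '' a ''          # 3 args: <> <a> <>
forward ':a:' a '' a     # 4 args: <:a:> <a> <> <a>
```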
