Hi Andrew!

On 11/13/22 23:12, Andrew Pinski wrote:
On Sun, Nov 13, 2022 at 1:57 PM Alejandro Colomar via Gcc
<gcc@gcc.gnu.org> wrote:

Hi!

I'd like to get warnings if I write the following code:

char foo[3] = "foo";

This should be easy to add as it is already part of the -Wc++-compat
option as for C++ it is invalid code.

<source>:2:19: warning: initializer-string for array of 'char' is too long
     2 | char     two[2] = "foo";   // 'f' 'o'
       |                   ^~~~~
<source>:3:19: warning: initializer-string for array of 'char' is too long for C++ [-Wc++-compat]
     3 | char   three[3] = "foo";   // 'f' 'o' 'o'
       |                   ^~~~~


... (for your more complex case [though I needed to modify one of the
strings to be exactly 8 characters long])

<source>:5:7: warning: initializer-string for array of 'char' is too long for C++ [-Wc++-compat]
     5 |       "01234567",
       |       ^~~~~~~~~~

               else if (warn_cxx_compat
                        && compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
                 warning_at (init_loc, OPT_Wc___compat,
                             ("initializer-string for array of %qT "
                              "is too long for C++"), typ1);

That is the current code that emits this warning, so it is just a
matter of adding an option to c-family/c.opt, having -Wc++-compat
enable it, and using that new option here.

Thanks,
Andrew Pinski

Great! I'd like to implement it myself, as I've never written any GCC code yet, so it's interesting to me. If you recall any (hopefully recent) case where a similar thing happened (the warning was already implemented and only needed a name), it might help me check how it was done.

BTW, I had another idea to add a suffix to string literals to make them unterminated:

char foo[3] = "foo"u;  // OK
char bar[4] = "bar";   // OK

char baz[4] = "baz"u;  // Warning: initializer is too short.
char etc[3] = "etc";   // Warning: unterminated string.

Is that doable?  Do you think it makes sense?

I have a code base that uses a mix of terminated and unterminated strings, and it would be nice to be able to state which one I want in each case.
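
To illustrate why the distinction matters in today's C (without any new suffix), here is a minimal sketch of how the two kinds of array have to be handled differently. The helpers field_len and print_field are hypothetical names made up for this example, not existing functions:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static const char foo[3] = "foo";  /* unterminated: 'f' 'o' 'o' */
static const char bar[4] = "bar";  /* terminated:   'b' 'a' 'r' '\0' */

/* Hypothetical helper: number of chars before the first NUL,
 * or the whole array size if the array holds no NUL at all. */
static size_t field_len(const char *buf, size_t size)
{
	const char *nul;

	assert(buf != NULL);
	nul = memchr(buf, '\0', size);

	return nul ? (size_t)(nul - buf) : size;
}

/* Print either kind safely: "%.*s" never reads past the given length. */
static void print_field(const char *buf, size_t size)
{
	printf("%.*s\n", (int)field_len(buf, size), buf);
}
```

Both arrays can be passed as print_field(foo, sizeof foo) and print_field(bar, sizeof bar), whereas calling strlen(foo) would read past the end of the array. Nothing in the declarations themselves says which of the two was intended.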

Cheers,

Alex



It's hard to keep track of sizes to make sure that the string literals always
initialize to terminated strings.  It seems like something that should be easy
to implement in the compiler.

A more complex case where it's harder to keep track of sizes is:

static const char  log_levels[][8] = {
      "alert",
      "error",
      "warn",
      "notice",
      "info",
      "debug",
};

Here, 8 works now (and 7 too, but for alignment reasons I chose 8).  If tomorrow
we add or change an entry, it'll be hard to keep it safe.  Such a warning would
help a lot.
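
Until such a warning exists, one way to approximate the check is a compile-time assertion on each literal. The following is only a sketch, assuming C11 static_assert is available; the macro names LOG_LEVELS, CHECK_FITS and AS_ENTRY and the accessor log_level_name are made up for illustration:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

#define LOG_LEVEL_SIZE 8

/* Hypothetical X-macro list: one X(...) per level name. */
#define LOG_LEVELS(X) \
	X("alert")    \
	X("error")    \
	X("warn")     \
	X("notice")   \
	X("info")     \
	X("debug")

/* sizeof("...") counts the terminating NUL, so this rejects any name
 * that would fill the row without leaving room for the terminator. */
#define CHECK_FITS(s) \
	static_assert(sizeof(s) <= LOG_LEVEL_SIZE, \
	              "log level name leaves no room for the terminating NUL");
#define AS_ENTRY(s)  s,

/* Expands to one static_assert per name in the list. */
LOG_LEVELS(CHECK_FITS)

static const char log_levels[][LOG_LEVEL_SIZE] = {
	LOG_LEVELS(AS_ENTRY)
};

/* Hypothetical accessor: returns the name, or "?" if out of range. */
static const char *log_level_name(size_t i)
{
	size_t n = sizeof log_levels / sizeof log_levels[0];

	return i < n ? log_levels[i] : "?";
}
```

With this, adding an 8-character name such as "01234567" to the list fails to compile instead of silently producing an unterminated row. The cost is that every literal has to go through the macro list, which is why a compiler warning would still be nicer.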


An example program is:

$ cat str.c
char     two[2] = "foo";   // 'f' 'o'
char   three[3] = "foo";   // 'f' 'o' 'o'
char    four[4] = "foo";   // 'f' 'o' 'o' '\0'
char    five[5] = "foo";   // 'f' 'o' 'o' '\0' '\0'
char implicit[] = "foo";   // 'f' 'o' 'o' '\0'

$ cc -Wall -Wextra str.c
str.c:1:19: warning: initializer-string for array of ‘char’ is too long
      1 | char     two[2] = "foo";   // 'f' 'o'
        |                   ^~~~~
/usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/12/../../../x86_64-linux-gnu/Scrt1.o: in function `_start':
(.text+0x17): undefined reference to `main'
collect2: error: ld returned 1 exit status


Here, I'd like the new warning to also flag 'three'.

Cheers,

Alex
--
<http://www.alejandro-colomar.es/>


