On 08/01/18 18:04, Joseph Myers wrote:
> On Wed, 1 Aug 2018, Bernd Edlinger wrote:
>
>> On 07/30/18 17:49, Joseph Myers wrote:
>>> On Mon, 30 Jul 2018, Bernd Edlinger wrote:
>>>
>>>> Hi,
>>>>
>>>> this is how I would like to handle the over length strings issue in the C
>>>> FE.
>>>> If the string constant is exactly the right length and ends in one explicit
>>>> NUL character, shorten it by one character.
>>>
>>> I don't think shortening should be limited to that case.  I think the case
>>> where the constant is longer than that (and so gets an unconditional
>>> pedwarn) should also have it shortened - any constant that doesn't fit in
>>> the object being initialized should be shortened to fit, whether diagnosed
>>> or not, we should define GENERIC / GIMPLE to disallow too-large string
>>> constants in initializers, and should add an assertion somewhere in the
>>> middle-end that no too-large string constants reach it.
>>>
>>
>> Okay, there is an update following your suggestion.
>
> It seems odd to me to have two separate bits of code dealing with reducing
> the length, rather than something like
>
>   if (too long)
>     {
>       /* Decide whether to do a pedwarn_init, or a warn_cxx_compat warning,
>          or neither.  */
>       /* Shorten string, in either case.  */
>     }
>
> The memcmp with "\0\0\0\0" is introducing a hidden assumption that any
> sort of character in strings is never more than four bytes.  It also seems
> unnecessary, in that ultimately the over-long string should be shortened
> regardless of whether what's being removed is a zero character or not.
>
> It should not be possible to be over-long and fail tree_fits_uhwi_p
> (TYPE_SIZE_UNIT (type)), simply because STRING_CST lengths are stored in
> host int (even if, ideally, they'd use some other type to allow for
> STRING_CSTs over 2GB in size).  (And I don't think GCC can represent
> target type sizes that don't fit in unsigned HOST_WIDE_INT anyway; the
> only way for a target type size in bytes to fail to be representable in
> unsigned HOST_WIDE_INT should be if the size is not constant.)
>

Agreed.

A new simplified version of the patch is attached.

Bootstrapped and reg-tested as usual.
Is it OK for trunk?


Thanks
Bernd.

2018-08-01  Bernd Edlinger  <bernd.edlin...@hotmail.de>

        * c-typeck.c (digest_init): Shorten overlength strings.

diff -pur gcc/c/c-typeck.c gcc/c/c-typeck.c
--- gcc/c/c-typeck.c    2018-06-20 18:35:15.000000000 +0200
+++ gcc/c/c-typeck.c    2018-07-31 18:49:50.757586625 +0200
@@ -7435,19 +7435,17 @@ digest_init (location_t init_loc, tree type, tree
             }
         }
 
-      TREE_TYPE (inside_init) = type;
       if (TYPE_DOMAIN (type) != NULL_TREE
           && TYPE_SIZE (type) != NULL_TREE
           && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST)
         {
           unsigned HOST_WIDE_INT len = TREE_STRING_LENGTH (inside_init);
+          unsigned unit = TYPE_PRECISION (typ1) / BITS_PER_UNIT;
 
           /* Subtract the size of a single (possibly wide) character
              because it's ok to ignore the terminating null char
              that is counted in the length of the constant.  */
-          if (compare_tree_int (TYPE_SIZE_UNIT (type),
-                                (len - (TYPE_PRECISION (typ1)
-                                        / BITS_PER_UNIT))) < 0)
+          if (compare_tree_int (TYPE_SIZE_UNIT (type), len - unit) < 0)
             pedwarn_init (init_loc, 0,
                           ("initializer-string for array of chars "
                            "is too long"));
@@ -7456,8 +7454,21 @@ digest_init (location_t init_loc, tree type, tree
             warning_at (init_loc, OPT_Wc___compat,
                         ("initializer-string for array chars "
                          "is too long for C++"));
+          if (compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
+            {
+              unsigned HOST_WIDE_INT size
+                = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+              const char *p = TREE_STRING_POINTER (inside_init);
+              char *q = (char *)xmalloc (size + unit);
+
+              memcpy (q, p, size);
+              memset (q + size, 0, unit);
+              inside_init = build_string (size + unit, q);
+              free (q);
+            }
         }
 
+      TREE_TYPE (inside_init) = type;
       return inside_init;
     }
   else if (INTEGRAL_TYPE_P (typ1))
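
For readers without a GCC tree at hand, the buffer handling in the new block can be tried in isolation.  The following is only a rough sketch under stated assumptions: shorten_string is a hypothetical helper that is not part of GCC, plain malloc stands in for xmalloc, and "unit" is taken to be the character width in bytes, as in the patch.  It keeps the first "size" bytes of the constant and appends one zero character of "unit" bytes.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: return a newly allocated buffer
   holding the first SIZE bytes of P followed by UNIT zero bytes.
   UNIT stands in for the character width in bytes (1 for char, for
   example 4 for a 32-bit wchar_t).  Returns NULL if allocation fails;
   otherwise the caller must free the result.  */
static char *
shorten_string (const char *p, size_t size, size_t unit)
{
  char *q = malloc (size + unit);

  if (q == NULL)
    return NULL;

  memcpy (q, p, size);          /* Keep only what fits in the object.  */
  memset (q + size, 0, unit);   /* Append one zero character.  */
  return q;
}
```

Calling shorten_string ("abcdef", 3, 1) would yield the four bytes 'a', 'b', 'c', 0.  The appended zero character mirrors the size + unit length passed to build_string in the patch.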