On 7/27/20 6:32 AM, Martin Liška wrote:
Hey.

As mentioned in the PR, we should not create a string constant for a type
that is different from char_type_node. Looking at expr.c, I was inspired
and used 'TYPE_MAIN_VARIANT (chartype) == char_type_node' to verify that underlying
type is a character type.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests. And it fixes chromium
build with gcc-10 branch with the patch applied.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

     PR tree-optimization/96058
     * expr.c (string_constant): Build string_constant only
     for a type that is main variant of char_type_node.
---
  gcc/expr.c | 22 +++++++++++++---------
  1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 5db0a7a8565..c3fdd82b319 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11828,17 +11828,21 @@ string_constant (tree arg, tree *ptr_offset, tree *mem_size, tree *decl)
      chartype = TREE_TYPE (chartype);
        while (TREE_CODE (chartype) == ARRAY_TYPE)
      chartype = TREE_TYPE (chartype);
-      /* Convert a char array to an empty STRING_CST having an array
-     of the expected type and size.  */
-      if (!initsize)
-      initsize = integer_zero_node;

-      unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
-      init = build_string_literal (size, NULL, chartype, size);
-      init = TREE_OPERAND (init, 0);
-      init = TREE_OPERAND (init, 0);
+      if (TYPE_MAIN_VARIANT (chartype) == char_type_node)

The change to c_getstr I recently committed made it clear that
the function can:

  Return a pointer P to a NUL-terminated string containing
  the sequence of bytes corresponding to the representation
  of the object referred to by SRC (or a subsequence of such
  bytes within it if SRC is a reference to an initialized
  constant array plus some constant offset).

I.e., c_getstr returns a STRING_CST for arbitrary non-string
constants.  This enables optimizations like the by-pieces
expansion of calls to raw memory functions like memcpy, or
the folding of other raw memory calls like memchr with non-
string arguments.

c_getstr relies on string_constant for that.  Restricting
the latter function to just character types prevents these
optimizations for zero-initialized constants of other types.
A test case that shows the difference to the by-pieces
expansion goes something like this:

  const struct { char a[64]; } x = { 0 };

  void f (void *d)
  {
    __builtin_memcpy (d, &x, sizeof x - 1);
  }

A test case for the memchr folding is similar:

  const struct { char a[64]; } x = { 0 };

  int f (void *d)
  {
    return __builtin_memchr (&x, 0, sizeof x) != 0;
  }

The tests I committed with the change didn't exercise any of
this so that's my bad.  I'm still not sure I understand how
the problem with the incomplete type comes up (I haven't had
a chance to look into the recent updates on the bug yet) but
to retain the optimization (and keep the comments in sync
with the code) I think a better solution than restricting
the function to integers is to limit it to complete types.
Beyond that, extending the function to also constant arrays
or nonzero aggregates will also enable the optimization for
those.

Martin

+    {
+      /* Convert a char array to an empty STRING_CST having an array
+         of the expected type and size.  */
+      if (!initsize)
+        initsize = integer_zero_node;
+
+      unsigned HOST_WIDE_INT size = tree_to_uhwi (initsize);
+      init = build_string_literal (size, NULL, chartype, size);
+      init = TREE_OPERAND (init, 0);
+      init = TREE_OPERAND (init, 0);

-      *ptr_offset = integer_zero_node;
+      *ptr_offset = integer_zero_node;
+    }
      }

    if (decl)

Reply via email to