On 08/08/2018 05:08 AM, Jason Merrill wrote:
On Wed, Aug 8, 2018 at 9:04 AM, Martin Sebor <mse...@gmail.com> wrote:
On 08/07/2018 02:57 AM, Jason Merrill wrote:

On Wed, Aug 1, 2018 at 12:49 AM, Martin Sebor <mse...@gmail.com> wrote:

On 07/31/2018 07:38 AM, Jason Merrill wrote:


On Tue, Jul 31, 2018 at 9:51 AM, Martin Sebor <mse...@gmail.com> wrote:


The middle-end contains code to determine the lengths of constant
character arrays initialized by string literals.  The code is used
in a number of optimizations and warnings.

However, the code is unable to deal with constant arrays initialized
using the braced initializer syntax, as in

  const char a[] = { '1', '2', '\0' };

The attached patch extends the C and C++ front-ends to convert such
initializers into a STRING_CST form.

The goal of this work is to both enable existing optimizations for
such arrays, and to help detect bugs due to using non-nul terminated
arrays where nul-terminated strings are expected.  The latter is
an extension of the GCC 8 -Wstringop-overflow and
-Wstringop-truncation warnings that help detect or prevent reading
past the end of dynamically created character arrays.  Future work
includes detecting potential past-the-end reads from uninitialized
local character arrays.
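
To illustrate the equivalence the conversion relies on, here is
a minimal sketch (not part of the patch; the names braced, literal,
same_representation, braced_length and literal_length are made up
for this example).  The two arrays have the same size and contents,
so strlen of either could in principle be folded to the same
constant:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names used only for illustration.  */
static const char braced[] = { '1', '2', '\0' };
static const char literal[] = "12";

/* Return nonzero if the two arrays have the same size and contents.  */
int same_representation (void)
{
  return sizeof braced == sizeof literal
	 && memcmp (braced, literal, sizeof braced) == 0;
}

/* With the conversion in place, a compiler could fold both calls
   to the constant 2.  Whether it actually does depends on the
   compiler version and optimization level.  */
size_t braced_length (void) { return strlen (braced); }
size_t literal_length (void) { return strlen (literal); }
```

Both length functions return 2 regardless of whether the call is
folded; only the generated code differs.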



  && TYPE_MAIN_VARIANT (TREE_TYPE (valtype)) == char_type_node)



Why? Don't we want this for other character types as well?


It suppresses narrowing warnings for things like

  signed char a[] = { 0xff };

(there are a couple of tests that exercise this).


Why is plain char different in this respect?  Presumably one of

char a[] = { -1 };
char b[] = { 0xff };

should give the same narrowing warning, depending on whether char is
signed.
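
A small sketch of that point, assuming 8-bit chars (the helper
fits_in_plain_char is hypothetical and only models the range check,
not the front end's check_narrowing):  exactly one of -1 and 0xff is
representable in plain char on any given target, so exactly one of
the two initializers above should be diagnosed.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: return nonzero if VALUE is representable in
   plain char on this target, i.e., if converting it would not
   narrow.  CHAR_MIN and CHAR_MAX reflect whether char is signed.  */
int fits_in_plain_char (long value)
{
  return value >= CHAR_MIN && value <= CHAR_MAX;
}
```

When char is signed, fits_in_plain_char (-1) is nonzero and
fits_in_plain_char (0xff) is zero; when char is unsigned it is
the other way around.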


Right.  I've added more tests to verify that it does.

At the same time, STRING_CST is supposed to be able to represent
strings of any integer type so there should be a way to make it
work.  On the flip side, recent discussions of changes in this
area suggest there may be bugs in the wide character handling of
STRING_CST so those would need to be fixed before relying on it
for robust support.

In any case, if you have a suggestion for how to make it work for
at least the narrow character types I'll adjust the patch.


I suppose braced_list_to_string should call check_narrowing for C++.


I see.  I've made that change.  That has made it possible to
convert arrays of all character types.  Thanks!

Currently it uses tree_fits_shwi_p (signed host_wide_int) and then
stores the extracted value in a host unsigned int, which is then
converted to host char.  Does the right thing happen for -fsigned-char
or targets with a different character set?

I believe so.  I've added tests for these too (ASCII and EBCDIC)
and also changed the type of the extracted value to HWI to match
(it doesn't change the results of the tests).
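
As a rough sketch of why the signedness of the intermediate type
doesn't affect the stored byte (assuming 8-bit bytes; to_byte is
a hypothetical stand-in, not a function in the patch):  conversion
to unsigned char is defined modulo 256, so a value extracted as
129 and one extracted as -127 end up as the same byte.

```c
#include <assert.h>

/* Hypothetical stand-in for the step that stores an extracted
   initializer value in a host char: convert a wide signed value
   to a single byte.  The conversion to unsigned char is performed
   modulo 256 (assuming 8-bit bytes).  */
unsigned char to_byte (long long value)
{
  return (unsigned char) value;
}
```

For example, to_byte (129) and to_byte (-127) both yield 0x81.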

Attached is an updated patch with these changes plus more tests
as suggested by Joseph.

Great.  Can we also move the call to braced_list_to_string into
check_initializer, so it works for templates as well?  As a case just
before the block that calls reshape_init seems appropriate.

Done in the attached patch.  I've also avoided dealing with
zero-length arrays and added tests to make sure their size
stays the same regardless of the form of their initializer and
the appropriate warnings are issued.

Using build_string() rather than build_string_literal() needed
a tweak in digest_init_r().  It didn't break anything but since
the array type may not have a domain yet, neither will the
string.  It looks like that may get adjusted later on but I've
temporarily guarded the code with #if 1.  If the change is
fine I'll remove the #if before committing.

This initial patch only handles narrow character initializers
(i.e., those with TYPE_STRING_FLAG set).  Once this gets some
exposure I'd like to extend it to other character types,
including wchar_t.
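
For readers skimming the patch, the core of the conversion can be
modeled roughly as below.  This is a simplified standalone sketch
under stated assumptions, not the code in braced_list_to_string:
Element and elements_to_string are hypothetical names, the
initializer is modeled as (index, value) pairs in ascending index
order, and the value range check stands in for the constant
evaluation done by the front end.  The 256 limit on implicit nuls
is the one used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified model of an initializer list: (index, value)
// pairs in ascending index order, as for { '1', '2', [100] = 0 }.
typedef std::pair<std::size_t, long long> Element;

// Sketch: build the array contents in OUT, zero-filling gaps left by
// designators.  Return false (leaving OUT unchanged) when a gap is
// larger than MAX_GAP, an index is out of order, or a value doesn't
// fit in a narrow character.
bool
elements_to_string (const std::vector<Element> &elts, std::string &out,
		    std::size_t max_gap = 256)
{
  std::string str;
  for (std::size_t i = 0; i != elts.size (); ++i)
    {
      std::size_t idx = elts[i].first;
      long long val = elts[i].second;

      if (idx < str.size ())
	return false;		// Out-of-order index: not modeled here.
      if (idx - str.size () > max_gap)
	return false;		// Too many implicitly initialized nuls.
      if (val < -128 || val > 255)
	return false;		// Doesn't fit in a narrow character.

      str.resize (idx, '\0');	// Zero-fill the gap, if any.
      str.push_back (static_cast<char> (static_cast<unsigned char> (val)));
    }

  out = str;
  return true;
}
```

With this model, { '1', '2', '3', [100] = 0 } yields a 101-byte
string whose first nul is at offset 3, while an initializer with
a gap of more than 256 elements is rejected and left for the
existing CONSTRUCTOR handling.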

Martin
PR tree-optimization/71625 - missing strlen optimization on different array initialization style

gcc/c/ChangeLog:

	PR tree-optimization/71625
	* c-parser.c (c_parser_declaration_or_fndef): Call
	braced_list_to_string.

gcc/c-family/ChangeLog:

	PR tree-optimization/71625
	* c-common.c (braced_list_to_string): New function.
	* c-common.h (braced_list_to_string): Declare it.

gcc/cp/ChangeLog:

	PR tree-optimization/71625
	* decl.c (check_initializer):  Call braced_list_to_string.
	(eval_check_narrowing): New function.
	* typeck2.c (digest_init_r): Accept string literals
	as initializers for all narrow character types.

gcc/testsuite/ChangeLog:

	PR tree-optimization/71625
	* g++.dg/init/string2.C: New test.
	* g++.dg/init/string3.C: New test.
	* g++.dg/init/string4.C: New test.
	* gcc.dg/init-string-3.c: New test.
	* gcc.dg/strlenopt-55.c: New test.
	* gcc.dg/strlenopt-56.c: New test.
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index d919605..b10d9c9 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -8509,4 +8509,102 @@ maybe_add_include_fixit (rich_location *richloc, const char *header)
   free (text);
 }
 
+/* Attempt to convert a braced array initializer list CTOR for array
+   TYPE into a STRING_CST for convenience and efficiency.  When non-null,
+   use EVAL to attempt to evaluate constants (used by C++).  Return
+   the converted string on success or null on failure.  */
+
+tree
+braced_list_to_string (tree type, tree ctor, tree (*eval)(tree, tree))
+{
+  unsigned HOST_WIDE_INT nelts = CONSTRUCTOR_NELTS (ctor);
+
+  /* If the array has an explicit bound, use it to constrain the size
+     of the string.  If it doesn't, be sure to create a string that's
+     as long as implied by the index of the last zero specified via
+     a designator, as in:
+       const char a[] = { [7] = 0 };  */
+  unsigned HOST_WIDE_INT maxelts = HOST_WIDE_INT_M1U;
+  if (tree size = TYPE_SIZE_UNIT (type))
+    {
+      if (tree_fits_uhwi_p (size))
+	{
+	  maxelts = tree_to_uhwi (size);
+	  maxelts /= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (type)));
+
+	  /* Avoid converting initializers for zero-length arrays.  */
+	  if (!maxelts)
+	    return NULL_TREE;
+	}
+    }
+  else if (!nelts)
+    /* Avoid handling the undefined/erroneous case of an empty
+       initializer for an array with unspecified bound.  */
+    return NULL_TREE;
+
+  tree eltype = TREE_TYPE (type);
+
+  auto_vec<char> str;
+  str.reserve (nelts + 1);
+
+  unsigned HOST_WIDE_INT i;
+  tree index, value;
+
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), i, index, value)
+    {
+      unsigned HOST_WIDE_INT idx = index ? tree_to_uhwi (index) : i;
+
+      /* auto_vec is limited to UINT_MAX elements.  */
+      if (idx > UINT_MAX)
+	return NULL_TREE;
+
+      /* Attempt to evaluate constants.  */
+      if (eval)
+	value = eval (eltype, value);
+
+      /* Avoid non-constant initializers.  */
+      if (!tree_fits_shwi_p (value))
+	return NULL_TREE;
+
+      /* Skip over embedded nuls except the last one (initializer
+	 elements are in ascending order of indices).  */
+      HOST_WIDE_INT val = tree_to_shwi (value);
+      if (!val && i + 1 < nelts)
+	continue;
+
+      /* Bail if the CTOR has a block of more than 256 embedded nuls
+	 due to implicitly initialized elements.  */
+      unsigned nchars = (idx - str.length ()) + 1;
+      if (nchars > 256)
+	return NULL_TREE;
+
+      if (nchars > 1)
+	{
+	  str.reserve (idx);
+	  str.quick_grow_cleared (idx);
+	}
+
+      if (idx > maxelts)
+	return NULL_TREE;
+
+      str.safe_insert (idx, val);
+    }
+
+  if (!nelts)
+    /* Append a nul for the empty initializer { }.  */
+    str.safe_push (0);
+
+#if 1
+  /* Build a STRING_CST with the same type as the array, which
+     may be an array of unknown bound.  */
+  tree res = build_string (str.length (), str.begin ());
+  TREE_TYPE (res) = type;
+#else
+  /* Build a string literal but return the embedded STRING_CST.  */
+  tree res = build_string_literal (str.length (), str.begin (), eltype);
+  res = TREE_OPERAND (TREE_OPERAND (res, 0), 0);
+#endif
+  return res;
+}
+
 #include "gt-c-family-c-common.h"
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index fcec95b..8a802bb 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1331,6 +1331,7 @@ extern void maybe_add_include_fixit (rich_location *, const char *);
 extern void maybe_suggest_missing_token_insertion (rich_location *richloc,
 						   enum cpp_ttype token_type,
 						   location_t prev_token_loc);
+extern tree braced_list_to_string (tree, tree, tree (*)(tree, tree) = NULL);
 
 #if CHECKING_P
 namespace selftest {
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 7a92628..5ad4f57 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -2126,6 +2126,15 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
 	      if (d != error_mark_node)
 		{
 		  maybe_warn_string_init (init_loc, TREE_TYPE (d), init);
+
+		  /* Try to convert a string CONSTRUCTOR into a STRING_CST.  */
+		  tree valtype = TREE_TYPE (init.value);
+		  if (TREE_CODE (init.value) == CONSTRUCTOR
+		      && TREE_CODE (valtype) == ARRAY_TYPE
+		      && TYPE_STRING_FLAG (TREE_TYPE (valtype)))
+		    if (tree str = braced_list_to_string (valtype, init.value))
+		      init.value = str;
+
 		  finish_decl (d, init_loc, init.value,
 			       init.original_type, asm_name);
 		}
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 78ebbde..d2c5b5d 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6282,6 +6282,30 @@ build_aggr_init_full_exprs (tree decl, tree init, int flags)
   return build_aggr_init (decl, init, flags, tf_warning_or_error);
 }
 
+/* Attempt to determine the constant VALUE of integral type and convert
+   it to TYPE, issuing narrowing warnings/errors as necessary.  Return
+   the constant result or null on failure.  Callback for
+   braced_list_to_string.  */
+
+static tree
+eval_check_narrowing (tree type, tree value)
+{
+  if (tree valtype = TREE_TYPE (value))
+    {
+      if (TREE_CODE (valtype) != INTEGER_TYPE)
+	return NULL_TREE;
+    }
+  else
+    return NULL_TREE;
+
+  value = scalar_constant_value (value);
+  if (!value)
+    return NULL_TREE;
+
+  check_narrowing (type, value, tf_warning_or_error);
+  return value;
+}
+
 /* Verify INIT (the initializer for DECL), and record the
    initialization in DECL_INITIAL, if appropriate.  CLEANUP is as for
    grok_reference_init.
@@ -6397,7 +6421,18 @@ check_initializer (tree decl, tree init, int flags, vec<tree, va_gc> **cleanups)
 	    }
 	  else
 	    {
-	      init = reshape_init (type, init, tf_warning_or_error);
+	      /* Try to convert a string CONSTRUCTOR into a STRING_CST.  */
+	      tree valtype = TREE_TYPE (decl);
+	      if (TREE_CODE (valtype) == ARRAY_TYPE
+		  && TYPE_STRING_FLAG (TREE_TYPE (valtype))
+		  && TREE_CODE (init) == CONSTRUCTOR
+		  && TREE_TYPE (init) == init_list_type_node)
+		if (tree str = braced_list_to_string (valtype, init,
+						      eval_check_narrowing))
+		  init = str;
+
+	      if (TREE_CODE (init) != STRING_CST)
+		init = reshape_init (type, init, tf_warning_or_error);
 	      flags |= LOOKUP_NO_NARROWING;
 	    }
 	}
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 7763d53..72515d9 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1056,7 +1056,9 @@ digest_init_r (tree type, tree init, int nested, int flags,
 
 	  if (TYPE_PRECISION (typ1) == BITS_PER_UNIT)
 	    {
-	      if (char_type != char_type_node)
+	      if (char_type != char_type_node
+		  && char_type != signed_char_type_node
+		  && char_type != unsigned_char_type_node)
 		{
 		  if (complain & tf_error)
 		    error_at (loc, "char-array initialized from wide string");
diff --git a/gcc/testsuite/g++.dg/init/string2.C b/gcc/testsuite/g++.dg/init/string2.C
new file mode 100644
index 0000000..5da13bd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/string2.C
@@ -0,0 +1,104 @@
+// PR tree-optimization/71625 - missing strlen optimization on different
+// array initialization style
+//
+// Verify that strlen() calls with constant character array arguments
+// initialized with string constants are folded.  (This is a small
+// subset of pr71625).
+// { dg-do compile }
+// { dg-options "-O0 -Wno-error=narrowing -fdump-tree-gimple" }
+
+#define A(expr) do { typedef char A[-1 + 2 * !!(expr)]; } while (0)
+
+/* This is undefined but accepted without -Wpedantic.  Verify that
+   the size is zero.  */
+const char ax[] = { };
+
+void size0 ()
+{
+  A (sizeof ax == 0);
+}
+
+const char a0[] = { 'a', 'b', 'c', '\0' };
+
+int len0 ()
+{
+  return __builtin_strlen (a0);
+}
+
+// Verify that narrowing warnings are preserved.
+const signed char
+sa0[] = { 'a', 'b', 255, '\0' };    // { dg-warning "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
+
+int lens0 ()
+{
+  return __builtin_strlen ((const char*)sa0);
+}
+
+const unsigned char
+ua0[] = { 'a', 'b', -1, '\0' };     // { dg-warning "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
+
+int lenu0 ()
+{
+  return __builtin_strlen ((const char*)ua0);
+}
+
+const char c = 0;
+const char a1[] = { 'a', 'b', 'c', c };
+
+int len1 ()
+{
+  return __builtin_strlen (a1);
+}
+
+template <class T>
+int tmplen ()
+{
+  static const T
+    a[] = { 1, 2, 333, 0 };         // { dg-warning "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
+  return __builtin_strlen (a);
+}
+
+template int tmplen<char>();
+
+const wchar_t ws4[] = { 1, 2, 3, 4 };
+const wchar_t ws7[] = { 1, 2, 3, 4, 0, 0, 0 };
+const wchar_t ws9[9] = { 1, 2, 3, 4, 0 };
+
+void wsize ()
+{
+  A (sizeof ws4 == 4 * sizeof *ws4);
+  A (ws4[0] == 1 && ws4[1] == 2 && ws4[2] == 3 && ws4[3] == 4);
+
+  A (sizeof ws7 == 7 * sizeof *ws7);
+  A (ws7[0] == 1 && ws7[1] == 2 && ws7[2] == 3 && ws7[3] == 4
+     && !ws7[4] && !ws7[5] && !ws7[6]);
+
+  A (sizeof ws9 == 9 * sizeof *ws9);
+  A (ws9[0] == 1 && ws9[1] == 2 && ws9[2] == 3 && ws9[3] == 4
+     && !ws9[4] && !ws9[5] && !ws9[6] && !ws9[7] && !ws9[8]);
+}
+
+#if 0
+
+// The following aren't handled.
+
+const char &cref = c;
+const char a2[] = { 'a', 'b', 'c', cref };
+
+int len2 ()
+{
+  return __builtin_strlen (a2);
+}
+
+
+const char* const cptr = &cref;
+const char a3[] = { 'a', 'b', 'c', *cptr };
+
+int len3 ()
+{
+  return __builtin_strlen (a3);
+}
+
+#endif
+
+// { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } }
diff --git a/gcc/testsuite/g++.dg/init/string3.C b/gcc/testsuite/g++.dg/init/string3.C
new file mode 100644
index 0000000..8212e81
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/string3.C
@@ -0,0 +1,35 @@
+// PR tree-optimization/71625 - missing strlen optimization on different
+// array initialization style
+//
+// Verify that strlen() call with a constant character array argument
+// initialized with non-constant elements isn't folded.
+//
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-optimized" }
+
+
+extern const char c;
+const char a0[] = { 'a', 'b', 'c', c };
+
+int len0 ()
+{
+  return __builtin_strlen (a0);
+}
+
+const char &ref = c;
+const char a1[] = { 'a', 'b', 'c', ref };
+
+int len1 ()
+{
+  return __builtin_strlen (a1);
+}
+
+const char* const ptr = &c;
+const char a2[] = { 'a', 'b', 'c', *ptr };
+
+int len2 ()
+{
+  return __builtin_strlen (a2);
+}
+
+// { dg-final { scan-tree-dump-times "strlen" 3 "optimized" } }
diff --git a/gcc/testsuite/g++.dg/init/string4.C b/gcc/testsuite/g++.dg/init/string4.C
new file mode 100644
index 0000000..5df4176
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/string4.C
@@ -0,0 +1,60 @@
+// PR tree-optimization/71625 - missing strlen optimization on different
+// array initialization style
+
+// Verify that zero-length array initialization results in the expected
+// array sizes and in the expected diagnostics.  See init-string-3.c
+// for the corresponding C test.
+
+// { dg-do compile }
+// { dg-options "-Wall -Wno-unused-local-typedefs -fpermissive" }
+
+#define A(expr) typedef char A[-1 + 2 * !!(expr)];
+
+const char a[] = { };
+
+A (sizeof a == 0);
+
+
+const char b[0] = { };
+
+A (sizeof b == 0);
+
+// Also verify that the error is "too many initializers for
+// 'const char [0]'" and not "initializer-string is too long."
+const char c[0] = { 1 };      // { dg-error "too many initializers for .const char \\\[0]" }
+
+A (sizeof c == 0);
+
+
+void test_auto_empty (void)
+{
+  const char a[] = { };
+
+  A (sizeof a == 0);
+}
+
+void test_auto_zero_length (void)
+{
+  const char a[0] = { };
+
+  A (sizeof a == 0);
+
+  const char b[0] = { 0 };    // { dg-error "too many initializers" }
+
+  A (sizeof b == 0);
+
+  const char c[0] = "";       // { dg-warning "too long" }
+
+  A (sizeof c == 0);
+}
+
+
+void test_compound_zero_length (void)
+{
+  A (sizeof (const char[]){ } == 0);
+  A (sizeof (const char[0]){ } == 0);
+  A (sizeof (const char[0]){ 0 } == 0);    // { dg-error "too many" }
+  A (sizeof (const char[0]){ 1 } == 0);    // { dg-error "too many" }
+  A (sizeof (const char[0]){ "" } == 0);   // { dg-warning "too long" }
+  A (sizeof (const char[0]){ "1" } == 0);  // { dg-warning "too long" }
+}
diff --git a/gcc/testsuite/gcc.dg/init-string-3.c b/gcc/testsuite/gcc.dg/init-string-3.c
new file mode 100644
index 0000000..e955f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/init-string-3.c
@@ -0,0 +1,58 @@
+/* PR tree-optimization/71625 - missing strlen optimization on different
+   array initialization style
+
+   Verify that zero-length array initialization results in the expected
+   array sizes.
+
+   { dg-do compile }
+   { dg-options "-Wall -Wno-unused-local-typedefs" }  */
+
+#define A(expr) typedef char A[-1 + 2 * !!(expr)];
+
+const char a[] = { };
+
+A (sizeof a == 0);
+
+
+const char b[0] = { };
+
+A (sizeof b == 0);
+
+
+const char c[0] = { 1 };    /* { dg-warning "excess elements" } */
+
+A (sizeof c == 0);
+
+
+void test_auto_empty (void)
+{
+  const char a[] = { };
+
+  A (sizeof a == 0);
+}
+
+void test_auto_zero_length (void)
+{
+  const char a[0] = { };
+
+  A (sizeof a == 0);
+
+  const char b[0] = { 0 };    /* { dg-warning "excess elements" } */
+
+  A (sizeof b == 0);
+
+  const char c[0] = "";
+
+  A (sizeof c == 0);
+}
+
+
+void test_compound_zero_length (void)
+{
+  A (sizeof (const char[]){ } == 0);
+  A (sizeof (const char[0]){ } == 0);
+  A (sizeof (const char[0]){ 0 } == 0);   /* { dg-warning "excess elements" } */
+  A (sizeof (const char[0]){ 1 } == 0);   /* { dg-warning "excess elements" } */
+  A (sizeof (const char[0]){ "" } == 0);
+  A (sizeof (const char[0]){ "1" } == 0);  /* { dg-warning "too long" } */
+}
diff --git a/gcc/testsuite/gcc.dg/strlenopt-55.c b/gcc/testsuite/gcc.dg/strlenopt-55.c
new file mode 100644
index 0000000..d5a0295
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-55.c
@@ -0,0 +1,230 @@
+/* PR tree-optimization/71625 - missing strlen optimization on different
+   array initialization style
+
+   Verify that strlen() of braced initialized array is folded
+   { dg-do compile }
+   { dg-options "-O1 -Wall -fdump-tree-gimple -fdump-tree-optimized" } */
+
+#include "strlenopt.h"
+
+#define S								\
+  "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f"	\
+  "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f"	\
+  "\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f"	\
+  "\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f"	\
+  "\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f"	\
+  "\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f"	\
+  "\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f"	\
+  "\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f"	\
+  "\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f"	\
+  "\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f"	\
+  "\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf"	\
+  "\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf"	\
+  "\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf"	\
+  "\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf"	\
+  "\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef"	\
+  "\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
+
+/* Arrays of char, signed char, and unsigned char to verify that
+   the length and contents of all are the same as that of the string
+   literal above.  */
+
+const char c256[] = {
+  S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10],
+  S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20],
+  S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30],
+  S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40],
+  S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50],
+  S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60],
+  S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70],
+  S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80],
+  S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90],
+  S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100],
+  S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109],
+  S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118],
+  S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127],
+  S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136],
+  S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145],
+  S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154],
+  S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163],
+  S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172],
+  S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181],
+  S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190],
+  S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199],
+  S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208],
+  S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217],
+  S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226],
+  S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235],
+  S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244],
+  S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253],
+  S[254], S[255] /* = NUL */
+};
+
+const signed char sc256[] = {
+  S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10],
+  S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20],
+  S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30],
+  S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40],
+  S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50],
+  S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60],
+  S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70],
+  S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80],
+  S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90],
+  S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100],
+  S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109],
+  S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118],
+  S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127],
+  S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136],
+  S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145],
+  S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154],
+  S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163],
+  S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172],
+  S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181],
+  S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190],
+  S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199],
+  S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208],
+  S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217],
+  S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226],
+  S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235],
+  S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244],
+  S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253],
+  S[254], S[255] /* = NUL */
+};
+
+const unsigned char uc256[] = {
+  S[0], S[1], S[2], S[3], S[4], S[5], S[6], S[7], S[8], S[9], S[10],
+  S[11], S[12], S[13], S[14], S[15], S[16], S[17], S[18], S[19], S[20],
+  S[21], S[22], S[23], S[24], S[25], S[26], S[27], S[28], S[29], S[30],
+  S[31], S[32], S[33], S[34], S[35], S[36], S[37], S[38], S[39], S[40],
+  S[41], S[42], S[43], S[44], S[45], S[46], S[47], S[48], S[49], S[50],
+  S[51], S[52], S[53], S[54], S[55], S[56], S[57], S[58], S[59], S[60],
+  S[61], S[62], S[63], S[64], S[65], S[66], S[67], S[68], S[69], S[70],
+  S[71], S[72], S[73], S[74], S[75], S[76], S[77], S[78], S[79], S[80],
+  S[81], S[82], S[83], S[84], S[85], S[86], S[87], S[88], S[89], S[90],
+  S[91], S[92], S[93], S[94], S[95], S[96], S[97], S[98], S[99], S[100],
+  S[101], S[102], S[103], S[104], S[105], S[106], S[107], S[108], S[109],
+  S[110], S[111], S[112], S[113], S[114], S[115], S[116], S[117], S[118],
+  S[119], S[120], S[121], S[122], S[123], S[124], S[125], S[126], S[127],
+  S[128], S[129], S[130], S[131], S[132], S[133], S[134], S[135], S[136],
+  S[137], S[138], S[139], S[140], S[141], S[142], S[143], S[144], S[145],
+  S[146], S[147], S[148], S[149], S[150], S[151], S[152], S[153], S[154],
+  S[155], S[156], S[157], S[158], S[159], S[160], S[161], S[162], S[163],
+  S[164], S[165], S[166], S[167], S[168], S[169], S[170], S[171], S[172],
+  S[173], S[174], S[175], S[176], S[177], S[178], S[179], S[180], S[181],
+  S[182], S[183], S[184], S[185], S[186], S[187], S[188], S[189], S[190],
+  S[191], S[192], S[193], S[194], S[195], S[196], S[197], S[198], S[199],
+  S[200], S[201], S[202], S[203], S[204], S[205], S[206], S[207], S[208],
+  S[209], S[210], S[211], S[212], S[213], S[214], S[215], S[216], S[217],
+  S[218], S[219], S[220], S[221], S[222], S[223], S[224], S[225], S[226],
+  S[227], S[228], S[229], S[230], S[231], S[232], S[233], S[234], S[235],
+  S[236], S[237], S[238], S[239], S[240], S[241], S[242], S[243], S[244],
+  S[245], S[246], S[247], S[248], S[249], S[250], S[251], S[252], S[253],
+  S[254], S[255] /* = NUL */
+};
+
+const __CHAR16_TYPE__ c16_4[] = {
+  1, 0x7fff, 0x8000, 0xffff,
+  0x10000   /* { dg-warning "\\\[-Woverflow]" } */
+};
+
+const char a2_implicit[2] = { };
+const char a3_implicit[3] = { };
+
+const char a3_nul[3] = { 0 };
+const char a5_nul1[3] = { [1] = 0 };
+const char a7_nul2[3] = { [2] = 0 };
+
+const char ax_2_nul[] = { '1', '2', '\0' };
+const char ax_3_nul[] = { '1', '2', '3', '\0' };
+
+const char ax_3_des_nul[] = { [3] = 0, [2] = '3', [1] = '2', [0] = '1' };
+
+const char ax_3[] = { '1', '2', '3' };
+const char a3_3[3] = { '1', '2', '3' };
+
+const char ax_100_3[] = { '1', '2', '3', [100] = '\0' };
+
+#define CONCAT(x, y) x ## y
+#define CAT(x, y) CONCAT (x, y)
+#define FAILNAME(name) CAT (call_ ## name ##_on_line_, __LINE__)
+
+#define FAIL(name) do {				\
+    extern void FAILNAME (name) (void);		\
+    FAILNAME (name)();				\
+  } while (0)
+
+/* Macro to emit a call to a function named
+   call_in_true_branch_not_eliminated_on_line_NNN()
+   for each call that's expected to be eliminated.  The dg-final
+   scan-tree-dump-times directive at the bottom of the test verifies
+   that no such call appears in the output.  */
+#define ELIM(expr)							\
+  if (!(expr)) FAIL (in_true_branch_not_eliminated); else (void)0
+
+#define T(s, n) ELIM (strlen (s) == n)
+
+void test_nulstring (void)
+{
+  T (a2_implicit, 0);
+  T (a3_implicit, 0);
+
+  T (a3_nul, 0);
+  T (a5_nul1, 0);
+  T (a7_nul2, 0);
+
+  T (ax_2_nul, 2);
+  T (ax_3_nul, 3);
+  T (ax_3_des_nul, 3);
+
+  T (ax_100_3, 3);
+  T (ax_100_3 + 4, 0);
+  ELIM (101 == sizeof ax_100_3);
+  ELIM ('\0' == ax_100_3[100]);
+
+  /* Verify that all three character arrays have the same length
+     as the string literal they are initialized with.  */
+  T (S, 255);
+  T (c256, 255);
+  T ((const char*)sc256, 255);
+  T ((const char*)uc256, 255);
+
+  /* Verify that all three character arrays have the same contents
+     as the string literal they are initialized with.  */
+  ELIM (0 == memcmp (c256, S, sizeof c256));
+  ELIM (0 == memcmp (c256, (const char*)sc256, sizeof c256));
+  ELIM (0 == memcmp (c256, (const char*)uc256, sizeof c256));
+
+  ELIM (0 == strcmp (c256, (const char*)sc256));
+  ELIM (0 == strcmp (c256, (const char*)uc256));
+
+  /* Verify that the char16_t array has the expected contents.  */
+  ELIM (c16_4[0] == 1 && c16_4[1] == 0x7fff
+	&& c16_4[2] == 0x8000 && c16_4[3] == 0xffff
+	&& c16_4[4] == 0);
+}
+
+/* Verify that excessively large initializers don't run out of
+   memory.  Also verify that they have the expected size and
+   contents.  */
+
+#define MAX (__PTRDIFF_MAX__ - 1)
+
+const char large_string[] = { 'a', [1234] = 'b', [MAX] = '\0' };
+
+const void test_large_string_size (void)
+{
+  ELIM (sizeof large_string == MAX + 1);
+
+  /* The following expressions are not folded without optimization.  */
+  ELIM ('a'  == large_string[0]);
+  ELIM ('\0' == large_string[1233]);
+  ELIM ('b'  == large_string[1234]);
+  ELIM ('\0' == large_string[1235]);
+  ELIM ('\0' == large_string[MAX - 1]);
+}
+
+
+/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "memcmp" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "strcmp" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "call_in_true_branch_not_eliminated" 0 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/strlenopt-56.c b/gcc/testsuite/gcc.dg/strlenopt-56.c
new file mode 100644
index 0000000..39a532b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/strlenopt-56.c
@@ -0,0 +1,50 @@
+/* PR tree-optimization/71625 - conversion of braced initializers to strings
+   Verify that array elements have the expected values regardless of sign
+   and non-ASCII execution character set.
+   { dg-do compile }
+   { dg-require-iconv "IBM1047" }
+   { dg-options "-O -Wall -fexec-charset=IBM1047 -fdump-tree-gimple -fdump-tree-optimized" } */
+
+#include "strlenopt.h"
+
+const char a[] = { 'a', 129, 0 };
+const signed char b[] = { 'b', 130, 0 };
+const unsigned char c[] = { 'c', 131, 0 };
+
+const char s[] = "a\201";
+const signed char ss[] = "b\202";
+const unsigned char us[] = "c\203";
+
+
+#define A(expr)   ((expr) ? (void)0 : __builtin_abort ())
+
+void test_values (void)
+{
+  A (a[0] == a[1]);
+  A (a[1] == 'a');
+
+  A (b[0] == b[1]);
+  A (b[1] == (signed char)'b');
+
+  A (c[0] == c[1]);
+  A (c[1] == (unsigned char)'c');
+}
+
+void test_lengths (void)
+{
+  A (2 == strlen (a));
+  A (2 == strlen ((const char*)b));
+  A (2 == strlen ((const char*)c));
+}
+
+void test_contents (void)
+{
+  A (0 == strcmp (a, s));
+  A (0 == strcmp ((const char*)b, (const char*)ss));
+  A (0 == strcmp ((const char*)c, (const char*)us));
+}
+
+
+/* { dg-final { scan-tree-dump-times "strlen" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "strcmp" 0 "gimple" } }
+   { dg-final { scan-tree-dump-times "abort" 0 "optimized" } } */