On 08/02/2018 04:22 AM, Bernd Edlinger wrote:
Hi,

this is an update of the patch to prevent unsafe optimizations due to strlen 
range assuming
always zero-terminated char arrays.

Since the previous version I do no longer try to locate the outermost char 
array,
but just bail out if there is a typecast, that means, supposed we have a 
2-dimensional
array, char a[x][y], strlen (s.a[x]) may assume a[x] is zero-terminated, if the 
optimization
is enabled.  strlen ((char*)s.a) involves a type cast and should assume nothing.

Additionally due to the discussion, I came to the conclusion that this strlen 
range optimization
should only be used when enabled with -fassume-zero-terminated-char-arrays or 
-Ofast.

Note that I included the test case with the removed assertion as
gcc/testsuite/gcc.dg/strlenopt-57.c

Initially it would only miscompile with -Ofast, but with r263018 aka
PR tree-optimization/86043, PR tree-optimization/86042 it was miscompiled with 
-O3 as well.

I located the reason in gcc/tree-ssa-strlen.c (get_min_string_length) which did 
not
check if the string constant is in fact zero terminated:

@@ -3192,7 +3194,9 @@ get_min_string_length (tree rhs, bool *full_string
       && TREE_READONLY (rhs))
     rhs = DECL_INITIAL (rhs);

-  if (rhs && TREE_CODE (rhs) == STRING_CST)
+  if (rhs && TREE_CODE (rhs) == STRING_CST
+      && compare_tree_int (TYPE_SIZE_UNIT (TREE_TYPE (rhs)),
+                          TREE_STRING_LENGTH (rhs)) >= 0)
     {
       *full_string_p = true;
       return strlen (TREE_STRING_POINTER (rhs));

Fortunately this also shows a way how to narrow strlen return value 
expectations when
we are able to positively prove that a string must be zero terminated.



Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?

The warning bits are definitely not okay by me.  The purpose
of the warnings (-W{format,sprintf}-{overflow,truncation} is
to detect buffer overflows.  When a warning doesn't have access
to string length information for dynamically created strings
(like the strlen pass does) it uses array sizes as a proxy.
This is useful both to detect possible buffer overflows and
to prevent false positives for overflows that cannot happen
in correctly written programs.

By changing the logic to not consider the bounds of the array
the warnkngs will become prone to many false positives as is
evident from your changes to the tests (you had to explicitly
enable the new option).

In effect, the patch undoes a couple of years worth of fine
tuning of the warnings to strike a balance between true and
false positives.  That's completely unacceptable to me, and,
frankly, proposals along these lines are exceedingly
disruptive.

Beyond that, I don't have a problem with adding a new option
but I continue to strongly disagree with disabling the strlen
optimization by default.  It implies that by default GCC
targets invalid code.  If there is consensus that some new
option is necessary, the right setting is on by default,
along with warnings for programs that pass non-terminated
arrays to string functions.  I have already submitted a patch
that effect for strlen and const arrays and am working on
extending it to other built-in functions and dynamically
created strings (when I have a chance between defending
my past work).

Either way, if a new option is introduced, the array bound
computation for sprintf (and all other buffer overflow
warnings, likely including -Wrestrict) will need to be
decoupled from the option so the current behavior is
preserved.

As I have also said, the area of the provenance of subobject
vs enclosing object pointers is under discussion in WG14 and
the C Object Model study group.  The informal consensus so
far is to maintain the provenance of subobjects and introduce
some mechanism to make it possible to extend the provenance
to the enclosing object.  This would be a pervasive change,
not one just limited to strings.  So if introducing some new
option at this stage is thought to be necessary this should
be taken into consideration.

Martin

Reply via email to