On 1/11/22 15:40, Jason Merrill wrote:
On 11/30/21 17:32, Martin Sebor via Gcc-patches wrote:
Attached is a revised patch with the following changes based
on your comments:

1) Set and use statement uids to determine which statement
    precedes which in the same basic block.
2) Avoid testing flag_isolate_erroneous_paths_dereference.
3) Use post-dominance to decide whether to use the "maybe"
    phrasing vs a definite form.

David raised (and in our offline discussion today reiterated)
an objection to the default setting of the option being
the strictest.  I have not changed that in this revision.
See my rationale for this choice in my reply below:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583176.html

In the latest C2x draft I see in the list of undefined behavior

"The value of a pointer that refers to space deallocated by a call to the free or realloc function is used (7.22.3)."

So the case that would be technically undefined would be comparing the reallocated pointer to the old pointer which has been deallocated.

The C++ draft is more nuanced: it says, "When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values (6.8.3).  Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior.  Any other use of an invalid pointer value has implementation-defined behavior."

So the case above is implementation-defined in C++, not undefined.

Let's put =2 in -Wall for now.

With that change, this and the -Wdangling-pointer patch are OK on Friday afternoon if there are no other comments before then.
I've adjusted both patches and pushed r12-6605 and r12-6606 after
retesting with a few packages.  In -Wdangling-pointer I reworded
the warning to refer to "unnamed temporary" instead of "compound
literal" since it triggers for those as well.

While rebuilding LLVM and a few its projects with the patches
I noticed the -Wdangling-pointer change exposes what seems like
a Ranger bug for one translation unit (it enters an infinite loop):
I opened pr104038 for it.

Martin


On 11/23/21 2:16 PM, Martin Sebor wrote:
On 11/22/21 6:32 PM, Jeff Law wrote:


On 11/1/2021 4:17 PM, Martin Sebor via Gcc-patches wrote:
Patch 1 in the series detects a small subset of uses of pointers
made indeterminate by calls to deallocation functions like free
or C++ operator delete.  To control the conditions the warnings
are issued under the new -Wuse-after-free= option provides three
levels.  At the lowest level the warning triggers only for
unconditional uses of freed pointers and doesn't warn for uses
in equality expressions.  Level 2 warns also for come conditional
uses, and level 3 also for uses in equality expressions.

I debated whether to make level 2 or 3 the default included in
-Wall.  I decided on 3 for two reasons: 1) to raise awareness
of both the problem and GCC's new ability to detect it: using
a pointer after it's been freed, even only in principle, by
a successful call to realloc, is undefined, and 2) because
it's trivial to lower the level either globally, or locally
by suppressing the warning around such misuses.

I've tested the patch on x86_64-linux and by building Glibc
and Binutils/GDB.  It triggers a number of times in each, all
due to comparing invalidated pointers for equality (i.e., level
3).  I have suppressed these in GCC (libiberty) by a #pragma,
and will see how the Glibc folks want to deal with theirs (I
track them in BZ #28521).

The tests contain a number of xfails due to limitations I'm
aware of.  I marked them pr?????? until the patch is approved.
I will open bugs for them before committing if I don't resolve
them in a followup.

Martin

gcc-63272-1.diff

Add -Wuse-after-free.

gcc/c-family/ChangeLog

    * c.opt (-Wuse-after-free): New options.

gcc/ChangeLog:

    * diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle
    OPT_Wreturn_local_addr and OPT_Wuse_after_free_.
    * diagnostic-spec.h (NW_DANGLING): New enumerator.
    * doc/invoke.texi (-Wuse-after-free): Document new option.
    * gimple-ssa-warn-access.cc (pass_waccess::check_call): Rename...
    (pass_waccess::check_call_access): ...to this.
    (pass_waccess::check): Rename...
    (pass_waccess::check_block): ...to this.
    (pass_waccess::check_pointer_uses): New function.
    (pass_waccess::gimple_call_return_arg): New function.
    (pass_waccess::warn_invalid_pointer): New function.
    (pass_waccess::check_builtin): Handle free and realloc.
    (gimple_use_after_inval_p): New function.
    (get_realloc_lhs): New function.
    (maybe_warn_mismatched_realloc): New function.
    (pointers_related_p): New function.
    (pass_waccess::check_call): Call check_pointer_uses.
    (pass_waccess::execute): Compute and free dominance info.

libcpp/ChangeLog:

    * files.c (_cpp_find_file): Substitute a valid pointer for
    an invalid one to avoid -Wuse-0after-free.

libiberty/ChangeLog:

    * regex.c: Suppress -Wuse-after-free.

gcc/testsuite/ChangeLog:

    * gcc.dg/Wmismatched-dealloc-2.c: Avoid -Wuse-after-free.
    * gcc.dg/Wmismatched-dealloc-3.c: Same.
    * gcc.dg/attr-alloc_size-6.c: Disable -Wuse-after-free.
    * gcc.dg/attr-alloc_size-7.c: Same.
    * c-c++-common/Wuse-after-free-2.c: New test.
    * c-c++-common/Wuse-after-free-3.c: New test.
    * c-c++-common/Wuse-after-free-4.c: New test.
    * c-c++-common/Wuse-after-free-5.c: New test.
    * c-c++-common/Wuse-after-free-6.c: New test.
    * c-c++-common/Wuse-after-free-7.c: New test.
    * c-c++-common/Wuse-after-free.c: New test.
    * g++.dg/warn/Wdangling-pointer.C: New test.
    * g++.dg/warn/Wmismatched-dealloc-3.C: New test.
    * g++.dg/warn/Wuse-after-free.C: New test.

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 63fc27a1487..2065402a2b9 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc

@@ -3397,33 +3417,460 @@ pass_waccess::maybe_check_dealloc_call (gcall *call)
      }
  }
+/* Return true if either USE_STMT's basic block (that of a pointer's use) +   is dominated by INVAL_STMT's (that of a pointer's invalidating statement, +   which is either a clobber or a deallocation call), or if they're in
+   the same block, USE_STMT follows INVAL_STMT.  */
+
+static bool
+gimple_use_after_inval_p (gimple *inval_stmt, gimple *use_stmt,
+              bool last_block = false)
+{
+  tree clobvar =
+    gimple_clobber_p (inval_stmt) ? gimple_assign_lhs (inval_stmt) : NULL_TREE;
+
+  basic_block inval_bb = gimple_bb (inval_stmt);
+  basic_block use_bb = gimple_bb (use_stmt);
+
+  if (inval_bb != use_bb)
+    {
+      if (dominated_by_p (CDI_DOMINATORS, use_bb, inval_bb))
+    return true;
+
+      if (!clobvar || !last_block)
+    return false;
+
+      auto gsi = gsi_for_stmt (use_stmt);
+
+      auto_bitmap visited;
+
+      /* A use statement in the last basic block in a function or one that
+     falls through to it is after any other prior clobber of the used
+     variable unless it's followed by a clobber of the same variable. */
+      basic_block bb = use_bb;
+      while (bb != inval_bb
+         && single_succ_p (bb)
+         && !(single_succ_edge (bb)->flags & (EDGE_EH|EDGE_DFS_BACK)))
+    {
+      if (!bitmap_set_bit (visited, bb->index))
+        /* Avoid cycles. */
+        return true;
+
+      for (; !gsi_end_p (gsi); gsi_next_nondebug (&gsi))
+        {
+          gimple *stmt = gsi_stmt (gsi);
+          if (gimple_clobber_p (stmt))
+        {
+          if (clobvar == gimple_assign_lhs (stmt))
+            /* The use is followed by a clobber.  */
+            return false;
+        }
+        }
+
+      bb = single_succ (bb);
+      gsi = gsi_start_bb (bb);
+    }
+
+      return bb == EXIT_BLOCK_PTR_FOR_FN (cfun);
+    }
?!?  I would have thought the block dominance test plus checking UIDs if the two statements are in the same block would be all you need. Can you elaborate more on what that hunk above is trying to do?

The loop is entered only for -Wdangling-pointer.  It looks for
the first clobber of the CLOBVAR variable (one whose clobber
statement has been seen during the CFG traversal and whose use
is being validated) in the successors along the single edge
from the use block.  If the search finds a clobber, the use
is valid.  If it doesn't, the use is one of a variable having
gone out of scope (the clobber must be before the use).

Among the cases the loop handles is the one in PR 63272
(the request for -Wdangling-pointer) where the use neither
follows the clobber in the same block nor dominated by it.

There may be a way to optimize it somehow but because it's
a search I don't think a simple UID check alone would be
enough.

+
+  for (auto si = gsi_for_stmt (inval_stmt); !gsi_end_p (si);
+       gsi_next_nondebug (&si))
+    {
+      gimple *stmt = gsi_stmt (si);
+      if (stmt == use_stmt)
+    return true;
+    }
+
+  return false;
+}
So from a compile-time standpoint, would it be better to to assign UIDs to each statement so that within a block you can just compare the UIDs? That's a pretty standard way to deal with the problem of statement domination within a block if we're going to be doing multiple queries.

I'd considered it but because statement UIDs don't exist at
the start of a pass, assigning them means either traversing all
statements in the whole CFG first, even in functions with no
deallocation calls or clobbers, or doing it lazily, after
the first such statement has been seen.  It might ultimately
be worthwhile if more warnings(*) end up relying on it but at
this point I'm not sure the optimization wouldn't end up slowing
things down on average.

For some data, in a GCC bootstrap, each statement visited by
this loop is visited on average twice (2.2 times), and
the average sequence of statements traversed by the loop is
2.65, with a maximum of 22 times and 18 statements, respectively.
So still not sure it would be a win.

Let me know if this is something you think I need to pursue at
this stage.

[*] I think simple memory/resource leak detection might perhaps
be one.

+
+/* Return true if P and Q point to the same object, and false if they
+   either don't or their relationship cannot be determined.  */
+
+static bool
+pointers_related_p (gimple *stmt, tree p, tree q, pointer_query &qry)
+{
+  if (!ptr_derefs_may_alias_p (p, q))
+    return false;
Hmm, I guess that you don't need to worry about the case where P and Q point to different elements within an array.  They point to different final objects, though they do share a common enclosing object. Similarly for P & Q pointing to different members within a structure.

Right.  The if statement is an optimization to avoid having to
determine the identity of the complete objects that P and Q
point to.  That's done by the calls to get_ref() below (for
complete objects; as you note, we don't care about subobjects
for this).

+
+/* For a STMT either a call to a deallocation function or a clobber, warn +   for uses of the pointer PTR it was called with (including its copies
+   or others derived from it by pointer arithmetic).  */
+
+void
+pass_waccess::check_pointer_uses (gimple *stmt, tree ptr)
+{
+  gcc_assert (TREE_CODE (ptr) == SSA_NAME);
+
+  const bool check_dangling = !is_gimple_call (stmt);
+  basic_block stmt_bb = gimple_bb (stmt);
+
+  /* If the deallocation (or clobber) statement dominates more than
+     a single basic block issue a "maybe" k
That seems wrong.   What you're looking for is a post-dominance relationship I think.   If the sink (free/delete) is post-dominated by the use, then it's a "must", if it's not post-dominated, then it's a maybe.  Of course, that means you need to build post-dominators.

I'm sure you're right in general.  To avoid false positives
the warning is very simplistic and only considers straight
paths through the CFG, so I'm not sure this matters.  But
I'm fine with using the post-dominance test instead if you
thin it's worthwhile (it doesn't change any tests).

+
+      if (check_dangling
+          && gimple_code (use_stmt) == GIMPLE_RETURN
+          && optimize && flag_isolate_erroneous_paths_dereference)
+        /* Avoid interfering with -Wreturn-local-addr (which runs only
+           with optimization enabled).  */
+        continue;
Umm, that looks like a hack.  I can't think of a good reason why removal of erroneous paths should gate any of this code.  ISTM that you're likely papering over a problem elsewhere.

This code avoids issuing -Wdangling-pointer for problems that
will later be diagnosed by -Wreturn-local-addr.  E.g., in this
case from Wreturn-local-addr-2.c:

   ATTR (noipa) void*
   return_array_plus_var (int i)
   {
       int a[32];
       void *p = a + i;
     sink (p);
     return p;         /* { dg-warning "function returns address of local" } */
   }

Without the test we'd end up with

   warning: using dangling pointer ‘p’ to ‘a’ [-Wdangling-pointer=]

in addition to -Wreturn-local-addr (and a whole slew of
failures in the -Wreturn-local-addr tests).

-Wreturn-local-addr only runs when
flag_isolate_erroneous_paths_dereference is nonzero, so
the conditional makes sure -Wdangling-pointer is issued when
either the flag or -Wreturn-local-addr is disabled.  I think
that works as expected (i.e., there's no problem elsewhere).

I could have the code issue -Wdangling-pointer and suppress
-Wreturn-local-addr but that doesn't seem right since
the pointer hasn't gone out of scope yet at the point it's
returned.

Alternatively, I could change this instance of
-Wdangling-pointer to -Wreturn-local-addr but that also
doesn't seem like good design since we have a whole pass
dedicated to the latter warning.

I can't think of any other more elegant solutions but I'm open
to suggestions.

Martin



Reply via email to