Re: [RFC] restricting aliasing by standard containers (PR 98465)

Richard Biener via Gcc Sun, 10 Jan 2021 23:44:08 -0800

On Fri, Jan 8, 2021 at 9:16 PM Martin Sebor <mse...@gmail.com> wrote:
>
> On 1/8/21 12:51 AM, Richard Biener wrote:
> > On Thu, Jan 7, 2021 at 10:41 PM Martin Sebor <mse...@gmail.com> wrote:
> >>
> >> The test case in PR 98465 brings to light a problem we've discussed
> >> before (e.g., PR 93971) where a standard container (std::string in
> >> this case but the problem applies to any class that owns and manages
> >> allocated memory) might trigger warnings for unreachable code.
> >> The code is not eliminated due to a missing aliasing constraint:
> >> because GCC doesn't know that the member pointer to the memory
> >> managed by the container cannot alias other objects, it emits code
> >> that can never be executed in a valid program and that's prone to
> >> causing false positives.
> >>
> >> To illustrate, at the moment it's impossible to fold away the assert
> >> below because there's no way to determine in the middle end that
> >> String::s cannot point to a:
> >>
> >>     extern char array[];
> >>
> >>     class String {
> >>       char *s;
> >>     public:
> >>        String (const char *p): s (strdup (p)) { }
> >>        String (const String &str): s (strdup (str.s)) { }
> >>        ~String () { free (s); }
> >>
> >>        void f () { assert (s != array); }
> >>     };
> >>
> >> The constraint is obvious to a human reader (String::s is private
> >> and nothing sets it to point to array) but there's no way for GCC
> >> to infer it from the code alone (at least not in general): there
> >> could be member or friend functions defined in other translation
> >> units that violate this assumption.
> >>
> >> One way to solve the problem is to explicitly declare that
> >> String::s, in fact, doesn't point to any such objects and that it
> >> only ever points to allocated memory.  My idea for doing that is
> >> to extend attribute malloc to (or add a new attribute for) pointer
> >> variables to imply that the pointer only points to allocated memory.
> >>
> >> However, besides pointing to allocated memory, std::string can also
> >> point to its own internal buffer, so the extended malloc attribute
> >> couldn't be used there by itself.  I think this could be solved by
> >> also either extending the may_alias attribute or adding a new
> >> "alias" (or some such) attribute to denote that a pointer variable
> >> may point to an object or subobject.
> >>
> >> Putting the two together, to eliminate the assert, std::string would
> >> be annotated like so:
> >>
> >>     class string {
> >>       char *s __attribute__ ((malloc, may_alias (buf)));
> >>       char buf[8];
> >>     public:
> >>        string (): s (buf) { }
> >>        string (const char *p): s (strdup (p)) { }
> >>        string (const string &str): s (strdup (str.s)) { }
> >>        ~string () { if (s != buf) free (s); }
> >>
> >>        void f () { assert (s != array); }
> >>     };
> >>
> >> The may_alias association with members is relative to the this pointer
> >> (i.e., as if by may_alias (this->buf), as opposed to being taken as
> >> may_alias (String::buf) and meaning that s might be equal to any other
> >> String::s with a different this.  To help avoid mistakes, setting s
> >> in violation of the constraints would trigger warnings.
> >>
> >> If this sounds reasonable I'm prepared to prototype it, either for
> >> GCC 11 if it's in scope to solve the PR and there's still time, or
> >> (I suspect more likely) for GCC 12.
> >>
> >> Richard, what are your thoughts/concerns?
> >
> > I'm not sure it's feasible to make use of this attribute.  First
> > there's the malloc part which has difficult semantics (similar
> > to restrict) when generating PTA constraints.  We might see
> >
> >   _1 = str.s;
> >   _2 = str.s;
> >
> > but are of course required to associate the same allocated
> > dummy object with both pointers (as opposed to when we'd
> > see two malloc calls).  What would possibly work is to
> > have the object keyed on the field decl, but then for
> >
> >   _1 = p_to_str_4(D);
> >   _2 = _1 + offsetof-s;
> >   _3 = *_2;
> >
> > we have to somehow conservatively arrive at the same object.
> > I don't see how that can work out.
> >
> > All the same applies to the may_alias part but I guess when the
> > malloc part falls apart that's not of much interest.
> >
> > So I'm concerned about correctness - I'm sure you can hack
> > sth together to get some testcases optimized.  But I'm not sure
> > you can make it correct in all cases (within the current PTA
> > framework).
>
> Thanks for the feedback.
>
> Absent some source level annotation I can't think of a good way
> to avoid these false positives.  Do you have any other ideas?
>
> If not, would you be opposed to introducing these attributes to
> suppress warnings (at least at first)?  Besides avoiding the false
> positives, implementing just that part might also be a good proof
> of concept for the aliasing solution (or a confirmation of your
> intuition).


I don't think "overloading" malloc and alias attributes in the
suggested way is good.  Thinking of a general way to communicate
restrictions of what a pointer/field points to would be nice
though, with all of the above caveats in mind.

Richard.

>
> Martin
>
> >
> > Richard.
> >
> >> Martin
> >>
> >> PS An alternate solution might be to provide a late-evaluated built-in,
> >> something like
> >>
> >>     <tri-state> __builtin_decl (T *ptr)
> >>
> >> that would return a <yes> answer if ptr could be determined to point
> >> to a declared object or subobject, a <no> if not (e.g., it points to
> >> allocated storage), and a <don't know> if it couldn't be determined.
> >>    The built-in would then be used in code to eliminate infeasible
> >> paths.  For example, a built-in like that could be used to eliminate
> >> the assert in string::f():
> >>
> >>     void string::f ()
> >>     {
> >>       if (<yes> == __builtin_decl_p (s) && s != buf)
> >>         __builtin_unreachable ();
> >>
> >>       assert (s != array);
> >>     }
> >>
> >> A built-in might be more flexible but would also be harder to use
> >> (and likely more error-prone).
>

Re: [RFC] restricting aliasing by standard containers (PR 98465)

Reply via email to