On Fri, Jan 8, 2021 at 9:16 PM Martin Sebor <mse...@gmail.com> wrote: > > On 1/8/21 12:51 AM, Richard Biener wrote: > > On Thu, Jan 7, 2021 at 10:41 PM Martin Sebor <mse...@gmail.com> wrote: > >> > >> The test case in PR 98465 brings to light a problem we've discussed > >> before (e.g., PR 93971) where a standard container (std::string in > >> this case but the problem applies to any class that owns and manages > >> allocated memory) might trigger warnings for unreachable code. > >> The code is not eliminated due to a missing aliasing constraint: > >> because GCC doesn't know that the member pointer to the memory > >> managed by the container cannot alias other objects, it emits code > >> that can never be executed in a valid program and that's prone to > >> causing false positives. > >> > >> To illustrate, at the moment it's impossible to fold away the assert > >> below because there's no way to determine in the middle end that > >> String::s cannot point to a: > >> > >> extern char array[]; > >> > >> class String { > >> char *s; > >> public: > >> String (const char *p): s (strdup (p)) { } > >> String (const String &str): s (strdup (str.s)) { } > >> ~String () { free (s); } > >> > >> void f () { assert (s != array); } > >> }; > >> > >> The constraint is obvious to a human reader (String::s is private > >> and nothing sets it to point to array) but there's no way for GCC > >> to infer it from the code alone (at least not in general): there > >> could be member or friend functions defined in other translation > >> units that violate this assumption. > >> > >> One way to solve the problem is to explicitly declare that > >> String::s, in fact, doesn't point to any such objects and that it > >> only ever points to allocated memory. My idea for doing that is > >> to extend attribute malloc to (or add a new attribute for) pointer > >> variables to imply that the pointer only points to allocated memory. > >> > >> However, besides pointing to allocated memory, std::string can also > >> point to its own internal buffer, so the extended malloc attribute > >> couldn't be used there by itself. I think this could be solved by > >> also either extending the may_alias attribute or adding a new > >> "alias" (or some such) attribute to denote that a pointer variable > >> may point to an object or subobject. > >> > >> Putting the two together, to eliminate the assert, std::string would > >> be annotated like so: > >> > >> class string { > >> char *s __attribute__ ((malloc, may_alias (buf))); > >> char buf[8]; > >> public: > >> string (): s (buf) { } > >> string (const char *p): s (strdup (p)) { } > >> string (const string &str): s (strdup (str.s)) { } > >> ~string () { if (s != buf) free (s); } > >> > >> void f () { assert (s != array); } > >> }; > >> > >> The may_alias association with members is relative to the this pointer > >> (i.e., as if by may_alias (this->buf), as opposed to being taken as > >> may_alias (String::buf) and meaning that s might be equal to any other > >> String::s with a different this. To help avoid mistakes, setting s > >> in violation of the constraints would trigger warnings. > >> > >> If this sounds reasonable I'm prepared to prototype it, either for > >> GCC 11 if it's in scope to solve the PR and there's still time, or > >> (I suspect more likely) for GCC 12. > >> > >> Richard, what are your thoughts/concerns? > > > > I'm not sure it's feasible to make use of this attribute. First > > there's the malloc part which has difficult semantics (similar > > to restrict) when generating PTA constraints. We might see > > > > _1 = str.s; > > _2 = str.s; > > > > but are of course required to associate the same allocated > > dummy object with both pointers (as opposed to when we'd > > see two malloc calls). What would possibly work is to > > have the object keyed on the field decl, but then for > > > > _1 = p_to_str_4(D); > > _2 = _1 + offsetof-s; > > _3 = *_2; > > > > we have to somehow conservatively arrive at the same object. > > I don't see how that can work out. > > > > All the same applies to the may_alias part but I guess when the > > malloc part falls apart that's not of much interest. > > > > So I'm concerned about correctness - I'm sure you can hack > > sth together to get some testcases optimized. But I'm not sure > > you can make it correct in all cases (within the current PTA > > framework). > > Thanks for the feedback. > > Absent some source level annotation I can't think of a good way > to avoid these false positives. Do you have any other ideas? > > If not, would you be opposed to introducing these attributes to > suppress warnings (at least at first)? Besides avoiding the false > positives, implementing just that part might also be a good proof > of concept for the aliasing solution (or a confirmation of your > intuition).
I don't think "overloading" malloc and alias attributes in the suggested way is good. Thinking of a general way to communicate restrictions of what a pointer/field points to would be nice though, with all of the above caveats in mind. Richard. > > Martin > > > > > Richard. > > > >> Martin > >> > >> PS An alternate solution might be to provide a late-evaluated built-in, > >> something like > >> > >> <tri-state> __builtin_decl (T *ptr) > >> > >> that would return a <yes> answer if ptr could be determined to point > >> to a declared object or subobject, a <no> if not (e.g., it points to > >> allocated storage), and a <don't know> if it couldn't be determined. > >> The built-in would then be used in code to eliminate infeasible > >> paths. For example, a built-in like that could be used to eliminate > >> the assert in string::f(): > >> > >> void string::f () > >> { > >> if (<yes> == __builtin_decl_p (s) && s != buf) > >> __builtin_unreachable (); > >> > >> assert (s != array); > >> } > >> > >> A built-in might be more flexible but would also be harder to use > >> (and likely more error-prone). >