On Tue, Feb 21, 2012 at 4:02 PM, Tijl Coosemans <t...@coosemans.org> wrote:
> On Tuesday 21 February 2012 10:19:15 Richard Guenther wrote:
>> On Mon, Feb 20, 2012 at 8:55 PM, Tijl Coosemans <t...@coosemans.org> wrote:
>>> On Monday 9 January 2012 10:05:08 Richard Guenther wrote:
>>>> Since GCC 4.4 applying the malloc attribute to realloc-like
>>>> functions does not work under the documented constraints because
>>>> the contents of the memory pointed to are not properly transferred
>>>> from the realloc argument (or treated as pointing to anything,
>>>> like 4.3 behaved).
>>>>
>>>> The following adjusts documentation to reflect implementation
>>>> reality (we do have an implementation detail that treats the
>>>> memory blob returned for non-builtins as pointing to any global
>>>> variable, but that is neither documented nor do I plan to do
>>>> so - I presume it is to allow allocation + initialization
>>>> routines to be marked with malloc, but even that area looks
>>>> susceptible to misinterpretation to me).
>>>>
>>>> Any comments?
>>>
>>> The new text says the memory must be undefined, but gives calloc as an
>>> example for which the memory is defined to be zero. Also, GCC has
>>> built-ins for strdup and strndup with the malloc attribute and GLIBC
>>> further adds it to wcsdup (wchar_t version of strdup) and tempnam. In
>>> all of these cases the memory is defined.
>>>
>>> Isn't the reason the attribute doesn't apply to realloc simply because
>>> the returned pointer may alias the one given as argument, rather than
>>> having defined memory content?
>>
>> The question is really what the alias-analysis code can derive from a
>> function that is declared with the malloc attribute. The most useful
>> property for alias analysis would be that the non-aliasing holds
>> transitively, thus reading (with any level of indirection) from the returned
>> pointer does not produce memory that is aliased by any other pointer.
>> That's what happens for 'malloc' (also for 'calloc' - you can't do any
>> further indirections through the NULL pointers the memory holds). It
>> does not happen for realloc. Currently the alias-analysis code does
>> assume exactly this property (only very slightly weakened, possibly
>> because we broke some code I guess).
>>
>> Internally, all builtins with interesting allocation properties are handled
>> explicitly, so we probably should not rely on the malloc attribute present
>> on those (and maybe simply drop it there).
>>
>> The question is really what is useful for users, and what's the most natural
>> behavior? For example
>>
>> int **my_initialized_malloc (int *p)
>> {
>>   int **q = malloc (sizeof (int *));
>>   *q = p;
>>   return q;
>> }
>>
>> would not qualify for the 'malloc' attribute (but we've taken measures to not
>> miscompile this kind of code, it seems to be a very common misconception
>> to annotate these with 'malloc').
>>
>> I'm not sure how to exactly constrain the documentation for 'malloc' better.
>> Maybe
>>
>>   The @code{malloc} attribute is used to tell the compiler that a function
>>   may be treated as if any non-@code{NULL} pointer it returns cannot
>>   alias any other pointer valid when the function returns and that the memory
>>   does not contain any pointer value.
>>
>> ? Because that is what is relevant. That you can in no way extract
>> a pointer value from the memory pointed to by the return value. Because
>> alias analysis will assume any such extracted pointer value points
>> nowhere (so, extracting a NULL pointer is ok).
>>
>> The reasoning why the string functions have the malloc attribute was
>> probably that strings do not contain pointer values. Of course they
>> can, you can store a character encoding of a pointer, copy the
>> string and decode it from the copy again. We'd miscompile then
>>
>>   int i = 1;
>>   int *p = &i;
>>   char ptr[16];
>>   ... inline encode p into ptr ...
>>   char *x = strdup (ptr);
>>   int *q = ... inline decode x to q
>>   *q = 2;
>>   return i;
>>
>> to return 1 because we do not see that q may point to i. Of course
>> we properly handle the transfer of pointers for str[n]dup, so the
>> 'malloc' attribute on it is a lie...
>
> Thanks, that was very informative.
>
> Is it correct to say that the attribute applies to deep copies, but not to
> shallow ones?

No, see below

> How about the following text:
>
> @item malloc
> @cindex @code{malloc} attribute
> The @code{malloc} attribute is used to tell the compiler that a pointer
> returned by a function is either @code{NULL} or points to a newly
> allocated object and that any pointer within that object is either
> uninitialised, @code{NULL} or pointing to a newly allocated object for
> which the same conditions hold recursively.

The '.. or pointing to a newly allocated object for which the same
conditions hold recursively' is not what is implemented. What is
implemented is '.. or pointing to global memory', but I don't really
want to document this as this implementation detail may change
(and what is considered 'global memory' would deserve its own
complicated description).

> The compiler assumes that
> existing variables and memory cannot be accessed through the returned
> pointer which will often improve optimization.

Maybe '... cannot be accessed directly or indirectly through the ...'

> Standard functions with this property include @code{malloc} and
> @code{calloc}. @code{realloc}-like functions do not have this
> property as the returned pointer may alias the one given as argument
> or the memory pointed to may contain initialised pointers.
> @code{strdup}-like functions have this property as long as the string
> does not encode a memory address. More generally the attribute applies
> to deep memory copies, but not to shallow ones.

I'd remove the last two sentences - they probably add more confusion
than clarification.

Does that make sense?

Thanks,
Richard.
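
For readers following the thread, the distinction debated above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: the function names alloc_zeroed_ints, alloc_cell_pointing_to and write_through_cell are hypothetical and do not appear in the patch or in GCC, and the `__attribute__ ((malloc))` spelling assumes a GCC-compatible compiler. The first helper returns fresh memory holding no pointer values, which fits the wording proposed in the thread; the second mirrors my_initialized_malloc and is deliberately left unannotated because the returned object holds a pointer to existing memory.

```c
#include <stdlib.h>

/* Hypothetical helper: returns fresh, zero-filled storage.  No pointer
   value can be extracted from it, so annotating it with the malloc
   attribute matches the constraint discussed in the thread.  */
__attribute__ ((malloc))
int *alloc_zeroed_ints (size_t n)
{
  return calloc (n, sizeof (int));
}

/* Hypothetical helper modelled on my_initialized_malloc.  The returned
   object contains a pointer to memory that already exists, so it is
   intentionally NOT marked with the malloc attribute.  */
int **alloc_cell_pointing_to (int *p)
{
  int **q = malloc (sizeof (int *));
  if (q != NULL)
    *q = p;
  return q;
}

/* Writes to a local variable indirectly through the allocated cell.
   Returns the variable's value afterwards, or -1 if allocation failed.  */
int write_through_cell (void)
{
  int i = 1;
  int **cell = alloc_cell_pointing_to (&i);
  if (cell == NULL)
    return -1;
  **cell = 2;   /* stores to i through the pointer held in the cell */
  free (cell);
  return i;
}
```

Because alloc_cell_pointing_to carries no malloc annotation, the compiler has to assume that the store through the cell may modify i. Marking it with the attribute would make the kind of claim the thread describes as a common misconception.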