On Tue, Feb 21, 2012 at 4:02 PM, Tijl Coosemans <t...@coosemans.org> wrote:
> On Tuesday 21 February 2012 10:19:15 Richard Guenther wrote:
>> On Mon, Feb 20, 2012 at 8:55 PM, Tijl Coosemans <t...@coosemans.org> wrote:
>>> On Monday 9 January 2012 10:05:08 Richard Guenther wrote:
>>>> Since GCC 4.4 applying the malloc attribute to realloc-like
>>>> functions does not work under the documented constraints because
>>>> the contents of the memory pointed to are not properly transferred
>>>> from the realloc argument (or treated as pointing to anything,
>>>> like 4.3 behaved).
>>>>
>>>> The following adjusts documentation to reflect implementation
>>>> reality (we do have an implementation detail that treats the
>>>> memory blob returned for non-builtins as pointing to any global
>>>> variable, but that is neither documented nor do I plan to do
>>>> so - I presume it is to allow allocation + initialization
>>>> routines to be marked with malloc, but even that area looks
>>>> susceptible to misinterpretation to me).
>>>>
>>>> Any comments?
>>>
>>> The new text says the memory must be undefined, but gives calloc as an
>>> example for which the memory is defined to be zero. Also, GCC has
>>> built-ins for strdup and strndup with the malloc attribute and GLIBC
>>> further adds it to wcsdup (wchar_t version of strdup) and tempnam. In
>>> all of these cases the memory is defined.
>>>
>>> Isn't the reason the attribute doesn't apply to realloc simply because
>>> the returned pointer may alias the one given as argument, rather than
>>> having defined memory content?
>>
>> The question is really what the alias-analysis code can derive from a
>> function that is declared with the malloc attribute.  The most useful
>> property for alias analysis would be that the non-aliasing holds
>> transitively, thus reading (with any level of indirection) from the returned
>> pointer does not produce memory that is aliased by any other pointer.
>> That's what happens for 'malloc' (also for 'calloc' - you can't do any
>> further indirections through the NULL pointers the memory holds).  It
>> does not happen for realloc.  Currently the alias-analysis code does
>> assume exactly this property (only very slightly weakened, possibly
>> because we broke some code I guess).
>>
>> Internally, all builtins with interesting allocation properties are handled
>> explicitly, so we probably should not rely on the malloc attribute present
>> on those (and maybe simply drop it there).
>>
>> The question is really what is useful for users, and what's the most natural
>> behavior?  For example
>>
>> int **my_initialized_malloc (int *p)
>> {
>>   int **q = malloc (sizeof (int *));
>>   *q = p;
>>   return q;
>> }
>>
>> would not qualify for the 'malloc' attribute (but we've taken measures to not
>> miscompile this kind of code; it seems to be a very common misconception
>> that these can be annotated with 'malloc').
>>
>> I'm not sure how to exactly constrain the documentation for 'malloc' better.
>> Maybe
>>
>> The @code{malloc} attribute is used to tell the compiler that a function
>> may be treated as if any non-@code{NULL} pointer it returns cannot
>> alias any other pointer valid when the function returns and that the memory
>> does not contain any pointer value.
>>
>> ?  Because that is what is relevant.  That you can in no way extract
>> a pointer value from the memory pointed to by the return value.  Because
>> alias analysis will assume any such extracted pointer value points
>> nowhere (so, extracting a NULL pointer is ok).
>>
>> The reasoning why the string functions have the malloc attribute was
>> probably that strings do not contain pointer values.  Of course they
>> can, you can store a character encoding of a pointer, copy the
>> string and decode it from the copy again.  We'd miscompile then
>>
>>  int i = 1;
>>  int *p = &i;
>>  char ptr[16];
>>  ... inline encode p into ptr ...
>>  char *x = strdup (ptr);
>>  int *q = ... inline decode x to q
>>  *q = 2;
>>  return i;
>>
>> to return 1 because we do not see that q may point to i.  Of course
>> we properly handle the transfer of pointers for str[n]dup, so the
>> 'malloc' attribute on it is a lie...
>
> Thanks, that was very informative.
>
> Is it correct to say that the attribute applies to deep copies, but not to
> shallow ones?

No, see below.

>
> How about the following text:
>
> @item malloc
> @cindex @code{malloc} attribute
> The @code{malloc} attribute is used to tell the compiler that a pointer
> returned by a function is either @code{NULL} or points to a newly
> allocated object and that any pointer within that object is either
> uninitialised, @code{NULL} or pointing to a newly allocated object for
> which the same conditions hold recursively.

The '.. or pointing to a newly allocated object for which the same
conditions hold recursively' is not what is implemented.  What is
implemented is '.. or pointing to global memory', but I don't really
want to document this as this implementation detail may change
(and what is considered 'global memory' would deserve its own
complicated description).
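
To make the distinction concrete, here is a minimal sketch, assuming
GCC's __attribute__((malloc)) syntax.  The names 'struct node',
'node_new' and 'node_push' are hypothetical and come neither from GCC
nor from the patch.  'node_new' returns a fresh object whose only
pointer member is NULL, so it should fit any of the wordings discussed
here; 'node_push' stores a pointer to an existing object into the fresh
one and is therefore left unannotated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node, used only for illustration. */
struct node {
    struct node *next;
    int value;
};

/* Returns NULL or a fresh object whose only pointer member is NULL,
   so no existing object is reachable through the result. */
__attribute__((malloc))
struct node *node_new(int value)
{
    struct node *n = calloc(1, sizeof *n);
    if (n != NULL) {
        n->next = NULL;   /* do not rely on all-bits-zero being NULL */
        n->value = value;
    }
    return n;
}

/* Deliberately NOT marked malloc: the returned object holds a pointer
   to 'head', which already existed before the call. */
struct node *node_push(struct node *head, int value)
{
    struct node *n = node_new(value);
    assert(n == NULL || n->next == NULL);  /* fresh node has no link yet */
    if (n != NULL)
        n->next = head;
    return n;
}
```

Keeping the attribute on the inner allocator only is one possible
design; it leaves the wrapper free to link the new node to existing
memory.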

>  The compiler assumes that
> existing variables and memory cannot be accessed through the returned
> pointer which will often improve optimization.

Maybe '... cannot be accessed directly or indirectly through the ...'

> Standard functions with this property include @code{malloc} and
> @code{calloc}.  @code{realloc}-like functions do not have this
> property as the returned pointer may alias the one given as argument
> or the memory pointed to may contain initialised pointers.
> @code{strdup}-like functions have this property as long as the string
> does not encode a memory address.  More generally the attribute applies
> to deep memory copies, but not to shallow ones.

I'd remove the last two sentences - they probably add more confusion
than clarification.
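
For reference, one way a string can carry an address is sketched here.
This is only an assumed encoding (printing with "%p" and reading it
back with "%p"); 'write_through_copied_string' is a hypothetical helper
and not code from the thread.  The duplicated string contains no
pointer object, yet a valid pointer to an existing variable can be
recovered from it.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: round-trips the address of *target through a
   duplicated string, then stores 'value' through the decoded pointer.
   Returns 0 on success and -1 on failure. */
int write_through_copied_string(int *target, int value)
{
    char text[32];
    void *decoded = NULL;

    assert(target != NULL);
    /* Encode the address as text (assumed encoding: "%p"). */
    snprintf(text, sizeof text, "%p", (void *)target);

    char *copy = strdup(text);
    if (copy == NULL)
        return -1;

    /* Decode the address from the copy, not from the original buffer. */
    int fields = sscanf(copy, "%p", &decoded);
    free(copy);
    if (fields != 1 || decoded == NULL)
        return -1;

    *(int *)decoded = value;
    return 0;
}
```

After a successful call the variable passed as 'target' holds 'value',
even though the only path to it went through the strdup'ed copy.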

Does that make sense?

Thanks,
Richard.
