https://github.com/rjmccall commented:

In general, this patch needs to be clearer about what rules it's actually 
enforcing.  You're adding new command-line options, but users have to guess 
what they mean!

If you're going to be working on TBAA, would you mind adding a section to 
Clang's manual (`UsersManual.rst`) about type-based alias analysis?  We should 
start by documenting our current behavior, then document the behavior of all 
these new options.  To make this a less onerous request, let me suggest some 
starter text:

```
C and C++ forbid programmers from accessing objects using l-values that don't 
match the type of the object.  By default, Clang takes advantage of these rules 
to decide that certain pointers cannot point to the same object; this is called 
*strict aliasing* or *type-based alias analysis* (TBAA).  This can be 
completely disabled using the option ``-fno-strict-aliasing``.  
``-fno-strict-aliasing`` is the default for ``clang-cl``.

When strict aliasing is enabled, Clang uses the type-based aliasing rules from 
the appropriate standard for the current language mode.  In the C standard, the 
aliasing rules are laid out in section 6.5 (Expressions).  In the C++ standard, 
the aliasing rules are laid out in [basic.lval].  For the most part, the C and 
C++ rules coincide and can be summarized as follows:

- An object can be accessed through an l-value of character type (e.g. 
``char``).
- An object of integer type can be accessed through an l-value of different 
signedness; e.g. a ``signed short`` object can be accessed through an 
``unsigned short`` l-value.
- Otherwise, objects can only be accessed through l-values of the type of the 
object.

For the exact rules, please consult the standards.  Clang generally reserves 
the flexibility to take advantage of the exact rules for the current language 
mode, except as noted here:

- While C gives all character types the power to arbitrarily alias, C++ 
reserves this to ``char`` and ``unsigned char``.  Clang relaxes this rule in 
C++ to match the C rule.

There are several ways to load from or store to an object as if it had a 
different type without violating the strict aliasing rule.  The most explicit 
and portable is to ``memcpy`` between the object and an object of the desired 
type; for aliasing purposes, ``memcpy`` behaves as if it used loads and stores 
of character type.  Clang also supports ``__attribute__((may_alias))``, which 
can be placed on a type declaration (such as a ``struct`` or ``typedef``) to 
give that type the equivalent aliasing power of a character type.

Clang uses an implementation model in which "sufficiently obvious" aliasing 
should override type-based assumptions.  Strict aliasing means that Clang will 
assume that ``int*`` and ``float*`` parameters to a function do not alias, and 
it may reorder loads and stores to those parameters accordingly.  However, if a 
``float*`` parameter to a function is cast to ``int*``, Clang will understand 
that the result of the cast still aliases the original parameter, and it should 
not reorder loads and stores to those pointers.  This is only a best-effort 
attempt to avoid miscompiles, and programmers should generally still aim to 
write code which does not violate the strict aliasing rules, as discussed above.

An access to a member of an aggregate type (such as a ``struct``) is considered 
to also be an access to the aggregate.  This means that there must also be an 
object of the aggregate type at that location, and it means that accesses into 
different aggregates cannot alias.  This rule can be weakened to only consider 
the final accessed type using ``-fno-struct-path-tbaa``.

<document your new options here>
```
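
If it would help to have an example in the manual, here is a rough sketch of the sanctioned techniques the starter text mentions: `memcpy`, `may_alias`, and the signedness rule. The function and typedef names are invented for illustration, and the sketch assumes a 32-bit `float` and a GCC-compatible compiler for the attribute; treat it as a starting point, not as vetted documentation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
static_assert(sizeof(uint32_t) == sizeof(float), "sketch assumes 32-bit float");

/* Portable type punning: copy the bytes instead of casting the pointer.
   For aliasing purposes memcpy acts like character-typed loads and stores. */
static uint32_t float_bits(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* GCC/Clang extension: a typedef with the aliasing power of a character
   type. The name aliasing_u32 is hypothetical. */
typedef uint32_t __attribute__((may_alias)) aliasing_u32;

static uint32_t float_bits_may_alias(const float *p) {
    return *(const aliasing_u32 *)p;
}

/* Allowed by the signedness rule: a signed short object read through an
   unsigned short l-value. */
static unsigned short as_unsigned(const short *p) {
    return *(const unsigned short *)p;
}
```

By contrast, writing `*(const uint32_t *)&f` with a plain `uint32_t` would violate the rules above, even though Clang's best-effort handling of obvious casts may happen to produce the expected result.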

https://github.com/llvm/llvm-project/pull/75177
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits