On 29.06.20 04:08, Richard Biener wrote:
On Mon, Jun 29, 2020 at 1:05 PM Richard Biener
<richard.guent...@gmail.com> wrote:
On Mon, Jun 29, 2020 at 11:56 AM Erick Ochoa
<erick.oc...@theobroma-systems.com> wrote:
Hello,
I have been working on link time optimization for C that may change the
size of structs (at link time). We are close to sharing the results we
have so far, but there are a couple of missing pieces left to work on:
Implementations of sizeof and offsetof that support this change in
struct layout at link time.
== What is the problem? ==
Currently, for both sizeof and offsetof, the C parser will replace these
statements with trees that correspond to the value returned by sizeof
and offsetof at parse time. For example:
// source code
struct astruct a;
memset(a, 0, sizeof(a));
// parse time
memset(a, 0, 64);
// after dead field elimination
// struct astruct is now 56 bytes long
memset(a, 0, 64); // <-- we are overwriting memory!
At link time, we really shouldn't change the value 64 since we can't and
shouldn't assume that the value 64 came from a sizeof statement. The
source code could have been written this way:
// source code
struct astruct a;
memset(a, 0, 64);
regardless of whether the struct astruct has a length of 64.
** We need to identify which trees come from sizeof statements **
== What do we want? ==
What we really want is to make sure that our transformation performs the
following changes (or no changes!) depending on the source code.
If the value for memset's argument comes from a sizeof statement:
// source code
struct astruct a;
memset(a, 0, sizeof(a));
// parse time
memset(a, 0, 64);
// after dead field elimination
memset(a, 0, 56);
However, in the case in which no sizeof is used, we want to do the
following:
// source code
struct astruct a;
memset(a, 0, 64);
// parse time
memset(a, 0, 64);
// after dead field elimination
memset(a, 0, 64);
But why do you think the difference of handling of sizeof(a) vs.
a constant is warranted? It's by no means required that
whenever semantically the size of 'a' is needed you need to
write sizeof(a) but the user can just write literal 64 here.
It's the same with malloc sites btw.
So it seems you cannot use the presence or not presence
of 'sizeof' to derive semantics.
I think the difference of handling sizeof(a) vs a constant is warranted
because whenever we delete a field from the structure, we are changing
the size of the structure.
I understand that the user can write a number instead of sizeof(a).
However, we argue that the user's "knowledge" of a type's size might be
speculation. The value of the result of the sizeof operator is
implementation defined.
Therefore the semantics of memset(a, 0, 64) is precisely memset(a, 0,
64). While the semantics of memset(a, 0, sizeof(a)) != memset(a, 0, 64)
since it depends on the compiler and target machine (and we argue the
optimizations chosen).
== How do we get what we want? ==
Ideally what we want is to:
* Be able to change the value returned by sizeof and offsetof at link time:
* possibly a global variable?
* Identify which values come from sizeof statement:
* matching identifiers?
* No re/define valid C identifiers:
* in gimple we can have an identifier we a dot in it.
* Disable constant propagation and other optimizations:
* possibly __attribute__((noipa))
* Be able to work with parallel compilation (make -j)
* Be able to work with any Makefile
* No C code generation and then compile and link gen code at the end.
So, I've been thinking about multiple options:
* Extending gimple to add support for a sizeof statement
* A function per struct generated during compilation (sizeof & offsetof)
* A variable per struct generated during compilation (sizeof and more
for offsetof)
I think extending gimple to add support for a sizeof statement gets us
all what we want, however this would involve rewriting possibly many
parts of GCC. As such, I am somewhat opposed to this.
I then thought of generating global variables during parse/time
compilation. In this scheme, I would replace sizeof statements with a
reference to a global variable (or function) that is initialized with
the value returned by the sizeof statement during parse time. At link
time we can replace initialization value if needed. For example:
// The parser is parsing a C file
// it encounters a sizeof statement
sizeof(struct astruct);
// Parsing is paused.
// Does a global variable that identifies this struct exists?
// I.e. size_t __lto.sizeof.astruct exists?
// If it doesn't create it.
size_t __lto.sizeof.astruct = 64
// Back to the parser
// instead of replacing
// sizeof(struct astruct) with 64
// replace with the following gimple:
__lto.sizeof.astruct
// Continue parsing until the end of file compilation.
// If at link time we detect that we will delete a field from astruct
// Then we will have to look at the initialization value of
// __lto.sizeof.astruct and replace it with the new value.
size_t __lto.sizeof.$identifier = 56
This strategy can be used with global functions instead of variables and
it is similar. The only differences would be we would create a global
function instead of a variable and we would call that function to obtain
the value.
For offsetof, we will need to change in the following way:
// Parser encounter offsetof
offsetof(struct astruct, b);
/ Parsing is paused.
// Does a global variable that identifies this struct AND field exists?
// The previous field has a size of 8
size_t __lto.offsetof.astruct._8 = 8
// Back to the parser
// instead of replacing
// offsetof(struct astruct, b) with 8
// replace with the following gimple:
__lto.offsetof.astruct._8
// Continue parsing until the end of file compilation.
// If at link time we detect that we will delete the previous field
// then we can rewrite all the offsetof for this struct and which refer
// to the fields that follow the deleted field
__lto.offsetof.astruct._8 = 0;
// Because the previous field was deleted for example
// All variables referring to offsets allocated
// further than the one deleted will need an update as well.
== What's the WORST CASE performance of having an unknown sizeof? ==
I was concerned that the values generated by sizeof might be used by
constant propagation and other optimizations might depend on this value.
So, in order to have an idea of the impact of this transformation might
have on code, I manually changed all sizeof statements on MCF for an
equivalent function __lto_sizeof_T() where T identifies the type. I made
all these functions static and used the attribute noipa. There was no
measurable performance degradation when compared with an untransformed
run of MCF. Of course, this is only a data point and the performance
might depend on the application/workload/architecture... but it is good
to have a data point!
== What are some known unknowns? ==
* Does noipa work for variables?
* I think attribute noipa works for functions but I am not sure if it
works for variables.
* After changing the definition of struct can we re-run all link time
optimizations?
* I would not like to sacrifice performance. Because I might have
hindered constant propagation all optimizations which depend on it might
suffer. Therefore, I was wondering if after changing the fields I can
delete the noipa attribtue and re-run all link time optimizations
somehow? (However, the experiment tells us that this might not be a
worry. Perhaps there might be other benchmarks which are more affected
by this transformation.)
* If we have multiple C files and each C file has a sizeof statement for
a single struct, then we can have multiple definitions of this
variable/function.
* We can possibly define multiple weak symbols but I don't think this
is a good idea.
* We can define static variables/functions and modify them all. But
we have to make sure that the variables/functions corresponding to the
same type have exactly the same definition. This might not be a good idea.
* I don't have sufficient experience with builtins, but maybe this is
the way to go?
* What is the best way to come up for a naming scheme for these
functions/variables?
* The identifier is a good start, but of course typedefs will be an
issue. We could instead just use the first identifier we find and then
store a map of tree -> sizeof variable/function and whenever we compare
a find a new function we compare against all previous types to see if
there is already an implementation for a type that is equal (for some
definition of equal). I already have an equality function, but this
doesn't fix coming up with a naming scheme.
We appreciate all feedback! Thanks again for your time. If you have
question comments, let me know.
You are missing contexts where C requires constant expressions
or C++ where sizeof can be used as template parameters.
I think instead of trying to invent a scheme where you can "fixup"
these you instead need to mark types/objects that had offsetof
or sizeof applied as not eligible for transformation.
Note alignof is similarly affected for say struct { long a; int b; int c; }
where alignof (b) is 8 but if you for whatever reason re-order b and c
the alignment of b might change. Generally transforming
struct { long a; int b; /* pad */ } x[] to two arrays, one long[] and one int[]
suffers from this problem and may cause correctness issues where
code expects the larger alignment that is guaranteed. Not sure
if you're yet making sure you never lower alignment of existing
memory accesses.
Richard.