Hello,

I have been working on link time optimization for C that may change the size of structs (at link time). We are close to sharing the results we have so far, but there are a couple of missing pieces left to work on:

Implementations of sizeof and offsetof that support this change in struct layout at link time.

== What is the problem? ==

Currently, for both sizeof and offsetof, the C parser will replace these statements with trees that correspond to the value returned by sizeof and offsetof at parse time. For example:

// source code
struct astruct a;
memset(a, 0, sizeof(a));

// parse time
memset(a, 0, 64);

// after dead field elimination
// struct astruct is now 56 bytes long
memset(a, 0, 64); // <-- we are overwriting memory!

At link time, we really shouldn't change the value 64 since we can't and shouldn't assume that the value 64 came from a sizeof statement. The source code could have been written this way:

// source code
struct astruct a;
memset(a, 0, 64);

regardless of whether the struct astruct has a length of 64.

** We need to identify which trees come from sizeof statements **

== What do we want? ==

What we really want is to make sure that our transformation performs the following changes (or no changes!) depending on the source code.

If the value for memset's argument comes from a sizeof statement:

// source code
struct astruct a;
memset(a, 0, sizeof(a));

// parse time
memset(a, 0, 64);

// after dead field elimination
memset(a, 0, 56);

However, in the case in which no sizeof is used, we want to do the following:

// source code
struct astruct a;
memset(a, 0, 64);

// parse time
memset(a, 0, 64);

// after dead field elimination
memset(a, 0, 64);

== How do we get what we want? ==

Ideally what we want is to:

* Be able to change the value returned by sizeof and offsetof at link time:
  * possibly a global variable?
* Identify which values come from sizeof statement:
  * matching identifiers?
* No re/define valid C identifiers:
  * in gimple we can have an identifier we a dot in it.
* Disable constant propagation and other optimizations:
  * possibly __attribute__((noipa))
* Be able to work with parallel compilation (make -j)
* Be able to work with any Makefile
  * No C code generation and then compile and link gen code at the end.

So, I've been thinking about multiple options:

* Extending gimple to add support for a sizeof statement
* A function per struct generated during compilation (sizeof & offsetof)
* A variable per struct generated during compilation (sizeof and more for offsetof)

I think extending gimple to add support for a sizeof statement gets us all what we want, however this would involve rewriting possibly many parts of GCC. As such, I am somewhat opposed to this.

I then thought of generating global variables during parse/time compilation. In this scheme, I would replace sizeof statements with a reference to a global variable (or function) that is initialized with the value returned by the sizeof statement during parse time. At link time we can replace initialization value if needed. For example:

// The parser is parsing a C file
// it encounters a sizeof statement
sizeof(struct astruct);

// Parsing is paused.
// Does a global variable that identifies this struct exists?
// I.e. size_t __lto.sizeof.astruct exists?
// If it doesn't create it.

size_t __lto.sizeof.astruct = 64

// Back to the parser
// instead of replacing
// sizeof(struct astruct) with 64
// replace with the following gimple:

__lto.sizeof.astruct

// Continue parsing until the end of file compilation.

// If at link time we detect that we will delete a field from astruct
// Then we will have to look at the initialization value of
// __lto.sizeof.astruct and replace it with the new value.

size_t __lto.sizeof.$identifier = 56

This strategy can be used with global functions instead of variables and it is similar. The only differences would be we would create a global function instead of a variable and we would call that function to obtain the value.

For offsetof, we will need to change in the following way:

// Parser encounter offsetof
offsetof(struct astruct, b);

/ Parsing is paused.
// Does a global variable that identifies this struct AND field exists?

// The previous field has a size of 8
size_t __lto.offsetof.astruct._8 = 8

// Back to the parser
// instead of replacing
// offsetof(struct astruct, b) with 8
// replace with the following gimple:

__lto.offsetof.astruct._8

// Continue parsing until the end of file compilation.

// If at link time we detect that we will delete the previous field
// then we can rewrite all the offsetof for this struct and which refer
// to the fields that follow the deleted field

__lto.offsetof.astruct._8 = 0;
// Because the previous field was deleted for example
// All variables referring to offsets allocated
// further than the one deleted will need an update as well.

== What's the WORST CASE performance of having an unknown sizeof? ==

I was concerned that the values generated by sizeof might be used by constant propagation and other optimizations might depend on this value. So, in order to have an idea of the impact of this transformation might have on code, I manually changed all sizeof statements on MCF for an equivalent function __lto_sizeof_T() where T identifies the type. I made all these functions static and used the attribute noipa. There was no measurable performance degradation when compared with an untransformed run of MCF. Of course, this is only a data point and the performance might depend on the application/workload/architecture... but it is good to have a data point!

== What are some known unknowns? ==


* Does noipa work for variables?
* I think attribute noipa works for functions but I am not sure if it works for variables. * After changing the definition of struct can we re-run all link time optimizations? * I would not like to sacrifice performance. Because I might have hindered constant propagation all optimizations which depend on it might suffer. Therefore, I was wondering if after changing the fields I can delete the noipa attribtue and re-run all link time optimizations somehow? (However, the experiment tells us that this might not be a worry. Perhaps there might be other benchmarks which are more affected by this transformation.) * If we have multiple C files and each C file has a sizeof statement for a single struct, then we can have multiple definitions of this variable/function. * We can possibly define multiple weak symbols but I don't think this is a good idea. * We can define static variables/functions and modify them all. But we have to make sure that the variables/functions corresponding to the same type have exactly the same definition. This might not be a good idea. * I don't have sufficient experience with builtins, but maybe this is the way to go? * What is the best way to come up for a naming scheme for these functions/variables? * The identifier is a good start, but of course typedefs will be an issue. We could instead just use the first identifier we find and then store a map of tree -> sizeof variable/function and whenever we compare a find a new function we compare against all previous types to see if there is already an implementation for a type that is equal (for some definition of equal). I already have an equality function, but this doesn't fix coming up with a naming scheme.

We appreciate all feedback! Thanks again for your time. If you have question comments, let me know.

Reply via email to