On Tue, Jan 31, 2017 at 08:59:27AM +0100, Richard Biener wrote:
> > The problem is that at least right now the handling of the init-order is
> > done in 2 separate places.
> > The C++ FE emits the special libasan calls with the name of the TU at the
> > beginning and end of the static initialization.
> > And then the variables need to be registered with the libasan runtime from
> > an even earlier constructor, and this is something that is done very late
> > (where we collect all the variables).
>
> So we are basically telling libasan a list of dynamic initialized vars plus
> when they start/end being constructed so it can catch access to uninitialized
> vars during init of others?

We are telling libasan a list of all sanitized variables (those where we
managed to insert the padding around them in the end) and in that list mark
variables with dynamic initialization.
So, we can't e.g. register vars where we for whatever reason can't add the
padding around them, and this is something that is decided when the
variables are finalized.  Thus, constructing the list early (as a VAR_DECL
with initializer) means we'd then need to be able to remove stuff from it etc.
It looks much easier to me to carry this list only in the LTO data structures
(basically remember which TU owns each var).

> I don't see why we need to compose that list of vars so late then.  Just
> generate it early, say, after the first swoop of unused var removal.  Use
> weak references so later removed ones get NULL.  Or simply bite the
> bullet of asan changing code gen (well, it does that anyway) and thus
> "pin" all vars life at that point as used.
>
> Sounds much easier to me than carrying this over LTO...
>
> And I suppose the TU name is only used for diagnostics?  Otherwise
> the symbol name (DECL_ASSEMBLER_NAME) of the symbol could
> be used as in C++ globals need to follow ODR?

The way it works is that you register the sanitized variables and each var
has a string for the owning TU (it is also used for diagnostics, so it is
desirable to not use random strings).
Then when the start of dynamic initialization for some TU is called,
libasan poisons all dynamic_init global variables except those that have
the owning TU string equal to the current TU.  Then the construction is run
(which means it will fail if it accesses a dynamically initialized variable
from some other TU), and finally another libasan routine is called which
will unpoison all the variables it poisoned earlier.
So, for this use it is desirable that the names are actually the TU names,
i.e. the same strings you pass to the libasan functions in the corresponding
static initialization.

	Jakub
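
For illustration only, the poison/unpoison sequence described above can be
modelled with a small toy in C++.  This is a sketch of the idea, not libasan's
implementation: InitOrderModel, GlobalInfo and all method names are
hypothetical, and "poisoning" is reduced to a boolean flag per variable
instead of shadow memory.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy model; every name here is hypothetical, not the libasan API.
struct GlobalInfo {
    std::string name;       // variable name
    std::string owner_tu;   // string naming the TU that owns the variable
    bool has_dynamic_init;  // marked as dynamically initialized
    bool poisoned;          // stand-in for shadow-memory poisoning
};

class InitOrderModel {
public:
    // Corresponds to registering a sanitized variable with the runtime.
    void register_global(std::string name, std::string owner_tu, bool dyn) {
        globals_.push_back(
            GlobalInfo{std::move(name), std::move(owner_tu), dyn, false});
    }

    // Start of dynamic initialization for `tu`: poison every dynamically
    // initialized variable whose owning TU string differs from `tu`.
    void before_dynamic_init(const std::string& tu) {
        for (auto& g : globals_)
            if (g.has_dynamic_init && g.owner_tu != tu)
                g.poisoned = true;
    }

    // End of dynamic initialization: undo the poisoning done above.
    void after_dynamic_init() {
        for (auto& g : globals_)
            g.poisoned = false;
    }

    // True if touching `name` right now would be reported in this model.
    bool is_poisoned(const std::string& name) const {
        for (const auto& g : globals_)
            if (g.name == name)
                return g.poisoned;
        return false;  // unregistered variables are never poisoned
    }

private:
    std::vector<GlobalInfo> globals_;
};
```

In this model the comparison in before_dynamic_init is a plain string
comparison, which is why the string stored with each variable has to match
the string passed at the start of that TU's initialization.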