On Fri, Feb 14, 2014 at 04:50:34PM +0100, Bernd Schmidt wrote:
> How many offloaded functions do we really expect to have in an
> executable? I don't think that's likely to be a bottleneck.

First of all, this isn't just about offloaded functions, but also about
any global variables that need to be mapped (variables surrounded by
OpenMP #pragma omp declare target). Like functions, the variables can be
global (though with multiple DSOs even a global can be interposed, so its
name is not unique) or static, in which case the same name can appear in
different TUs.

> The use of a random-seed is really just a fallback, preferably it
> uses the name of first symbol defined in the current translation
> unit which I think ought to be reliable enough.

Many TUs don't have any non-weak global symbols at all; if the symbols
they provide are all comdat etc., then you hit the random seed all the
time. Encoding the random seed into data sections of the binary is a
problem for build reproducibility, unless you always supply
-frandom-seed=XXXX, but then it isn't really that random (e.g. the often
used -frandom-seed=$(@) or similar). If the weirdo names are only used to
name symbols in .symtab, that randomness at least can be stripped off,
but not if it is in data sections. So, to me this is far less reliable
and against the spirit of static symbols.

> >I still don't see what you find wrong on the approach with host/target
> >address arrays, if you are afraid something will reorder the arrays
> >(but, what would do that), one can insert indexes into both arrays as well,
> >which the linker can fill in and you can then verify.
>
> It strikes me as really unnecessarily brittle. On the host side we'd
> have multiple objects linked together in some order to produce such
> a table, on the ptx side we'd have to produce the table all in one

Sure. So, the linker/linker plugin puts the objects in some order and
thus, by concatenating the smaller per-TU tables, creates the host table;
then the same linker/linker plugin just creates the to-be target table in
the same order, and feeds that to the offloading target compiler.

> go. Factor in possibilities like function cloning and I just think
> there are too many ways in which this can utterly fail. I'd rather
> have something that is more robust from the start even if it's
> slightly less efficient.

I don't see how function cloning or anything similar can make a
difference here: you have a function which is address taken and its
address is tracked in some array, so such a function can't be cloned
(well, it can be cloned for unrelated callers, but the table still keeps
the original function, with the same public ABI etc.).

Jakub
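
To make the host/target address array idea above a bit more concrete,
here is a rough sketch in C, not what any actual implementation does. The
struct layout, the table names and the helper functions are all made up
for illustration, and the "target" addresses are just ordinary host
objects standing in for addresses inside the device image. It assumes
both tables list the same objects in the same order and that something
at link time fills in the index field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry layout: the linker (or linker plugin) would fill in
   `index` while concatenating the per-TU tables.  */
struct offload_entry {
    size_t index;      /* position assigned at link time */
    const void *addr;  /* address of the mapped variable or function */
};

/* Stand-ins for two mapped variables on each side.  In a real setup the
   target addresses would refer to the device image, not host memory.  */
static int host_a, host_b;
static int target_a, target_b;

/* Both tables list the same objects in the same order.  */
static const struct offload_entry host_table[] = {
    { 0, &host_a }, { 1, &host_b },
};
static const struct offload_entry target_table[] = {
    { 0, &target_a }, { 1, &target_b },
};

/* Return 1 if every entry in both tables carries its own position as
   index, i.e. nothing has reordered either table; 0 otherwise.  */
static int tables_consistent(const struct offload_entry *host,
                             const struct offload_entry *target, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (host[i].index != i || target[i].index != i)
            return 0;
    return 1;
}

/* Translate a host address to the target address stored at the same
   position in the target table, or NULL if the address is not mapped.  */
static const void *host_to_target(const struct offload_entry *host,
                                  const struct offload_entry *target,
                                  size_t n, const void *host_addr)
{
    assert(host != NULL && target != NULL);
    for (size_t i = 0; i < n; i++)
        if (host[i].addr == host_addr)
            return target[i].addr;
    return NULL;
}
```

A runtime would call tables_consistent once at startup and then use
host_to_target for each mapped object. The linear search only keeps the
sketch short; nothing in the scheme requires it.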