On 1 August 2011 22:16, Paul Berry <stereotype...@gmail.com> wrote: > > I will investigate things further in the morning and let you know what I find. >
Ok, I've spent a while looking through the linker code, and the rabbit hole goes deeper than I expected. There are a number of reasons why the linker, as currently implemented, isn't guaranteed to be produce topologically sorted output: 1. Linking doesn't actually start with an empty shader. It starts with a clone of the compilation unit that defines main() (see linker.cpp around line 836). This means that the order of functions defined in that compilation unit will be preserved regardless of the call graph. 2. When resolving a function that's declared in the compilation unit that defines main(), but defined in some other compilation unit, the linker inserts the IR for that function wherever the declaration existed. This means that the order of functions _declared_ in the compilation unit that defines main will be preserved too. Problems 1 and 2 could be addressed by not cloning the IR for the compilation unit containing main, and instead using the algorithm you described. But even if we did that, there's an additional problem: 3. Linking is performed depth first. This means that if main calls f which calls h, and main also calls g which calls h, then the linker first tries to pull in f, then h, and then finally g. Since the linker emits functions at the head of the final linked program, the final order (assuming problems 1 and 2 are fixed) would be g, h, f, main. This is the wrong order because g calls h. This problem could be addressed by making the linker operate breadth first instead of depth first. But even if we did that, there's still another problem: 4. Since the linker emits functions at the head of the final linked program, if the linker brings in a function (let's call it f()) that wasn't declared in the compilation unit that defined main, it winds up at the beginning of the linked output, _before_ any global declarations. If f() refers to a global variable, then the IR is invalid, because ir_dereference_variable nodes need to occur _after_ the variables they declare (see ir_validate.cpp line 96). This problem could be addressed by having the linker emit functions at the tail of the final linked program rather than at the head, but that would defeat the effort to make callees appear before their callers in the linker output. There's one final problem, and this to me is the clincher: 5. Since our IR groups together all overloaded functions that share the same name, there are (admittedly contrived) cases that would be impossible to topologically sort. For example: if main calls f(int) which calls g(int), and main also calls g(float) which calls f(float), this is a valid program (there's no static recursion), but there's no way to order f and g such that callees come before their respective callers. We could only address this problem by changing our IR format, or dropping the topological sort requirement. In light of everything I've discovered this morning, my recommendation is to drop the topological sort requirement. I haven't been able to find any code other than opt_dead_functions that relies on functions being topologically sorted, so unless I'm missing something, dropping the requirement is a simple matter of committing this patch. In addition, I think we should change link_functions.cpp so that it emits functions at the tail of the final linked program rather than at the head, to fix problem #4. Does this analysis seem right to you, Ian? _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev