On 9 Mar 2021, Jakub Jelinek via Binutils spake thusly:

> On Tue, Mar 09, 2021 at 11:38:07AM +0000, Hannes Domani via Dwz wrote:
>> On Tuesday, 9 March 2021, 10:10:47 CET, Mark Wielaard <m...@klomp.org>
>> wrote:
>>
>> > Hi Allan,
>> >
>> > On Tue, Mar 09, 2021 at 09:06:54AM +0100, Allan Sandfeld Jensen wrote:
>> > > Btw, question for gcc/binutils
>> > >
>> > > Any reason the work done by tools like dwz couldn't be done in the
>> > > compiler or linker? Seems a bit odd to have a post-linker that
>> > > optimizes the generated code, when optimizations should already be
>> > > enabled.
>> >
>> >
>> > dwz does two kinds of optimization. First it attempts to optimize the
>> > DWARF debugging information for a given object (executable or shared
>> > library). Secondly it tries to put shared pieces of a list of given
>> > objects into a supplemental file that gets referenced from all the
>> > given object files.
>> >
>> > Technically the first optimization could be done by the linker. But
>> > the second optimization is really a post-linker step.
>>
>> Related question: If it were part of binutils, maybe it could be adapted to
>> optimize DWARF debugging information of PE files as well.
>
> dwz intentionally uses libelf, it often deals with very large amounts of
> debug info that only barely fit into the address space limitations on
> certain arches or physical memory for good performance, and any kind of
> abstraction penalty (e.g. bfd) would make it slower and more memory hungry.

Well, it's not *impossible*. You could in theory do what CTF has done:
move the dedup machinery into a library which is then called both from
the linker (providing its input via BFD and explicitly not supporting
multifile unification of DWARF) and from a separate dedup tool still
called 'dwz' (providing its input via libelf, and flipping a switch
allowing dedup across files and producing an additional output which the
tool then writes to the unified file).

However, link speed would likely be affected if dedup is on by default
(it's still a concern of mine with libctf even though none of my
testcases take more than a couple of seconds to dedup: I know how to
shave a *lot* of time off that, I just haven't done it yet). It's quite
possible that you'll save as much time by not having to write as much
DWARF out as you lose deduplicating -- but the write time is usually
hidden from the user anyway, since writeback is buffered on most
operating systems. So this is probably not as helpful as it might
appear :(

Also, it might not be acceptable to have dwz depend on a shared library
provided by binutils, nor to have binutils depend (even optionally?) on
a shared library provided by dwz...
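
To make the shape of that split concrete, here is a toy sketch in Python. It is not dwz, libctf or BFD code: every name in it (`dedup`, `linker_dedup`, `tool_dedup`, `DedupResult`) is made up for illustration, and plain strings stand in for DWARF units. It only shows one shared core with two callers, where only the standalone tool flips the multifile switch and gets the extra shared output.

```python
from dataclasses import dataclass, field


@dataclass
class DedupResult:
    # Debug entries left in each object after deduplication.
    per_object: dict
    # Entries moved to the shared supplemental output (multifile mode only).
    shared: list = field(default_factory=list)


def dedup(objects, multifile=False):
    """Hypothetical shared dedup core; strings stand in for DWARF units."""
    # Step 1: drop duplicates inside each object, keeping first-seen order.
    per_object = {name: list(dict.fromkeys(entries))
                  for name, entries in objects.items()}
    if not multifile:
        return DedupResult(per_object)

    # Step 2 (multifile only): entries present in more than one object
    # are moved into the shared output and removed from each object.
    counts = {}
    for entries in per_object.values():
        for entry in entries:
            counts[entry] = counts.get(entry, 0) + 1
    shared = [entry for entry, n in counts.items() if n > 1]
    shared_set = set(shared)
    trimmed = {name: [e for e in entries if e not in shared_set]
               for name, entries in per_object.items()}
    return DedupResult(trimmed, shared)


def linker_dedup(output_name, entries):
    """Linker-side caller: a single output, multifile never enabled."""
    return dedup({output_name: entries}).per_object[output_name]


def tool_dedup(objects):
    """Standalone-tool caller: many files, multifile switched on."""
    return dedup(objects, multifile=True)
```

In this sketch the linker path never sees the second step, so whatever it costs is only paid by whoever runs the separate tool. The cost of the first step on every link is the part that worries me.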