On Fri, Sep 17, 2021 at 11:42 AM Iain Sandoe <i...@sandoe.co.uk> wrote:
>
> Hi Folks,
>
> > On 17 Sep 2021, at 09:23, Richard Biener <richard.guent...@gmail.com> wrote:
> >
> > On Thu, Sep 16, 2021 at 3:52 PM Iain Sandoe <i...@sandoe.co.uk> wrote:
> >>
> >>
> >>> On 16 Sep 2021, at 11:00, Martin Liška <mli...@suse.cz> wrote:
> >>>
> >>> As preparation for a new global object that will encapsulate
> >>> asm_out_file, we would need to live with a macro that will
> >>> define asm_out_file as casm->out_file and thus the name
> >>> can't be used in function arguments.
> >>
> >> So, if I understand correctly, the motivation is to be able to switch
> >> between output file streams for different categories of content?
> >>
> >> Darwin, actually already does this (manually) with a separate
> >> lto_asm_out_name for lto data (so a general solution would
> >> be great).
> >>
> >> What is the reason for associating the section pointers with the
> >> casm object?
> >>
> >> * I can understand that each instance of a casm object would have
> >> potentially a different current section (“in_section”), but it seems that
> >> as things stand the section pointers would be duplicates.
> >>
> >> * In the case that there’s reason that the sections could be different
> >>  between casm instances, then would it make sense to have a
> >>  target hook so that target-specific sections can be added to the
> >>  local list (via some indirection, I’d assume)?
> >
> > Yes, casm likely will end up with target specific state.  Note the main
> > motivation of the exercise is to develop an alternate way of funneling
> > the early debug data through the LTO pipeline, eliding the need for
> > the simple-object copying and section renaming dance.
>
> great ;-)
>
> FWIW, I did an implementation for Darwin, but have not yet presented/committed
> it because the dependencies on the debug linker (dsymutil) mean that it
> is of limited value until I can find time to fix that up to understand the
> input.
>
> > The idea is that at dwarf2out_early_finish time (which runs at
> > the compile stage) we write a regular pure debug-info object
> > with unmangled section names to an alternate assembler file
> > which we then assemble and include as raw byte blob in the
> > LTO IR + debug object file.  At link time the debug object
> > byte blobs can either be re-instantiated as separate input
> > object file or the linker can be taught to pick them up from
> > the existing file at a byte offset (internally it does this for
> > AR archive members for example) avoiding the extra I/O.
>
> … the more dependencies on external tool behaviour we have
> the harder it gets for non-ELF-binutils targets; I’d like to think
> about how to implement this in Mach-O…
>
> I don’t think there’s any mechanism for Mach-O to include an
> arbitrary blob in a _regular_ mach-o file, of course one could make
> it FAT in some way - but that would mean a lot of changes to the
> back end tools..

Nah, it would be GCC itself opening the object and copying the
data byte-by-byte into a special LTO data section which of course
means emitting a lot of .data in the assembler ... but well.

We could do without the copying but then we produce two files
for each object we compile - the LTO IR .o and the object with
the debug info, say, .debug.o.  The debug info will be linked into
the final object without further changes.  But we have to be
able to emit those two objects from a single cc1 invocation - that's
what this change is about (yeah, emit assembly).

The copying would just involve assembling the alternate file
and including it as data in the main assembly (and thus object)
in a way that makes it easy to either re-materialize the alternate
object file at link time or include it by reference.
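
To make the "include it as data in the main assembly" step concrete, here
is a minimal sketch of turning an already-assembled object into .byte
directives.  The helper emit_blob_as_asm and the section name
.gnu.debuglto_blob are made up for illustration; nothing here is existing
GCC code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: render `blob` (the bytes of the assembled debug
// object) as assembler text that places it in `section`.
// Eight bytes are written per .byte directive.
std::string emit_blob_as_asm (const std::vector<unsigned char> &blob,
                              const std::string &section)
{
  assert (!section.empty ());
  std::ostringstream out;
  out << "\t.section\t" << section << "\n";
  for (std::size_t i = 0; i < blob.size (); ++i)
    {
      if (i % 8 == 0)
        out << (i ? "\n" : "") << "\t.byte\t";  // start a new directive
      else
        out << ",";
      out << static_cast<unsigned> (blob[i]);
    }
  if (!blob.empty ())
    out << "\n";
  return out.str ();
}
```

At link time the bytes in that section could be written back out as a
separate object, or referenced in place by file offset, as described above.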

> … so the path of least resistance is to do something like we do
> with the LTO already - abstract the info into a blob section with
> an index and a name table (we funnel the LTO off into a second
> file and then re-include that in finish_asm_file).
>
> not sure what xcoff etc. can do...
>
> > So for this alternate assembler file we'd have a different set
> > of sections.
>
> can you give me an example ?
> (I’m not succeeding in visualizing this yet).
>
> Note that we already have categories of sections:
>  generic (e.g. .text, .data, etc. supported by all file formats)
>  language-related (for at least C++, D, ObjC…)
>  debug
>  lto
>  back-end-specific
>
> I was wondering which of those needed to be cloned per casm
> and if it includes any of the language/back-end ones how we figure
> a mechanism to include those (I suppose the usual style of a macro-
> programmed .def would work?)

In principle all of them - we are really emitting a completely separate
and different object file (as assembler).  For the LTO debug use
we'd need all DWARF .debug_* sections.

I understand that all the special backend sections are eventually
created on-demand (there's also targetm.asm_file_start/end ...)
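
As a rough illustration of a per-state section set built from a
macro-programmed list (the ".def" style mentioned above), something like
the following could work.  All names (DEBUG_SECTIONS, section_sketch,
asm_out_state_sketch) are hypothetical, and only a handful of DWARF
sections are listed.

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical section list; in a real tree this could live in a .def file.
#define DEBUG_SECTIONS(X)           \
  X (debug_info,   ".debug_info")   \
  X (debug_abbrev, ".debug_abbrev") \
  X (debug_line,   ".debug_line")   \
  X (debug_str,    ".debug_str")

enum section_id
{
#define X(id, str) SEC_##id,
  DEBUG_SECTIONS (X)
#undef X
  SEC_COUNT
};

struct section_sketch
{
  std::string name;
  bool emitted = false;
};

// Each output state owns its own copy of the sections and its own
// "current section" pointer, so two assembler files never share state.
struct asm_out_state_sketch
{
  std::array<section_sketch, SEC_COUNT> sections;
  section_sketch *in_section = nullptr;

  asm_out_state_sketch ()
  {
#define X(id, str) sections[SEC_##id].name = str;
    DEBUG_SECTIONS (X)
#undef X
  }

  void switch_to (section_id id)
  {
    assert (id < SEC_COUNT);
    in_section = &sections[id];
    in_section->emitted = true;
  }
};
```

Switching sections in one state leaves every other state untouched, which
is the property the alternate debug-only assembler file needs.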

> >> —
> >>
> >> (of course, it would be great if one day we could abstract the asm out
> >> such that we could switch to a direct-to-object implementation)
> >
> > Small steps ;)
>
> Yes - but it’s easier to see if the small steps are in the direction we want,
> with some idea of the finishing post, right? :)
>
> I was wondering about a conceptual scenario like:
>
>  casm ==> target_asm state
>  this also conceptually “owns” the TARGET_ASM macros.
>  one could consider migrating those macros to add casm as a first argument
>  this would allow a second migration when target chose to implement the macros
>  as inline functions taking the state as a first argument...
>
>  .. or to implement casm as an abstract base class, where the target macros
>  become virtual methods, with a default impl. that can be overridden by the
>  target.
>
>  My guess is that people will say “the second one is too much overhead
>  because it incurs an indirection instead of the direct inline” … I suppose
>  that depends on how well we devirtualize ….

Yes, adjusting the TARGET_ASM macros is the next natural step, and I'd
do it the C way, provide the asm-out state as argument (but we already
have cfun, so casm didn't seem so gross).
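
A small sketch of the two styles, the transitional macro versus the state
as an explicit first argument.  The names carry a _sketch suffix because
they are illustrative only, and std::ostream stands in for the FILE * that
GCC uses so the example stays self-contained.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

struct asm_out_state_sketch
{
  std::ostream *out_file;
};

// Global "current" state, by analogy with cfun.
asm_out_state_sketch *casm_sketch = nullptr;

// Transitional macro: old code keeps using the familiar name.  A parameter
// declared with this name would expand to "casm_sketch->out_file", which is
// not a valid declarator; hence the renaming of such arguments.
#define asm_out_file_sketch (casm_sketch->out_file)

// Legacy style: implicitly writes to whatever the current state is.
void output_label_legacy (const char *name)
{
  assert (casm_sketch && casm_sketch->out_file);
  *asm_out_file_sketch << name << ":\n";
}

// "C way": the output state is an explicit first argument.
void output_label (asm_out_state_sketch *state, const char *name)
{
  assert (state && state->out_file);
  *state->out_file << name << ":\n";
}
```

With the explicit argument, a caller can write to the debug-only stream
without touching the global pointer, while unconverted code keeps working
through the macro.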

> >>> I've built all cross compilers with the change; it
> >>> bootstraps on x86_64-linux-gnu and survives regression tests.
> >>
> >> A native bootstrap fails early in stage1 for x86_64-darwin (I’ll take a
> >> look at fixing the issues once the patch series settles down)
>
> JFTR, I applied the first two patches and then a couple of tweaks and it did
> bootstrap on Darwin.
>
> thanks
> Iain
>
>
