On 10/06/2016 09:53 PM, David Malcolm wrote:
A benefit of keeping the INSN_UIDs is that if you've spent half an hour
single-stepping through RTL modification and know that INSN_UID 1045 is
the one that gets corrupted, you can have that insn have UID 1045 in
the testcase.  Otherwise the UIDs get determined by the parser, and can
change if other insns get inserted.

That's true, but you wouldn't construct a testcase until you're done debugging, and this reason is lost for any testcase that has been committed to the repository for more than say a week. I think we want a format that is lean and abstracts away unnecessary details.

Here's a revised version of that example
(gcc/testsuite/rtl.dg/aarch64/pr71779.rtl), with the things you mention
removed, and with some indentation to aid readability:

;; { dg-do compile { target aarch64-*-* } }
;; { dg-options "-fsingle-pass=rtl-cse1 -fdump-rtl-cse1" }

(function "fragment"
  (insn-chain
    ;; MEM[(struct isl_obj *)&obj1] = &isl_obj_map_vtable;
    (insn 2 (set (reg:SI 480)
                 (high:SI (symbol_ref:SI ("isl_obj_map_vtable")
                              [flags 0xc0] <var_decl isl_obj_map_vtable>)))
      y.c:12702)
    (insn 2 (set (reg/f:SI 479)
                 (lo_sum:SI (reg:SI 480)
                            (symbol_ref:SI ("isl_obj_map_vtable")
                               [flags 0xc0]
                               <var_decl 0x7fa0363ea240 isl_obj_map_vtable>)))
      y.c:12702
      (expr_list:REG_EQUAL (symbol_ref:SI ("isl_obj_map_vtable")
                            [flags 0xc0]
                            <var_decl isl_obj_map_vtable>)))
    (insn 2 (set (reg:DI 481)
                 (subreg:DI (reg/f:SI 479) 0)) y.c:12702)
    (insn 2 (set (zero_extract:DI (reg/v:DI 191 [ obj1D.17368 ])
                                  (const_int 32 [0x20])
                                  (const_int 0 [0]))
                 (reg:DI 481)) y.c:12702)
    ;; Extra insn, to avoid all of the above from being deleted by DCE
    (insn 2 (set (mem:DI (reg:DI 191) [1 i+0 S4 A32])
                         (const_int 1 [0x1])))

  ) ;; insn-chain
) ;; function

This is with the optional cfg and crtl directives omitted.

Does the above look acceptable?

One thing this doesn't show is CODE_LABELs, which will need some sort of unique identifier.

There's still the 0x7fa0363ea240 for the var-decl. You want the dump-noaddr flag.

The "2" values after each "insn" are the basic block indices.  Should
these be omitted also?

I think they probably don't buy us much if we have block notes.

What about file names/line numbers? We'll probably want some tests for these, but for most others they will be irrelevant.

I think maybe you want a separate compact form of insns and notes (cinsn/cnote maybe), with a flag selecting which gets written out in the dumper. The reader could then read them in like any other rtx code, and there'd be a postprocessing stage to reconstruct the insn chain.

My hope here is to that the input format is something that existing gcc
developers will be comfortable reading, which is why I opted for
parsing the existing output - and I'm nervous about moving away too
much from the existing format.

Hey, I'm as existing a gcc developer as they come, and I don't much like staring at unpruned rtl dumps. They've gotten worse over the years too.

Some other possible changes: locations could do with a wrapper, and/or
to quote the filename, since otherwise we can't cope with spaces in
filenames).  So instead of:

    (insn 2 (set (reg:DI 481)
                 (subreg:DI (reg/f:SI 479)
0)) y.c:12702)

we could have:

    (insn 2 (set (reg:DI 481)
                 (subreg:DI (reg/f:SI 479) 0)) "y.c":12702)

or:

    (insn 2 (set (reg:DI 481)
                 (subreg:DI (reg/f:SI 479)
0)) (srcloc "y.c" 12702))

Any preferences on these?

I suspect just putting it in quotes is good enough. Might be worth doing that in general.

There was also a testcase where virtual regs still occurred with
register number, I think it would be best to disallow this and add
the
ability to parse "(reg:DI virtual-outgoing-args)".

When we discussing pseudos you expressed a need to keep printing the
regno for REGs, so that the dump format is usable in the debugger, for
array indices etc.  Hence I'd prefer to keep printing the regno for all
regnos, including virtuals.

This comes back to different needs for long-lived testcases vs. debugging something. As suggested above, I think you need a flag for different output formats, and for vregs the ability to parse "(reg:M virtual-something)".

Also I think I'd prefer testcases submitted separately from the RTL
frontend. It seems we have two different frontend approaches here,
one
RTL frontend and one modification for the C frontend - which do you
prefer?

Is it acceptable to do both?  I'd like to have an actual RTL frontend,
though this punts on handling structs etc, whereas Richi seems to
prefer the modification to the C frontend, since it better supports
structs, aliasing, etc.  At this point, whichever gets me past patch
review...

Does the C one have an advantage in terms of meaningful decls in MEM_ATTRs/SYMBOL_REFs?


Bernd

Reply via email to