I've updated the RTL frontend to the new "compact" dump format and have it "mostly working", for some definition of that term [1]. I'm focusing on the "__RTL" extension to cc1, rather than having a true "rtl1" frontend.
I've run into an issue with insn recognition relating to pointer equality of rtx. I have a test case for "final", which contains this x86_64 code (shown here in compact form): (cinsn (set (mem/v:BLK (scratch:DI) [0 A8]) (unspec:BLK [ (mem/v:BLK (scratch:DI) [0 A8]) ] UNSPEC_MEMORY_BLOCKAGE)) "test.c":2 (nil)) This fails inside "final" with: gcc.dg/rtl/x86_64/final.c:130:1: error: unrecognizable insn: (insn 5 4 6 2 (set (mem/v:BLK (scratch:DI) [0 A8]) (unspec:BLK [ (mem/v:BLK (scratch:DI) [0 A8]) ] UNSPEC_MEMORY_BLOCKAGE)) -1 (nil)) (the error message is showing the non-compact form of the loaded insn). The pertinent part of i386.md is this: ;; Do not schedule instructions accessing memory across this point. (define_expand "memory_blockage" [(set (match_dup 0) (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))] "" { operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode)); MEM_VOLATILE_P (operands[0]) = 1; }) (define_insn "*memory_blockage" [(set (match_operand:BLK 0) (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BLOCKAGE))] "" "" [(set_attr "length" "0")]) If I run that directly in memory, the insn *is* recognized: (gdb) call gen_memory_blockage () $19 = (insn 33 0 0 (set (mem/v:BLK (scratch:DI) [0 A8]) (unspec:BLK [ (mem/v:BLK (scratch:DI) [0 A8]) ] UNSPEC_MEMORY_BLOCKAGE)) -1 (nil)) (gdb) call recog_memoized ((rtx_insn *)$19) $20 = 695 (gdb) call debug ($19) (insn 33 0 0 (set (mem/v:BLK (scratch:DI) [0 A8]) (unspec:BLK [ (mem/v:BLK (scratch:DI) [0 A8]) ] UNSPEC_MEMORY_BLOCKAGE)) 695 {*memory_blockage} (nil)) By contrast, if I load the rtx from a dump file, the insn is *not* recognized. Stepping through insn-recog.c revealed the problem is that the (match_dup 0) fails when run on a reconstructed-from-dump copy: 44388 case 17: 44389 if (GET_MODE (x2) != BLKmode) 44390 return -1; 44391 x3 = XVECEXP (x2, 0, 0); 44392 if (!rtx_equal_p (x3, operands[0])) 44393 return -1; 44394 return 695; /* *memory_blockage */ (gdb) call debug (x3) (mem/v:BLK (scratch:DI) [0 A8]) (gdb) call debug (operands[0]) (mem/v:BLK (scratch:DI) [0 A8]) (gdb) p x3 $40 = (rtx) 0x7ffff19ee150 (gdb) p operands[0] $41 = (rtx) 0x7ffff19ee120 (gdb) call rtx_equal_p ($40, $41) $42 = 0 The issue is that on loading from a file, we have two distinct (scratch:DI) rtx. rtl.c:rtx_equal_p has: case DEBUG_EXPR: case VALUE: case SCRATCH: CASE_CONST_UNIQUE: return 0; i.e. SCRATCH values always compare as non-equal, unless they have pointer equality. It works when built directly from insn-emit.c within gen_memory_blockage: 74308 emit_insn (gen_rtx_SET (operand0, 74309 gen_rtx_UNSPEC (BLKmode, 74310 gen_rtvec (1, 74311 copy_rtx (operand0)), 74312 17))); since rtl.c:copy_rtx has: case SCRATCH: /* SCRATCH must be shared because they represent distinct values. */ return orig; So we have a case where a gen_* builds a pattern that contains a graph, i.e. where a SCRATCH appears in two places: (insn 5 4 6 2 (set (mem/v:BLK (scratch:DI) [0 A8]) ^^^^^^^^^^ (unspec:BLK [ (mem/v:BLK (scratch:DI) [0 A8]) ^^^^^^^^^^ ] UNSPEC_MEMORY_BLOCKAGE)) -1 (nil)) and for which the recognizer will only recognize the insn if that graph-like property is preserved: that we have pointer-equality for the two (scratch:DI). I'm wondering what the best course of action for the RTL frontend is. Some ideas: (a) dump the insn name even in compact mode, and do a string-match to look it up when reading dumps. This would isolate us somewhat from .md file changes, but if we lose the memoized value, it can't be re-recognized. This feels like papering over the problem. (b) for all codes for which rtx_equal_p requires pointer equality, add some kind of extra ID to the dump, allowing the loader to reconstruct the graph. This could be the pointer itself: (scratch:DI [0x7ffff19ee150]) (scratch:DI [ptr:0x7ffff19ee150]) or somesuch, but it would perhaps be better to use some kind of more human-friendly ID e.g. (scratch:DI [ptr-idx: 42]) or similar, rather than subject ourselves to raw hex values. Probably need an extra flag to print-rtl.c, to avoid making the dumps more verbose for everyone who doesn't want this (if so, does "compact" mode need renaming...) Or just do this for SCRATCH, maybe? (c) some kind of special-casing? (ugh) (d) something I haven't thought of I'm leaning towards (b). Thoughts? Dave [1] FWIW, a work-in-progress patch kit is at: https://dmalcolm.fedorapeople.org/gcc/2016-10-19/rtl-frontend-v143-relative-to-r241136/ though it's not yet ready for review.