> On Apr 4, 2016, at 07:51, Maciej W. Rozycki <ma...@linux-mips.org> wrote:
> 
> On Thu, 31 Mar 2016, Jake Hamby wrote:
> 
>> There's one more thing that's broken in the VAX backend which I'd 
>> *really* like to fix: GCC can't compile many of its own files at -O2, as 
>> well as a few other .c files in the NetBSD tree, because it can't expand 
>> an insn pattern. The Makefiles have hacks to add "-O0" on VAX, and when 
>> I remove that workaround, I get GCC assertion failures, all of the form:
>> 
>> /home/netbsd/current/src/external/gpl3/gcc/dist/gcc/builtins.c: In function 
>> 'void maybe_emit_chk_warning(tree, built_in_function)':
>> /home/netbsd/current/src/external/gpl3/gcc/dist/gcc/builtins.c:11549:1: 
>> error: unrecognizable insn:
>> }
>> ^
>> (insn 295 294 296 25 (set (reg:SI 111)
>>        (subreg:SI (mem:DI (plus:SI (mult:SI (reg:SI 109)
>>                        (const_int 8 [0x8]))
>>                    (reg/f:SI 45 [ D.85015 ])) [7 *_98+0 S8 A32]) 4)) 
>> /home/netbsd/current/src/external/gpl3/gcc/dist/gcc/wide-int.h:799 -1
>>     (nil))
>> /home/netbsd/current/src/external/gpl3/gcc/dist/gcc/builtins.c:11549:1: 
>> internal compiler error: in extract_insn, at recog.c:2343
>> 0xbd0365 _fatal_insn(char const*, rtx_def const*, char const*, int, char 
>> const*)
>>      
>> /home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/rtl-error.c:110
>> 0xbd03fa _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
>>      
>> /home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/rtl-error.c:118
>> 0xb92a2d extract_insn(rtx_insn*)
>>      
>> /home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/recog.c:2343
>> 0x9612cd instantiate_virtual_regs_in_insn
>>      
>> /home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/function.c:1598
>> 0x9612cd instantiate_virtual_regs
>>      
>> /home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/function.c:1966
>> 0x9612cd execute
>>      
>> /home/netbsd/current/src/tools/gcc/../../external/gpl3/gcc/dist/gcc/function.c:2015
>> 
>> The failures all seem to be related to trying to read a value from an 
>> array of 64-bit values and loading it into a 32-bit register. It seems 
>> like there should be a special insn defined for this sort of array 
>> access, since VAX has mova* and pusha* variants to set a value from an 
>> address plus an index into a byte, word, long, or 64-bit array (it uses 
>> movab/pushab, put not the other variants). The addressing modes, 
>> constraints, and predicates all get very complicated, and I still need 
>> to understand a bit better what is actually required, and what could be 
>> simplified and cleaned up.
> 
> Please note however that this RTL instruction is a memory reference, not 
> an address load.  So the suitable hardware instruction would be MOVAQ for 
> an indexed DI mode memory load.  If used to set a SI mode subreg at byte 
> number 4 (this would be the second hardware register of a pair a DI mode 
> value is held in) as seen here, it would have to address the immediately 
> preceding register however (so you can't load R0 this way) and it would 
> clobber it.
> 
> So offhand I think you need an RTL instruction splitter to express this, 
> and then avoid fetching 64 bits worth of data from memory -- for the sake 
> of matching the indexed addressing mode -- where you only need 32 bits.  
> At the hardware instruction level I'd use a scratch register (as with 
> MOVAQ you'd have to waste one anyway) to scale the index and then use 
> MOVAL instead with the modified index.  Where no index is used it gets 
> simpler even as you can just bump up the displacement according to the 
> subreg offset.

Thanks for the info! I've discovered a few additional clues which should help, 
namely the optimizer pass that's introducing the problem. Through process of 
elimination, I discovered that adding "-fno-tree-ter" will prevent the 
unrecognizable insn error. Strangely, I can't cause the problem by using 
"-ftree-ter" and -O0, which seems odd, unless the code is checking directly for 
a -O flag.

Here's an example of a similar line of code (it seems to be triggered by 
accessing a 64-bit int array embedded inside a struct) that fails w/ -O, but 
succeeded with the addition of -fno-tree-ter (assembly output with "-dP"):

#(insn 682 680 686 (set (reg:DI 0 %r0 [orig:136 D.5219 ] [136])
#        (mem:DI (plus:SI (plus:SI (mult:SI (reg/v:SI 11 %r11 [orig:138 j ] 
[138])
#                        (const_int 8 [0x8]))
#                    (reg/v/f:SI 6 %r6 [orig:55 dp ] [55]))
#                (const_int 112 [0x70])) [5 MEM[base: dp_3, index: _569, step: 
8, offset: 112B]+0 S8 A32])) /home/netbsd/current/src/sbin/fsck_ffs/pass1.c:345 
11 {movdi}
#     (expr_list:REG_EQUIV (mem:DI (plus:SI (plus:SI (mult:SI (reg/v:SI 11 %r11 
[orig:138 j ] [138])
#                        (const_int 8 [0x8]))
#                    (reg/v/f:SI 6 %r6 [orig:55 dp ] [55]))
#                (const_int 112 [0x70])) [5 MEM[base: dp_3, index: _569, step: 
8, offset: 112B]+0 S8 A32])
#        (nil)))
        movq 112(%r6)[%r11],%r0 # 682   movdi


That's a pretty complex instruction: base address + (array offset * 8) plus a 
constant offset! It's the attempt to transform this into setting a reg:SI from 
a subreg:SI from the mem:DI that it fails to find a matching insn.

I'll definitely look into the areas you mentioned, because I think you're right 
about where the basic problem is. It sounds like there isn't a bug in the 
particular optimization pass, but the addressing mode is so complicated that 
when you change the movq to a movl, you suddenly can't do the array offset 
correctly in one insn. I also think you're absolutely correct about the need 
for a scratch register, and that the movaq/movq version would have wasted one 
anyway.

The other problem area I want to fix is why the generated move instruction to 
overwrite the return address in expand_eh_return() is being deleted by the 
optimizer (my guess is that it's a bad dead store elimination, but I could be 
off-base). That's the part I hacked around to get C++ exceptions working for 
now with the RTX_FRAME_RELATED_P line I added in gcc/except.c:

#ifdef EH_RETURN_HANDLER_RTX
      rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
      RTX_FRAME_RELATED_P (insn) = 1;   // XXX FIXME in VAX backend?
#else

That hack shouldn't be necessary, and it introduces other problems (adds 
unnecessary .cfi metadata and caused "-Os" compiles to fail for me). So there 
has to be something else going on, especially since other platforms define 
EH_RETURN_HANDLER_RTX and seem to depend on the same behavior for their 
__builtin_eh_return(). But then most of those platforms put the return address 
in a register, or relative to %sp, not %fp. I could easily see some 
machine-independent part of the code thinking the frame-pointer reference meant 
it was a local variable store and not important.

I'll continue to clean up the diffs that I've been working on, and send out 
something when I've made more progress. I like the "cc" attr code that I've 
added to fix the overaggressive elimination of CC0 tests, but the problem is 
that now it's too conservative, because many insns are defined as calls into C 
code which may or may not generate insns that set the condition codes. I have a 
few ideas on how to clean up and refactor that code, but first I had to 
convince myself that most of what's in there now are actually useful 
optimizations, and they seem to be.

Thanks for the help!
-Jake

Reply via email to