Hi all,

For several years I’ve been eager to find the time to fix the bugs in C++ 
exceptions on VAX to get them working on NetBSD, because they’ve been broken 
for many years and it looked like only a few changes were needed to get them 
working. Without C++ exceptions, the NetBSD test suite can’t be run. The good 
news is that I was able to fix all the bugs in the VAX machine description to 
make C++ exceptions work in GCC 4.8.5 (version unimportant). I wrote a blog 
post explaining the bugs, with patches:

> http://jakehamby.com/?p=7

Here’s a short summary, with the diffs in text form at the end of this email.

1) Replace #define FRAME_POINTER_CFA_OFFSET(FNDECL) 0 with #define 
ARG_POINTER_CFA_OFFSET(FNDECL) 0 in gcc/config/vax/elf.h and 
gcc/config/vax/vax.h. This changes the definition of __builtin_dwarf_cfa() to 
return %ap instead of %fp, which correctly points to CFA. Previously, the stack 
unwinder was crashing in _Unwind_RaiseException() trying to follow bad pointers 
from the initial CFA.

2) Define EH_RETURN_DATA_REGNO(N) to include only R2 and R3 (instead of R2-R5) 
and add code to vax_expand_prologue() in gcc/config/vax/vax.c to add R2-R3 to 
the procedure entry mask but only if crtl->calls_eh_return is set. This fixes a 
crash when the stack unwinder tried to write values to R2 and R3 in the 
previous stack frame via __builtin_eh_return_data_regno (0) and 
__builtin_eh_return_data_regno (1).

3) Removed definitions of EH_RETURN_STACKADJ_RTX and STARTING_FRAME_OFFSET from 
gcc/config/vax/elf.h. It’s not necessary to remember the stack adjustment or to 
waste four bytes on every stack frame for a value that’s not needed. Also 
remove the suspicious changes in gcc/config/vax/vax.md to the definitions of 
call_pop and call_value regarding DW_CFA_GNU_args_size and EH unwinding. I 
reverted to the previous versions from an older version of GCC, adding a few 
useful comments that had been removed.

4) The last bug is the one I understand the least. I’m hoping someone reading 
this can implement a correct fix. What I was seeing after making all the 
previous changes to fix the other bugs is that my test program failed to catch 
any exceptions, but instead returned normally to the original return path.

Investigation revealed that GCC was correctly generating the necessary move 
instruction to copy the second parameter passed to __builtin_eh_return() into 
the return address, because EH_RETURN_HANDLER_RTX had been defined correctly in 
config/vax/elf.h. Here’s what the call looks like in gcc/except.c:

#ifdef EH_RETURN_HANDLER_RTX
      rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
#else
      error ("__builtin_eh_return not supported on this target");
#endif

The problem was that the optimizer is deleting the final move instruction when 
I compile with -O or higher. The assembly code at -O0 (no optimization) 
generated for the __builtin_eh_return() call at the end of 
_Unwind_RaiseException() looked like:

        calls $2,_Unwind_DebugHook
        movl -12(%fp),%r1
        movl %r1,16(%fp)
        ret
        .cfi_endproc

But then when I compiled with -O1 or -O2, all I saw was:

        calls $2,_Unwind_DebugHook
        ret
        .cfi_endproc

This was a mystery for me and I don’t know enough about how the final peephole 
optimizer works to really track down why it thinks it can remove the move call 
to store the previous return address. My workaround was to add a call to 
RTX_FRAME_RELATED_P (insn) = 1; after the emit_move_insn() in gcc/except.c, 
which was used in vax_expand_prologue() to mark the procedure entry mask.

By making this change, the optimizer no longer removes the call to write the 
value to the previous stack pointer, but it adds an extra line of .cfi 
exception info, which seems unnecessary since the code is immediately going to 
return from the call and any adjustment made by the DWARF stack unwinder will 
already have been done. Here’s what the optimized code looks like with the 
patch (%r6 had been loaded earlier):

        calls $2,_Unwind_DebugHook
        movl %r6,16(%fp)
        .cfi_offset 6, -36
        ret
        .cfi_endproc

With that final change, C++ exception handling now finally works on NetBSD/vax, 
and I was able to successfully run the vast majority of the tests in the ATF 
testsuite, which had been completely inaccessible when I started due to both 
atf-run and atf-report immediately dumping core due to the bad pointers that I 
fixed. Now I have a bunch of new bugs to track down fixes for, but I think this 
was the hardest set of problems that needed to be solved to bring NetBSD on VAX 
up to the level of the other NetBSD ports.

Here are the diffs I have so far. They should apply to any recent version of 
GCC (tested on GCC 4.8.5). With the exception of the hack to gcc/except.c, the 
other diffs are ready to submit to NetBSD as well as to upstream GCC. The fix 
I’d like to see for the final problem I discovered of the emit_move_insn() 
being deleted by the optimizer would be another patch to one of the files in 
the gcc/config/vax directory to explain to the optimizer that writing to 
16(%fp) is important and not something to be deleted from the epilogue (perhaps 
it thinks it’s writing to a local variable in the frame that’s about to be 
destroyed?).

I didn’t see any indication that any other GCC ports required anything special 
to tell the optimizer not to delete the move instruction to 
EH_RETURN_HANDLER_RTX, so the other suspicion I have is that there may be a bug 
specific to VAX’s peephole optimizer or related functions. Any ideas?

Index: dist/gcc/except.c
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/except.c,v
retrieving revision 1.3
diff -u -r1.3 except.c
--- dist/gcc/except.c   23 Sep 2015 03:39:10 -0000      1.3
+++ dist/gcc/except.c   21 Mar 2016 20:58:33 -0000
@@ -2207,7 +2207,8 @@
 #endif
     {
 #ifdef EH_RETURN_HANDLER_RTX
-      emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+      rtx insn = emit_move_insn (EH_RETURN_HANDLER_RTX, crtl->eh.ehr_handler);
+      RTX_FRAME_RELATED_P (insn) = 1;
 #else
       error ("__builtin_eh_return not supported on this target");
 #endif
Index: dist/gcc/config/vax/elf.h
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/config/vax/elf.h,v
retrieving revision 1.3
diff -u -r1.3 elf.h
--- dist/gcc/config/vax/elf.h   23 Sep 2015 03:39:18 -0000      1.3
+++ dist/gcc/config/vax/elf.h   21 Mar 2016 20:58:33 -0000
@@ -45,18 +45,8 @@
    count pushed by the CALLS and before the start of the saved registers.  */
 #define INCOMING_FRAME_SP_OFFSET 0
 
-/* Offset from the frame pointer register value to the top of the stack.  */
-#define FRAME_POINTER_CFA_OFFSET(FNDECL) 0
-
-/* We use R2-R5 (call-clobbered) registers for exceptions.  */
-#define EH_RETURN_DATA_REGNO(N) ((N) < 4 ? (N) + 2 : INVALID_REGNUM)
-
-/* Place the top of the stack for the DWARF2 EH stackadj value.  */
-#define EH_RETURN_STACKADJ_RTX                                         \
-  gen_rtx_MEM (SImode,                                                 \
-              plus_constant (Pmode,                                    \
-                             gen_rtx_REG (Pmode, FRAME_POINTER_REGNUM),\
-                             -4))
+/* We use R2-R3 (call-clobbered) registers for exceptions.  */
+#define EH_RETURN_DATA_REGNO(N) ((N) < 2 ? (N) + 2 : INVALID_REGNUM)
 
 /* Simple store the return handler into the call frame.  */
 #define EH_RETURN_HANDLER_RTX                                          \
@@ -66,10 +56,6 @@
                              16))
 
 
-/* Reserve the top of the stack for exception handler stackadj value.  */
-#undef STARTING_FRAME_OFFSET
-#define STARTING_FRAME_OFFSET -4
-
 /* The VAX wants no space between the case instruction and the jump table.  */
 #undef  ASM_OUTPUT_BEFORE_CASE_LABEL
 #define ASM_OUTPUT_BEFORE_CASE_LABEL(FILE, PREFIX, NUM, TABLE)
Index: dist/gcc/config/vax/vax.c
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/config/vax/vax.c,v
retrieving revision 1.3
diff -u -r1.3 vax.c
--- dist/gcc/config/vax/vax.c   23 Sep 2015 03:39:18 -0000      1.3
+++ dist/gcc/config/vax/vax.c   21 Mar 2016 20:58:33 -0000
@@ -164,7 +164,8 @@
 
   offset = 20;
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
-    if (df_regs_ever_live_p (regno) && !call_used_regs[regno])
+    if ((df_regs_ever_live_p (regno) && !call_used_regs[regno])
+       || (crtl->calls_eh_return && regno >= 2 && regno < 4))
       {
         mask |= 1 << regno;
         offset += 4;
Index: dist/gcc/config/vax/vax.h
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/config/vax/vax.h,v
retrieving revision 1.3
diff -u -r1.3 vax.h
--- dist/gcc/config/vax/vax.h   23 Sep 2015 03:39:18 -0000      1.3
+++ dist/gcc/config/vax/vax.h   21 Mar 2016 20:58:33 -0000
@@ -168,12 +168,12 @@
 /* Base register for access to local variables of the function.  */
 #define FRAME_POINTER_REGNUM VAX_FP_REGNUM
 
-/* Offset from the frame pointer register value to the top of stack.  */
-#define FRAME_POINTER_CFA_OFFSET(FNDECL) 0
-
 /* Base register for access to arguments of the function.  */
 #define ARG_POINTER_REGNUM VAX_AP_REGNUM
 
+/* Offset from the argument pointer register value to the CFA.  */
+#define ARG_POINTER_CFA_OFFSET(FNDECL) 0
+
 /* Register in which static-chain is passed to a function.  */
 #define STATIC_CHAIN_REGNUM 0
 
Index: dist/gcc/config/vax/vax.md
===================================================================
RCS file: /cvsroot/src/external/gpl3/gcc.old/dist/gcc/config/vax/vax.md,v
retrieving revision 1.3
diff -u -r1.3 vax.md
--- dist/gcc/config/vax/vax.md  23 Sep 2015 03:39:18 -0000      1.3
+++ dist/gcc/config/vax/vax.md  21 Mar 2016 20:58:33 -0000
@@ -1306,6 +1309,11 @@
   ""
   "decl %0\;jgequ %l1")
 

+;; Note that operand 1 is total size of args, in bytes,
+;; and what the call insn wants is the number of words.
+;; It is used in the call instruction as a byte, but in the addl2 as
+;; a word.  Since the only time we actually use it in the call instruction
+;; is when it is a constant, SImode (for addl2) is the proper mode.
 (define_expand "call_pop"
   [(parallel [(call (match_operand:QI 0 "memory_operand" "")
                    (match_operand:SI 1 "const_int_operand" ""))
@@ -1314,24 +1322,17 @@
                            (match_operand:SI 3 "immediate_operand" "")))])]
   ""
 {
-  gcc_assert (INTVAL (operands[3]) <= 255 * 4 && INTVAL (operands[3]) % 4 == 
0);
-
-  /* Operand 1 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[1] = GEN_INT (INTVAL (operands[3]) + 4);
+  gcc_assert (INTVAL (operands[1]) <= 255 * 4);
+  operands[1] = GEN_INT ((INTVAL (operands[1]) + 3) / 4);
 })
 
 (define_insn "*call_pop"
   [(call (match_operand:QI 0 "memory_operand" "m")
         (match_operand:SI 1 "const_int_operand" "n"))
    (set (reg:SI VAX_SP_REGNUM) (plus:SI (reg:SI VAX_SP_REGNUM)
-                                       (match_operand:SI 2 "immediate_operand" 
"i")))]
+                            (match_operand:SI 2 "immediate_operand" "i")))]
   ""
-{
-  operands[1] = GEN_INT ((INTVAL (operands[1]) - 4) / 4);
-  return "calls %1,%0";
-})
+  "calls %1,%0")
 
 (define_expand "call_value_pop"
   [(parallel [(set (match_operand 0 "" "")
@@ -1342,12 +1343,8 @@
                            (match_operand:SI 4 "immediate_operand" "")))])]
   ""
 {
-  gcc_assert (INTVAL (operands[4]) <= 255 * 4 && INTVAL (operands[4]) % 4 == 
0);
-
-  /* Operand 2 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[2] = GEN_INT (INTVAL (operands[4]) + 4);
+  gcc_assert (INTVAL (operands[2]) <= 255 * 4);
+  operands[2] = GEN_INT ((INTVAL (operands[2]) + 3) / 4);
 })
 
 (define_insn "*call_value_pop"
@@ -1357,47 +1354,20 @@
    (set (reg:SI VAX_SP_REGNUM) (plus:SI (reg:SI VAX_SP_REGNUM)
                                        (match_operand:SI 3 "immediate_operand" 
"i")))]
   ""
-  "*
-{
-  operands[2] = GEN_INT ((INTVAL (operands[2]) - 4) / 4);
-  return \"calls %2,%1\";
-}")
-
-(define_expand "call"
-  [(call (match_operand:QI 0 "memory_operand" "")
-      (match_operand:SI 1 "const_int_operand" ""))]
-  ""
-  "
-{
-  /* Operand 1 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[1] = GEN_INT (INTVAL (operands[1]) + 4);
-}")
+  "calls %2,%1")
 
-(define_insn "*call"
-   [(call (match_operand:QI 0 "memory_operand" "m")
-         (match_operand:SI 1 "const_int_operand" ""))]
+;; Define another set of these for the case of functions with no operands.
+;; These will allow the optimizers to do a slightly better job.
+(define_insn "call"
+  [(call (match_operand:QI 0 "memory_operand" "m")
+        (const_int 0))]
   ""
   "calls $0,%0")
 
-(define_expand "call_value"
-  [(set (match_operand 0 "" "")
-      (call (match_operand:QI 1 "memory_operand" "")
-           (match_operand:SI 2 "const_int_operand" "")))]
-  ""
-  "
-{
-  /* Operand 2 is the number of bytes to be popped by DW_CFA_GNU_args_size
-     during EH unwinding.  We must include the argument count pushed by
-     the calls instruction.  */
-  operands[2] = GEN_INT (INTVAL (operands[2]) + 4);
-}")
-
-(define_insn "*call_value"
+(define_insn "call_value"
   [(set (match_operand 0 "" "")
        (call (match_operand:QI 1 "memory_operand" "m")
-             (match_operand:SI 2 "const_int_operand" "")))]
+             (const_int 0)))]
   ""
   "calls $0,%1")
 

Reply via email to