On Thu, 2018-07-19 at 08:31 +0100, Richard Sandiford wrote:
>
> > @@ -4706,8 +4730,11 @@ aarch64_process_components (sbitmap components, bool prologue_p)
> >    while (regno != last_regno)
> >      {
> >        /* AAPCS64 section 5.1.2 requires only the bottom 64 bits to be saved
> > -         so DFmode for the vector registers is enough.  */
> > -      machine_mode mode = GP_REGNUM_P (regno) ? E_DImode : E_DFmode;
> > +         so DFmode for the vector registers is enough.  For simd functions
> > +         we want to save the entire register.  */
> > +      machine_mode mode = GP_REGNUM_P (regno) ? E_DImode
> > +        : (aarch64_simd_function_p (cfun->decl) ? E_TFmode : E_DFmode);
>
> This condition also occurs in aarch64_push_regs and aarch64_pop_regs.
> It'd probably be worth splitting it out into a subfunction.
>
> I think you also need to handle the writeback cases, which should work
> for Q registers too.  This will mean extra loadwb_pair and storewb_pair
> patterns.
>
> LGTM otherwise FWIW.
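For reference, the subfunction suggested in the review could be sketched roughly as below. This is only a standalone illustration, not code from the patch: the name reg_save_mode, the save_mode enum, and the register numbering are hypothetical stand-ins for GCC's machine_mode, GP_REGNUM_P and aarch64_simd_function_p (cfun->decl), chosen so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the three machine modes used in the patch
   (E_DImode, E_DFmode, E_TFmode).  Hypothetical, not GCC's types.  */
enum save_mode { MODE_DI, MODE_DF, MODE_TF };

/* Hypothetical register numbering for this sketch only:
   0-30 are general registers, 32-63 are the vector registers.  */
enum { LAST_GP_REG = 30, FIRST_V_REG = 32, LAST_V_REG = 63 };

/* Stand-in for GP_REGNUM_P.  */
bool
gp_regnum_p (unsigned regno)
{
  return regno <= LAST_GP_REG;
}

/* Return the mode in which REGNO should be saved and restored.
   SIMD_FUNCTION_P stands in for aarch64_simd_function_p (cfun->decl).
   General registers are always saved as 64 bits.  Vector registers
   are saved as 64 bits normally, and as the full 128 bits for simd
   functions.  */
enum save_mode
reg_save_mode (bool simd_function_p, unsigned regno)
{
  /* Only general and vector registers are expected here.  */
  assert (regno <= LAST_GP_REG
          || (regno >= FIRST_V_REG && regno <= LAST_V_REG));

  if (gp_regnum_p (regno))
    return MODE_DI;
  return simd_function_p ? MODE_TF : MODE_DF;
}
```

With something of this shape, aarch64_process_components, aarch64_push_regs and aarch64_pop_regs would each call the one helper rather than repeating the conditional.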
Yes, I see where I missed this in aarch64_push_regs and
aarch64_pop_regs.  I think that is why the second of Wilco's two
examples (f2) is wrong.

I am unclear about exactly what is meant by writeback, why we have it,
and how it and callee_adjust are used.  Any chance someone could help
me understand this part of the prologue/epilogue code better?  The
comments in aarch64.c/aarch64.h aren't really helping me understand
what the code is doing or why it is doing it.

Steve Ellcey
sell...@cavium.com