Steve Ellcey <sell...@cavium.com> writes:
> This is a patch 4 to support the Aarch64 SIMD ABI [1] in GCC.
>
> It defines a new target hook targetm.check_part_clobbered that
> takes a rtx_insn and checks to see if it is a call to a function
> that may clobber partial registers.  It returns true by default,
> which results in the current behaviour, but if we can determine
> that the function will not do any partial clobbers (like the
> Aarch64 SIMD functions) then it returns false.

Sorry, have a feeling this is going to be at least partly going back
on what I said before, but...

The patch only really deals with one user of the part-clobbered info,
namely LRA.  And as it happens, that caller does have access to the
relevant call insns (which was a concern before), since you walk them
in:

  /* Check to see if any call might do a partial clobber. */
  partial_clobber_in_bb = false;
  FOR_BB_INSNS_REVERSE_SAFE (bb, curr_insn, next)
    {
      if (CALL_P (curr_insn)
          && targetm.check_part_clobbered (curr_insn))
        {
          partial_clobber_in_bb = true;
          break;
        }
    }

Since we're looking at the call insns anyway, we could have a hook that
"jousts" two calls and picks the one that preserves *fewer* registers.
This would mean that the loop produces a single instruction that
conservatively describes the call-preserved registers.  We could then
stash that instruction in lra_reg instead of the current
check_part_clobbered boolean.

The hook should by default be a null pointer, so that we can avoid
the instruction walk on targets that don't need it.

That would mean that LRA would always have a call instruction to hand
when asking about call-preserved information.  So I think we should
add an insn parameter to targetm.hard_regno_call_part_clobbered,
with a null insn selecting the default behaviour.  I know it's
going to be a pain to update all callers and targets, sorry.

This would also cope with the fact that, when SVE is enabled, SIMD
functions *do* still part-clobber the registers, just in a wider mode.
The current patch doesn't handle that, and it would be hard to fix
without pessimistically treating the functions as clobbering above
64 bits rather than 128 bits.

(Really, it would be good to overhaul the whole handling of ABIs so that
we have all the information about an ABI in one structure and can ask
"what ABI does this call use"?  But that's a lot of work.  The above
should be good enough as long as the call-preserved behaviour of ABIs
follows a total ordering, which it does for AArch64.)

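To make the jousting idea concrete, here is a rough, self-contained
sketch.  It is not GCC code: Insn, joust_calls, most_clobbering_call,
call_part_clobbered and kBasePreservedBits are all hypothetical names
invented for illustration, and each call's ABI is reduced to a single
number (how many low bits of a vector register the callee preserves),
which only works because the ABIs are assumed to be totally ordered.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an rtx_insn; only what the sketch needs.
struct Insn {
    bool is_call;             // plays the role of CALL_P (insn)
    unsigned preserved_bits;  // low bits of a vector reg the callee preserves
};

// Assumed default: the base ABI preserves only the low 64 bits.
constexpr unsigned kBasePreservedBits = 64;

// Hypothetical "joust" hook: of two calls, return the one that preserves
// *fewer* registers.  A null argument means "no call seen yet".
const Insn *joust_calls(const Insn *a, const Insn *b) {
    if (!a) return b;
    if (!b) return a;
    assert(a->is_call && b->is_call);
    return (b->preserved_bits < a->preserved_bits) ? b : a;
}

// Walk the block in reverse and fold every call through the joust, so the
// result is one insn that conservatively describes the whole block.
const Insn *most_clobbering_call(const std::vector<Insn> &bb) {
    const Insn *worst = nullptr;
    for (auto it = bb.rbegin(); it != bb.rend(); ++it)
        if (it->is_call)
            worst = joust_calls(worst, &*it);
    return worst;
}

// Sketch of a part-clobbered query with an extra insn parameter; a null
// insn selects the default (base ABI) behaviour.
bool call_part_clobbered(const Insn *call, unsigned mode_bits) {
    unsigned preserved = call ? call->preserved_bits : kBasePreservedBits;
    return mode_bits > preserved;
}
```

With this shape, a block containing only 128-bit-preserving calls reports
no partial clobber for a 128-bit mode, but still reports one for a wider
mode, which is the SVE case mentioned above.
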
Thanks, Richard