On 27/10/15 14:50, H.J. Lu wrote:
On Tue, Oct 27, 2015 at 7:34 AM, Ramana Radhakrishnan
<ramana.radhakrish...@foss.arm.com> wrote:
OK, then it's a fairly x86-64 specific optimization, because we can't do
"call *mem" on aarch64 and some other targets.
It is a fairly x86_64 specific optimization and doesn't apply to AArch64.
The real question is what impact removing the generic code has on aarch64:
is it a no-op or not for the existing -fno-plt implementation in the AArch64
backend? The only case of interest is the bit below in calls.c. It looks like
it may well be redundant with the logic already in the backend, but I have
not done the full analysis to convince myself that the code in the backend
is sufficient.
- && (!flag_plt
- || lookup_attribute ("noplt", DECL_ATTRIBUTES (fndecl_or_type)))
- && !targetm.binds_local_p (fndecl_or_type))
-fno-plt is a backend-specific optimization and should be handled
in the backend.
Removing that generic code has broken aarch64.
The code in calls.c shouldn't actually prevent the "call *mem"
opportunity on x86-64, because the combine pass should combine
"load reg, symbol + call reg" back into "call *mem" on x86-64,
since there is a related define_insn.
The testcases in PR67215, which are included in your patch, are all
loops. They failed because either RTL PRE or the loop pass hoists the
address calculation pattern out of the loop as an invariant, into a
basic block different from that of the call_insn. Since the combine
pass only works within the scope of a single basic block, we miss the
combine opportunity on x86-64.
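
To make the hoisting problem concrete, here is a minimal sketch of the kind of loop being described. It is not the actual PR67215 testcase; the names bar and sum_calls are hypothetical, and bar is defined locally only so that the example builds on its own. The instruction sequences in the comments are approximate.

```c
/* Hypothetical stand-in for an external function. In the real scenario
   it would live in another shared object, and the caller would be built
   with something like: gcc -O2 -fPIC -fno-plt */
__attribute__((noinline)) int bar(int x)
{
    return x + 1;
}

/* The address of bar does not change across iterations, so RTL PRE or
   the loop pass may treat the GOT load as a loop invariant and move it
   in front of the loop. On x86-64 that gives roughly

       movq  bar@GOTPCREL(%rip), %reg     (before the loop)
       call  *%reg                        (inside the loop)

   instead of the single instruction

       call  *bar@GOTPCREL(%rip)

   because the load and the call now sit in different basic blocks and
   combine only looks within one block. */
int sum_calls(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += bar(i);
    return total;
}
```

Comparing the assembly for the loop with that of a single straight-line call to bar should show whether the load was hoisted on a given compiler version.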
I am not sure whether anyone has experimented with extending the combine
pass to a larger scope.