https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82012
--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 29 Aug 2017, krebbel at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82012
>
> --- Comment #5 from Andreas Krebbel <krebbel at gcc dot gnu.org> ---
> (In reply to rguent...@suse.de from comment #4)
> > Not sure.  The user might be deliberately expecting an error when
> > such function is called from wrong target context.  The function
> > might contain inline assembly which violates the caller's ABI
> > (in this case it might contain hard-float code?).
>
> In that case the backend would trigger an error. Because floating point
> registers or rather no instruction dealing with them would be enabled. It
> probably would not be a nice one though.
>
> > Not sure what the libitm use of soft-float is about here.
>
> We have to prevent call-saved FPRs from being used between the libitm
> transaction begin and the target dependent routine saving the registers.

I see.  For the particular code libitm could simply avoid the
always-inline C++ wrapper and directly call the atomic.

> > I'd say it is valid to inline any function not using FP into
> > a function that differs in "soft-float" state?  Similar to the
> > patch I did to the x86 backend allowing -mfpmath differences
> > in that case.
>
> Whether a function does not use FPRs is not easy to figure out. We would just
> go on and let probably the register allocator complain about no FPRs being
> available.

IPA computes this for us, but only somewhat as it looks for FP
expressions only (hopefully taking all inline asm as containing
FP expressions) but not disallowing FP value loads/stores.  See
i386.c:ix86_can_inline_p:

  else if (caller_opts->x_ix86_fpmath != callee_opts->x_ix86_fpmath
           /* If the callee doesn't use FP expressions differences in
              ix86_fpmath can be ignored.  We are called from FEs
              for multi-versioning call optimization, so beware of
              ipa_fn_summaries not available.  */
           && (! ipa_fn_summaries
               || ipa_fn_summaries->get
                    (cgraph_node::get (callee))->fp_expressions))

this means the RA would have to load/store to non-FPR registers which
is what soft-float should guarantee.  I think even FP expressions
should work, dispatching to soft-float routines?  So IPA should
compute an inline_asm flag (but even that would need to name FPRs
in the constraints explicitly to be a problem I guess).

> > Would probably fix this particular case.
> >
> > Consider a flag enabling some vector features, -mfancy-vect, building
> > a TU with said flag and
> >
> > inline void __attribute__((always_inline)) foo ()
> > {
> >   __builtin_fancy_vect_insn ();
> > }
> >
> > void __attribute__((target("no-fancy-vect")))
> > {
> >   return foo ();
> > }
> >
> > with the pre-patched default hook we'd happily inline foo () here
> > (it doesn't have a target attribute!).
>
> If the extra functionality would be pulled in via builtin the backend
> expand_builtin function is supposed to complain about insufficient target
> flags. This only works if the inlining happened before builtin expansion
> though.
>
> > Note at runtime such inlining should be always valid(?) (arm folks
> > make thumb vs. non-thumb as an example - not sure if the linker
> > needs to insert special dispatch code when transitioning, so it
> > might not be ok in that case!).  But as insn patterns are
> > usually guarded with some insn-enablement conditions we'd ICE.
>
> I think the problem is how we make sure to detect if a feature disabled by the
> caller is being used in the callee. I think it should work for builtins (check
> flags in expand_builtin) and it often works for target flags which change the
> set of available registers (e.g. soft-float/hard-float on s390).
>
> In inline assemblies it works with soft-float as long as the register
> allocator is required to allocate an FPR. So we get an error when calling
> foo2. But unfortunately not for foo which only clobbers an FPR:
>
> void __attribute__((always_inline))
> foo ()
> {
>   asm volatile ("lzdr %%f0" : : : "%f0");
> }
>
> void __attribute__((always_inline))
> foo2 ()
> {
>   double a = 1.0;
>   asm volatile ("%0" : : "f" (a));
> }
>
> void __attribute__((target("soft-float")))
> bar ()
> {
>   foo ();
> }
>
> It should also work for most of the inline assemblies. The assembler would
> complain about the special feature instruction not being available with the
> current set of options as long as the compile options are passed to the
> assembler. However, it would not help for instructions specified with .long or
> .insn in the asm snippet.
>
> t3.c:
>
> void __attribute__((always_inline))
> foo ()
> {
>   asm volatile ("vaf %v0,%v0,%v0"); // z13 instruction
> }
>
> void __attribute__((target("arch=zEC12")))
> bar ()
> {
>   foo ();
> }
>
> cc1plus -O3 t3.c -march=z13
> as t3.s
> t3.c: Assembler messages:
> t3.c:4: Error: Unrecognized opcode: `vaf'
>
> This works because we pass the function specific options with gas pseudo
> commands to binutils:
>
>         .machinemode zarch
>         .machine "zEC12"
> .globl _Z3barv
>         .type   _Z3barv, @function
> _Z3barv:
> .LFB1:
>         .cfi_startproc
> #APP
> # 4 "t3.c" 1
>         vaf %v0,%v0,%v0
> # 0 "" 2
> #NO_APP
>         br      %r14
>
>
> So for S/390 I currently would tend to always allow inlining regardless of the
> target attributes hoping that the majority of problems would be caught (with
> obscure error messages mostly).

ICEs you mean, or assembler errors.  On x86 we avoid the inlining
to not run into obscure ICEs (but got hit with inline failure errors
with -flto due to the "bugs").

Ok, then do that (for always-inline functions I guess).

Richard.
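
The inlining rule discussed in this comment (ignore a float-mode difference only when the callee is known to contain no FP expressions) can be modelled in isolation. What follows is a minimal sketch in C and not GCC code: struct target_opts, struct fn_summary and can_inline_model are hypothetical names, and a NULL summary pointer is assumed to stand for the "ipa_fn_summaries not available" case mentioned in the i386.c comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-function target options. */
struct target_opts {
    bool soft_float;        /* true if compiled with soft-float */
};

/* Hypothetical stand-in for the IPA summary of a callee. */
struct fn_summary {
    bool fp_expressions;    /* true if the callee contains FP expressions */
};

/* Model of the decision: return true if inlining callee into caller
   would be allowed.  callee_summary may be NULL, which models the
   summaries not being available (e.g. when called from a front end).  */
bool can_inline_model(const struct target_opts *caller,
                      const struct target_opts *callee,
                      const struct fn_summary *callee_summary)
{
    assert(caller != NULL && callee != NULL);

    /* Same float mode: nothing to check in this model. */
    if (caller->soft_float == callee->soft_float)
        return true;

    /* Modes differ: be conservative unless the summary proves
       that the callee has no FP expressions. */
    if (callee_summary == NULL || callee_summary->fp_expressions)
        return false;

    return true;
}
```

Under this model a hard-float callee without FP expressions may be inlined into a soft-float caller, while a missing summary or a callee with FP expressions blocks the inlining. The model does not cover the inline asm case raised above (an asm that only clobbers an FPR), which would need the separate inline_asm flag suggested in the comment.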