On 29/06/16 23:08, Eric Botcazou wrote:
> Index: config/aarch64/aarch64.h
> ===================================================================
> --- config/aarch64/aarch64.h	(revision 237789)
> +++ config/aarch64/aarch64.h	(working copy)
> @@ -779,6 +779,9 @@ typedef struct
>     correctly.  */
>  #define TRAMPOLINE_SECTION	text_section
>  
> +/* Use custom descriptors instead of trampolines when possible.  */
> +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1
> +
Eric,

If I understand how this is supposed to work, then this is not future-proof
against changes to the architecture.  The bottom two bits in both AArch32
(arm) and AArch64 are reserved for future use by the architecture; they must
not be used by software for tricks like this.  As has already been seen in
AArch32 state, bit 0 is used to indicate the ARM/Thumb ISA selection.

The patch to arm.h is similarly problematic in this regard.

R.

> Hi,
>
> this patch implements generic support for the elimination of stack
> trampolines and, consequently, of the need to make the stack executable
> when pointers to nested functions are used.  That's done on a per-language
> and per-target basis (i.e. there is 1 language hook and 1 target hook to
> parameterize it) and there are no changes whatsoever in code generation if
> both are not turned on (and the patch implements a -ftrampolines option to
> let the user override them).
>
> The idea is based on the fact that, for targets using function descriptors
> as per their ABI like IA-64, AIX or VMS platforms, stack trampolines
> "degenerate" into descriptors built at run time on the stack and thus made
> up of data only, which in turn means that the stack doesn't need to be made
> executable.
>
> This descriptor-based scheme is implemented generically for nested
> functions, i.e. the nested function lowering pass builds generic
> descriptors instead of trampolines on the stack when encountering pointers
> to nested functions, which means that there are 2 kinds of pointers to
> functions and therefore a run-time identification mechanism is needed for
> indirect calls to distinguish them.
>
> Because of that, enabling the support breaks binary compatibility (for code
> manipulating pointers to nested functions).
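[Editorial note: the run-time identification mechanism just described can be
sketched in plain C.  The two-pointer layout (static chain first, then code
address) and the tag-by-addition scheme follow the patch; the names and the
tag value here are purely illustrative, not GCC code.]

```c
#include <stdint.h>

/* Hypothetical layout of a custom descriptor, matching the order used
   by the patch's expand_builtin_init_descriptor: static chain pointer
   first, code address second.  */
struct descriptor
{
  void *static_chain;
  void *code_address;
};

/* Stand-in for targetm.calls.custom_function_descriptors: the value
   added to a descriptor's address to mark it (1 on most targets, 2 on
   ARM where bit 0 already selects Thumb).  */
#define DESCR_TAG 1

/* Taking the address of a nested function yields a tagged (misaligned)
   pointer rather than a code address.  */
static uintptr_t
tag_descriptor (struct descriptor *d)
{
  return (uintptr_t) d + DESCR_TAG;
}

/* Run-time identification at an indirect call site: if the tag bits
   are set, strip them and load the static chain and code address from
   the descriptor; otherwise the pointer is an ordinary code address.  */
static void *
resolve_target (uintptr_t fp, void **chain_out)
{
  if (fp & DESCR_TAG)
    {
      struct descriptor *d = (struct descriptor *) (fp - DESCR_TAG);
      *chain_out = d->static_chain;
      return d->code_address;
    }
  *chain_out = 0;
  return (void *) fp;
}
```

This is essentially what prepare_call_address emits inline in RTL: a test of
the low bit, a predicted-taken branch around the descriptor loads, and the
loads of chain and entry point at offsets -bit_val and -bit_val + word size.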
That's OK for Ada and nested > functions are first-class citizens in the language anyway so we really need > this, but not for C so for example Ada doesn't use it at the interface with C > (when objects have "convention C" in Ada parlance). > > This was bootstrapped/regtested on x86_64-suse-linux but AdaCore has been > using it on native platforms (Linux, Windows, Solaris, etc) for years. > > OK for the mainline? > > > 2016-06-29 Eric Botcazou <ebotca...@adacore.com> > > PR ada/37139 > PR ada/67205 > * common.opt (-ftrampolines): New option. > * doc/invoke.texi (Code Gen Options): Document it. > * doc/tm.texi.in (Trampolines): Add TARGET_CUSTOM_FUNCTION_DESCRIPTORS > * doc/tm.texi: Regenerate. > * builtins.def: Add init_descriptor and adjust_descriptor. > * builtins.c (expand_builtin_init_trampoline): Do not issue a warning > on platforms with descriptors. > (expand_builtin_init_descriptor): New function. > (expand_builtin_adjust_descriptor): Likewise. > (expand_builtin) <BUILT_IN_INIT_DESCRIPTOR>: New case. > <BUILT_IN_ADJUST_DESCRIPTOR>: Likewise. > * calls.c (prepare_call_address): Remove SIBCALLP parameter and add > FLAGS parameter. Deal with indirect calls by descriptor and adjust. > Set STATIC_CHAIN_REG_P on the static chain register, if any. > (call_expr_flags): Set ECF_BY_DESCRIPTOR for calls by descriptor. > (expand_call): Likewise. Move around call to prepare_call_address > and pass all flags to it. > * cfgexpand.c (expand_call_stmt): Reinstate CALL_EXPR_BY_DESCRIPTOR. > * gimple.h (enum gf_mask): New GF_CALL_BY_DESCRIPTOR value. > (gimple_call_set_by_descriptor): New setter. > (gimple_call_by_descriptor_p): New getter. > * gimple.c (gimple_build_call_from_tree): Set CALL_EXPR_BY_DESCRIPTOR. > (gimple_call_flags): Deal with GF_CALL_BY_DESCRIPTOR. > * langhooks.h (struct lang_hooks): Add custom_function_descriptors. > * langhooks-def.h (LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS): Define. > (LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS. 
> * rtl.h (STATIC_CHAIN_REG_P): New macro. > * rtlanal.c (find_first_parameter_load): Skip static chain registers. > * target.def (custom_function_descriptors): New POD hook. > * tree.h (FUNC_ADDR_BY_DESCRIPTOR): New flag on ADDR_EXPR. > (CALL_EXPR_BY_DESCRIPTOR): New flag on CALL_EXPR. > * tree-core.h (ECF_BY_DESCRIPTOR): New mask. > Document FUNC_ADDR_BY_DESCRIPTOR and CALL_EXPR_BY_DESCRIPTOR. > * tree.c (make_node_stat) <tcc_declaration>: Set function alignment to > DEFAULT_FUNCTION_ALIGNMENT instead of FUNCTION_BOUNDARY. > (build_common_builtin_nodes): Initialize init_descriptor and > adjust_descriptor. > * tree-nested.c: Include target.h. > (struct nesting_info): Add 'any_descr_created' field. > (get_descriptor_type): New function. > (lookup_element_for_decl): New function extracted from... > (create_field_for_decl): Likewise. > (lookup_tramp_for_decl): ...here. Adjust. > (lookup_descr_for_decl): New function. > (convert_tramp_reference_op): Deal with descriptors. > (build_init_call_stmt): New function extracted from... > (finalize_nesting_tree_1): ...here. Adjust and deal with descriptors. > * defaults.h (DEFAULT_FUNCTION_ALIGNMENT): Define. > (TRAMPOLINE_ALIGNMENT): Set to above instead of FUNCTION_BOUNDARY. > * config/aarch64/aarch64.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS):Define > * config/alpha/alpha.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise. > * config/arm/arm.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise. > * config/arm/arm.c (arm_function_ok_for_sibcall): Return false for an > indirect call by descriptor if all the argument registers are used. > * config/i386/i386.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Define. > * config/ia64/ia64.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise. > * config/mips/mips.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise. > * config/pa/pa.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise. 
> * config/rs6000/rs6000.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS):Likewise > * config/sparc/sparc.h (TARGET_CUSTOM_FUNCTION_DESCRIPTORS): Likewise. > ada/ > * gcc-interface/misc.c (LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS):Define > * gcc-interface/trans.c (Attribute_to_gnu) <Attr_Access>: Deal with > a zero TARGET_CUSTOM_FUNCTION_DESCRIPTORSspecially for 'Code_Address. > Otherwise, if TARGET_CUSTOM_FUNCTION_DESCRIPTORS is positive, set > FUNC_ADDR_BY_DESCRIPTOR for 'Access/'Unrestricted_Access of nested > subprograms if the type can use an internal representation. > (call_to_gnu): Likewise, but set CALL_EXPR_BY_DESCRIPTOR on indirect > calls if the type can use an internal representation. > > > 2016-06-29 Eric Botcazou <ebotca...@adacore.com> > > * gnat.dg/trampoline3.adb: New test. > * gnat.dg/trampoline4.adb: Likewise. > > > p.diff > > > Index: ada/gcc-interface/misc.c > =================================================================== > --- ada/gcc-interface/misc.c (revision 237848) > +++ ada/gcc-interface/misc.c (working copy) > @@ -1416,6 +1416,8 @@ get_lang_specific (tree node) > #define LANG_HOOKS_EH_PERSONALITY gnat_eh_personality > #undef LANG_HOOKS_DEEP_UNSHARING > #define LANG_HOOKS_DEEP_UNSHARING true > +#undef LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS > +#define LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS true > > struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER; > > Index: ada/gcc-interface/trans.c > =================================================================== > --- ada/gcc-interface/trans.c (revision 237850) > +++ ada/gcc-interface/trans.c (working copy) > @@ -1702,6 +1702,17 @@ Attribute_to_gnu (Node_Id gnat_node, tre > > if (TREE_CODE (gnu_expr) == ADDR_EXPR) > TREE_NO_TRAMPOLINE (gnu_expr) = TREE_CONSTANT (gnu_expr) = 1; > + > + /* On targets for which function symbols denote a descriptor, the > + code address is stored within the first slot of the descriptor > + so we do an additional dereference: > + result = *((result_type *) result) > + 
where we expect result to be of some pointer type already. */ > + if (targetm.calls.custom_function_descriptors == 0) > + gnu_result > + = build_unary_op (INDIRECT_REF, NULL_TREE, > + convert (build_pointer_type (gnu_result_type), > + gnu_result)); > } > > /* For 'Access, issue an error message if the prefix is a C++ method > @@ -1728,10 +1739,19 @@ Attribute_to_gnu (Node_Id gnat_node, tre > /* Also check the inlining status. */ > check_inlining_for_nested_subprog (TREE_OPERAND (gnu_expr, 0)); > > - /* Check that we're not violating the No_Implicit_Dynamic_Code > - restriction. Be conservative if we don't know anything > - about the trampoline strategy for the target. */ > - Check_Implicit_Dynamic_Code_Allowed (gnat_node); > + /* Moreover, for 'Access or 'Unrestricted_Access with non- > + foreign-compatible representation, mark the ADDR_EXPR so > + that we can build a descriptor instead of a trampoline. */ > + if ((attribute == Attr_Access > + || attribute == Attr_Unrestricted_Access) > + && targetm.calls.custom_function_descriptors > 0 > + && Can_Use_Internal_Rep (Etype (gnat_node))) > + FUNC_ADDR_BY_DESCRIPTOR (gnu_expr) = 1; > + > + /* Otherwise, we need to check that we are not violating the > + No_Implicit_Dynamic_Code restriction. */ > + else if (targetm.calls.custom_function_descriptors != 0) > + Check_Implicit_Dynamic_Code_Allowed (gnat_node); > } > } > break; > @@ -4228,6 +4248,7 @@ Call_to_gnu (Node_Id gnat_node, tree *gn > tree gnu_after_list = NULL_TREE; > tree gnu_retval = NULL_TREE; > tree gnu_call, gnu_result; > + bool by_descriptor = false; > bool went_into_elab_proc = false; > bool pushed_binding_level = false; > Entity_Id gnat_formal; > @@ -4267,7 +4288,15 @@ Call_to_gnu (Node_Id gnat_node, tree *gn > type the access type is pointing to. Otherwise, get the formals from > the > entity being called. 
*/ > if (Nkind (Name (gnat_node)) == N_Explicit_Dereference) > - gnat_formal = First_Formal_With_Extras (Etype (Name (gnat_node))); > + { > + gnat_formal = First_Formal_With_Extras (Etype (Name (gnat_node))); > + > + /* If the access type doesn't require foreign-compatible > representation, > + be prepared for descriptors. */ > + if (targetm.calls.custom_function_descriptors > 0 > + && Can_Use_Internal_Rep (Etype (Prefix (Name (gnat_node))))) > + by_descriptor = true; > + } > else if (Nkind (Name (gnat_node)) == N_Attribute_Reference) > /* Assume here that this must be 'Elab_Body or 'Elab_Spec. */ > gnat_formal = Empty; > @@ -4668,6 +4697,7 @@ Call_to_gnu (Node_Id gnat_node, tree *gn > > gnu_call > = build_call_vec (gnu_result_type, gnu_subprog_addr, gnu_actual_vec); > + CALL_EXPR_BY_DESCRIPTOR (gnu_call) = by_descriptor; > set_expr_location_from_node (gnu_call, gnat_node); > > /* If we have created a temporary for the return value, initialize it. */ > Index: builtins.c > =================================================================== > --- builtins.c (revision 237789) > +++ builtins.c (working copy) > @@ -4621,8 +4621,9 @@ expand_builtin_init_trampoline (tree exp > { > trampolines_created = 1; > > - warning_at (DECL_SOURCE_LOCATION (t_func), OPT_Wtrampolines, > - "trampoline generated for nested function %qD", t_func); > + if (targetm.calls.custom_function_descriptors != 0) > + warning_at (DECL_SOURCE_LOCATION (t_func), OPT_Wtrampolines, > + "trampoline generated for nested function %qD", t_func); > } > > return const0_rtx; > @@ -4644,6 +4645,57 @@ expand_builtin_adjust_trampoline (tree e > return tramp; > } > > +/* Expand a call to the builtin descriptor initialization routine. > + A descriptor is made up of a couple of pointers to the static > + chain and the code entry in this order. 
*/ > + > +static rtx > +expand_builtin_init_descriptor (tree exp) > +{ > + tree t_descr, t_func, t_chain; > + rtx m_descr, r_descr, r_func, r_chain; > + > + if (!validate_arglist (exp, POINTER_TYPE, POINTER_TYPE, POINTER_TYPE, > + VOID_TYPE)) > + return NULL_RTX; > + > + t_descr = CALL_EXPR_ARG (exp, 0); > + t_func = CALL_EXPR_ARG (exp, 1); > + t_chain = CALL_EXPR_ARG (exp, 2); > + > + r_descr = expand_normal (t_descr); > + m_descr = gen_rtx_MEM (BLKmode, r_descr); > + MEM_NOTRAP_P (m_descr) = 1; > + > + r_func = expand_normal (t_func); > + r_chain = expand_normal (t_chain); > + > + /* Generate insns to initialize the descriptor. */ > + emit_move_insn (adjust_address_nv (m_descr, Pmode, 0), r_chain); > + emit_move_insn (adjust_address_nv (m_descr, Pmode, UNITS_PER_WORD), > r_func); > + > + return const0_rtx; > +} > + > +/* Expand a call to the builtin descriptor adjustment routine. */ > + > +static rtx > +expand_builtin_adjust_descriptor (tree exp) > +{ > + rtx tramp; > + > + if (!validate_arglist (exp, POINTER_TYPE, VOID_TYPE)) > + return NULL_RTX; > + > + tramp = expand_normal (CALL_EXPR_ARG (exp, 0)); > + > + /* Unalign the descriptor to allow runtime identification. */ > + tramp > + = plus_constant (Pmode, tramp, > targetm.calls.custom_function_descriptors); > + > + return force_operand (tramp, NULL_RTX); > +} > + > /* Expand the call EXP to the built-in signbit, signbitf or signbitl > function. The function first checks whether the back end provides > an insn to implement signbit for the respective mode. 
If not, it > @@ -6221,6 +6273,11 @@ expand_builtin (tree exp, rtx target, rt > case BUILT_IN_ADJUST_TRAMPOLINE: > return expand_builtin_adjust_trampoline (exp); > > + case BUILT_IN_INIT_DESCRIPTOR: > + return expand_builtin_init_descriptor (exp); > + case BUILT_IN_ADJUST_DESCRIPTOR: > + return expand_builtin_adjust_descriptor (exp); > + > case BUILT_IN_FORK: > case BUILT_IN_EXECL: > case BUILT_IN_EXECV: > Index: builtins.def > =================================================================== > --- builtins.def (revision 237789) > +++ builtins.def (working copy) > @@ -856,6 +856,8 @@ DEF_C99_BUILTIN (BUILT_IN__EXIT2, > DEF_BUILTIN_STUB (BUILT_IN_INIT_TRAMPOLINE, "__builtin_init_trampoline") > DEF_BUILTIN_STUB (BUILT_IN_INIT_HEAP_TRAMPOLINE, > "__builtin_init_heap_trampoline") > DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, "__builtin_adjust_trampoline") > +DEF_BUILTIN_STUB (BUILT_IN_INIT_DESCRIPTOR, "__builtin_init_descriptor") > +DEF_BUILTIN_STUB (BUILT_IN_ADJUST_DESCRIPTOR, "__builtin_adjust_descriptor") > DEF_BUILTIN_STUB (BUILT_IN_NONLOCAL_GOTO, "__builtin_nonlocal_goto") > > /* Implementing __builtin_setjmp. */ > Index: calls.c > =================================================================== > --- calls.c (revision 237789) > +++ calls.c (working copy) > @@ -183,18 +183,73 @@ static void restore_fixed_argument_area > > rtx > prepare_call_address (tree fndecl_or_type, rtx funexp, rtx > static_chain_value, > - rtx *call_fusage, int reg_parm_seen, int sibcallp) > + rtx *call_fusage, int reg_parm_seen, int flags) > { > /* Make a valid memory address and copy constants through pseudo-regs, > but not for a constant address if -fno-function-cse. */ > if (GET_CODE (funexp) != SYMBOL_REF) > - /* If we are using registers for parameters, force the > - function address into a register now. */ > - funexp = ((reg_parm_seen > - && targetm.small_register_classes_for_mode_p (FUNCTION_MODE)) > - ? 
force_not_mem (memory_address (FUNCTION_MODE, funexp)) > - : memory_address (FUNCTION_MODE, funexp)); > - else if (! sibcallp) > + { > + /* If it's an indirect call by descriptor, generate code to perform > + runtime identification of the pointer and load the descriptor. */ > + if ((flags & ECF_BY_DESCRIPTOR) && !flag_trampolines) > + { > + const int bit_val = targetm.calls.custom_function_descriptors; > + rtx call_lab = gen_label_rtx (); > + > + gcc_assert (fndecl_or_type && TYPE_P (fndecl_or_type)); > + fndecl_or_type > + = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL, NULL_TREE, > + fndecl_or_type); > + DECL_STATIC_CHAIN (fndecl_or_type) = 1; > + rtx chain = targetm.calls.static_chain (fndecl_or_type, false); > + > + /* Avoid long live ranges around function calls. */ > + funexp = copy_to_mode_reg (Pmode, funexp); > + > + if (REG_P (chain)) > + emit_insn (gen_rtx_CLOBBER (VOIDmode, chain)); > + > + /* Emit the runtime identification pattern. */ > + rtx mask = gen_rtx_AND (Pmode, funexp, GEN_INT (bit_val)); > + emit_cmp_and_jump_insns (mask, const0_rtx, EQ, NULL_RTX, Pmode, 1, > + call_lab); > + > + /* Statically predict the branch to very likely taken. */ > + rtx_insn *insn = get_last_insn (); > + if (JUMP_P (insn)) > + predict_insn_def (insn, PRED_BUILTIN_EXPECT, TAKEN); > + > + /* Load the descriptor. */ > + rtx mem = gen_rtx_MEM (Pmode, > + plus_constant (Pmode, funexp, - bit_val)); > + MEM_NOTRAP_P (mem) = 1; > + emit_move_insn (chain, mem); > + mem = gen_rtx_MEM (Pmode, > + plus_constant (Pmode, funexp, > + UNITS_PER_WORD - bit_val)); > + MEM_NOTRAP_P (mem) = 1; > + emit_move_insn (funexp, mem); > + > + emit_label (call_lab); > + > + if (REG_P (chain)) > + { > + use_reg (call_fusage, chain); > + STATIC_CHAIN_REG_P (chain) = 1; > + } > + > + /* Make sure we're not going to be overwritten below. */ > + gcc_assert (!static_chain_value); > + } > + > + /* If we are using registers for parameters, force the > + function address into a register now. 
*/ > + funexp = ((reg_parm_seen > + && targetm.small_register_classes_for_mode_p (FUNCTION_MODE)) > + ? force_not_mem (memory_address (FUNCTION_MODE, funexp)) > + : memory_address (FUNCTION_MODE, funexp)); > + } > + else if (!(flags & ECF_SIBCALL)) > { > if (!NO_FUNCTION_CSE && optimize && ! flag_no_function_cse) > funexp = force_reg (Pmode, funexp); > @@ -211,7 +266,10 @@ prepare_call_address (tree fndecl_or_typ > > emit_move_insn (chain, static_chain_value); > if (REG_P (chain)) > - use_reg (call_fusage, chain); > + { > + use_reg (call_fusage, chain); > + STATIC_CHAIN_REG_P (chain) = 1; > + } > } > > return funexp; > @@ -792,11 +850,13 @@ call_expr_flags (const_tree t) > flags = internal_fn_flags (CALL_EXPR_IFN (t)); > else > { > - t = TREE_TYPE (CALL_EXPR_FN (t)); > - if (t && TREE_CODE (t) == POINTER_TYPE) > - flags = flags_from_decl_or_type (TREE_TYPE (t)); > + tree type = TREE_TYPE (CALL_EXPR_FN (t)); > + if (type && TREE_CODE (type) == POINTER_TYPE) > + flags = flags_from_decl_or_type (TREE_TYPE (type)); > else > flags = 0; > + if (CALL_EXPR_BY_DESCRIPTOR (t)) > + flags |= ECF_BY_DESCRIPTOR; > } > > return flags; > @@ -2633,6 +2693,8 @@ expand_call (tree exp, rtx target, int i > { > fntype = TREE_TYPE (TREE_TYPE (addr)); > flags |= flags_from_decl_or_type (fntype); > + if (CALL_EXPR_BY_DESCRIPTOR (exp)) > + flags |= ECF_BY_DESCRIPTOR; > } > rettype = TREE_TYPE (exp); > > @@ -3344,6 +3406,13 @@ expand_call (tree exp, rtx target, int i > if (STRICT_ALIGNMENT) > store_unaligned_arguments_into_pseudos (args, num_actuals); > > + /* Prepare the address of the call. This must be done before any > + register parameters is loaded for find_first_parameter_load to > + work properly in the presence of descriptors. */ > + funexp = prepare_call_address (fndecl ? fndecl : fntype, funexp, > + static_chain_value, &call_fusage, > + reg_parm_seen, flags); > + > /* Now store any partially-in-registers parm. > This is the last place a block-move can happen. 
*/ > if (reg_parm_seen) > @@ -3454,10 +3523,6 @@ expand_call (tree exp, rtx target, int i > } > > after_args = get_last_insn (); > - funexp = prepare_call_address (fndecl ? fndecl : fntype, funexp, > - static_chain_value, &call_fusage, > - reg_parm_seen, pass == 0); > - > load_register_parameters (args, num_actuals, &call_fusage, flags, > pass == 0, &sibcall_failure); > > Index: cfgexpand.c > =================================================================== > --- cfgexpand.c (revision 237789) > +++ cfgexpand.c (working copy) > @@ -2636,6 +2636,7 @@ expand_call_stmt (gcall *stmt) > else > CALL_FROM_THUNK_P (exp) = gimple_call_from_thunk_p (stmt); > CALL_EXPR_VA_ARG_PACK (exp) = gimple_call_va_arg_pack_p (stmt); > + CALL_EXPR_BY_DESCRIPTOR (exp) = gimple_call_by_descriptor_p (stmt); > SET_EXPR_LOCATION (exp, gimple_location (stmt)); > CALL_WITH_BOUNDS_P (exp) = gimple_call_with_bounds_p (stmt); > > Index: common.opt > =================================================================== > --- common.opt (revision 237789) > +++ common.opt (working copy) > @@ -2303,6 +2303,10 @@ ftracer > Common Report Var(flag_tracer) Optimization > Perform superblock formation via tail duplication. > > +ftrampolines > +Common Report Var(flag_trampolines) Init(0) > +Always generate trampolines for pointers to nested functions > + > ; Zero means that floating-point math operations cannot generate a > ; (user-visible) trap. This is the case, for example, in nonstop > ; IEEE 754 arithmetic. > Index: config/aarch64/aarch64.h > =================================================================== > --- config/aarch64/aarch64.h (revision 237789) > +++ config/aarch64/aarch64.h (working copy) > @@ -779,6 +779,9 @@ typedef struct > correctly. */ > #define TRAMPOLINE_SECTION text_section > > +/* Use custom descriptors instead of trampolines when possible. */ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1 > + > /* To start with. 
*/ > #define BRANCH_COST(SPEED_P, PREDICTABLE_P) \ > (aarch64_branch_cost (SPEED_P, PREDICTABLE_P)) > Index: config/alpha/alpha.h > =================================================================== > --- config/alpha/alpha.h (revision 237789) > +++ config/alpha/alpha.h (working copy) > @@ -996,3 +996,6 @@ extern long alpha_auto_offset; > #define NO_IMPLICIT_EXTERN_C > > #define TARGET_SUPPORTS_WIDE_INT 1 > + > +/* Use custom descriptors instead of trampolines when possible if not VMS. > */ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS (TARGET_ABI_OPEN_VMS ? 0 : 1) > Index: config/arm/arm.c > =================================================================== > --- config/arm/arm.c (revision 237789) > +++ config/arm/arm.c (working copy) > @@ -6781,6 +6781,29 @@ arm_function_ok_for_sibcall (tree decl, > && DECL_WEAK (decl)) > return false; > > + /* We cannot do a tailcall for an indirect call by descriptor if all the > + argument registers are used because the only register left to load the > + address is IP and it will already contain the static chain. */ > + if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines) > + { > + tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp))); > + CUMULATIVE_ARGS cum; > + cumulative_args_t cum_v; > + > + arm_init_cumulative_args (&cum, fntype, NULL_RTX, NULL_TREE); > + cum_v = pack_cumulative_args (&cum); > + > + for (tree t = TYPE_ARG_TYPES (fntype); t; t = TREE_CHAIN (t)) > + { > + tree type = TREE_VALUE (t); > + if (!VOID_TYPE_P (type)) > + arm_function_arg_advance (cum_v, TYPE_MODE (type), type, true); > + } > + > + if (!arm_function_arg (cum_v, SImode, integer_type_node, true)) > + return false; > + } > + > /* Everything else is ok. 
*/ > return true; > } > Index: config/arm/arm.h > =================================================================== > --- config/arm/arm.h (revision 237789) > +++ config/arm/arm.h (working copy) > @@ -1632,6 +1632,10 @@ typedef struct > > /* Alignment required for a trampoline in bits. */ > #define TRAMPOLINE_ALIGNMENT 32 > + > +/* Use custom descriptors instead of trampolines when possible, but > + we cannot use bit #0 because it is the ARM/Thumb selection bit. */ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 2 > > /* Addressing modes, and classification of registers for them. */ > #define HAVE_POST_INCREMENT 1 > Index: config/i386/i386.h > =================================================================== > --- config/i386/i386.h (revision 237789) > +++ config/i386/i386.h (working copy) > @@ -2660,6 +2660,9 @@ extern void debug_dispatch_window (int); > > #define TARGET_SUPPORTS_WIDE_INT 1 > > +/* Use custom descriptors instead of trampolines when possible. */ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1 > + > /* > Local variables: > version-control: t > Index: config/ia64/ia64.h > =================================================================== > --- config/ia64/ia64.h (revision 237789) > +++ config/ia64/ia64.h (working copy) > @@ -1714,4 +1714,7 @@ struct GTY(()) machine_function > /* Switch on code for querying unit reservations. */ > #define CPU_UNITS_QUERY 1 > > +/* IA-64 already uses descriptors for its standard calling sequence. */ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 0 > + > /* End of ia64.h */ > Index: config/mips/mips.h > =================================================================== > --- config/mips/mips.h (revision 237789) > +++ config/mips/mips.h (working copy) > @@ -3413,3 +3413,6 @@ struct GTY(()) machine_function { > #define ENABLE_LD_ST_PAIRS \ > (TARGET_LOAD_STORE_PAIRS && (TUNE_P5600 || TUNE_I6400) \ > && !TARGET_MICROMIPS && !TARGET_FIX_24K) > + > +/* Use custom descriptors instead of trampolines when possible. 
*/ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1 > Index: config/pa/pa.h > =================================================================== > --- config/pa/pa.h (revision 237789) > +++ config/pa/pa.h (working copy) > @@ -1313,3 +1313,6 @@ do { > \ > seven and four instructions, respectively. */ > #define MAX_PCREL17F_OFFSET \ > (flag_pic ? (TARGET_HPUX ? 198164 : 221312) : 240000) > + > +/* HP-PA already uses descriptors for its standard calling sequence. */ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 0 > Index: config/rs6000/rs6000.h > =================================================================== > --- config/rs6000/rs6000.h (revision 237789) > +++ config/rs6000/rs6000.h (working copy) > @@ -2894,3 +2894,6 @@ extern GTY(()) tree rs6000_builtin_types > extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT]; > > #define TARGET_SUPPORTS_WIDE_INT 1 > + > +/* Use custom descriptors instead of trampolines when possible if not AIX. > */ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS (DEFAULT_ABI == ABI_AIX ? 0 : 1) > Index: config/sparc/sparc.h > =================================================================== > --- config/sparc/sparc.h (revision 237789) > +++ config/sparc/sparc.h (working copy) > @@ -1817,3 +1817,6 @@ extern int sparc_indent_opcode; > #define SPARC_LOW_FE_EXCEPT_VALUES 0 > > #define TARGET_SUPPORTS_WIDE_INT 1 > + > +/* Use custom descriptors instead of trampolines when possible. */ > +#define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 1 > Index: defaults.h > =================================================================== > --- defaults.h (revision 237789) > +++ defaults.h (working copy) > @@ -1080,9 +1080,18 @@ see the files COPYING3 and COPYING.RUNTI > #define CASE_VECTOR_PC_RELATIVE 0 > #endif > > +/* Force minimum alignment to be able to use the least significant bits > + for distinguishing descriptor addresses from code addresses. 
*/ > +#define DEFAULT_FUNCTION_ALIGNMENT \ > + (lang_hooks.custom_function_descriptors \ > + && targetm.calls.custom_function_descriptors > 0 \ > + ? MAX (FUNCTION_BOUNDARY, \ > + 2 * targetm.calls.custom_function_descriptors * BITS_PER_UNIT)\ > + : FUNCTION_BOUNDARY) > + > /* Assume that trampolines need function alignment. */ > #ifndef TRAMPOLINE_ALIGNMENT > -#define TRAMPOLINE_ALIGNMENT FUNCTION_BOUNDARY > +#define TRAMPOLINE_ALIGNMENT DEFAULT_FUNCTION_ALIGNMENT > #endif > > /* Register mappings for target machines without register windows. */ > Index: doc/invoke.texi > =================================================================== > --- doc/invoke.texi (revision 237789) > +++ doc/invoke.texi (working copy) > @@ -498,7 +498,7 @@ Objective-C and Objective-C++ Dialects}. > -fverbose-asm -fpack-struct[=@var{n}] @gol > -fleading-underscore -ftls-model=@var{model} @gol > -fstack-reuse=@var{reuse_level} @gol > --ftrapv -fwrapv @gol > +-ftrampolines -ftrapv -fwrapv @gol > -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} @gol > -fstrict-volatile-bitfields -fsync-libcalls} > > @@ -11546,6 +11546,31 @@ unit, or if @option{-fpic} is not given > The default without @option{-fpic} is @samp{initial-exec}; with > @option{-fpic} the default is @samp{global-dynamic}. > > +@item -ftrampolines > +@opindex ftrampolines > +Always generate trampolines for pointers to nested functions. > + > +A trampoline is a small piece of data or code that is created at run > +time on the stack when the address of a nested function is taken, and > +is used to call the nested function indirectly. For some targets, it > +is made up of data only and thus requires no special treatment. But, > +for most targets, it is made up of code and thus requires the stack > +to be made executable in order for the program to work properly. 
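[Editorial note: the lowering that the documentation above alludes to can be
pictured in portable C.  The names are invented for illustration; the real
transformation is done on GIMPLE by tree-nested.c.  A nested function
closing over `k` becomes a top-level function taking a hidden static chain,
and with -fno-trampolines the object built on the stack is pure data.]

```c
/* Simplified picture of what the nested-function lowering produces:
   the nested function becomes a top-level function receiving the
   enclosing frame through a hidden "static chain" parameter.  */
struct frame { int k; };

static int
add_k_lowered (struct frame *chain, int x)
{
  return x + chain->k;
}

/* With descriptors, the pointer taken on the nested function refers to
   this data-only object built on the stack, so the stack need not be
   executable.  A trampoline would instead be machine code written to
   the same stack slot.  */
struct descriptor
{
  void *chain;
  int (*code) (struct frame *, int);
};

static int
call_through_descriptor (const struct descriptor *d, int x)
{
  return d->code ((struct frame *) d->chain, x);
}

static int
take_address_and_call (int k, int x)
{
  struct frame f = { k };
  struct descriptor d = { &f, add_k_lowered };  /* built at run time */
  return call_through_descriptor (&d, x);
}
```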
> + > +@option{-fno-trampolines} is enabled by default to let the compiler avoid > +generating them if it computes that this is safe, on a case by case basis, > +and replace them with descriptors. Descriptors are always made up of data > +only, but the generated code must be prepared to deal with them. > + > +This option has no effects for any other languages than Ada as of this > +writing. Moreover, code compiled with @option{-ftrampolines} and code > +compiled with @option{-fno-trampolines} are not binary compatible if > +nested functions are present. This option must therefore be used on > +a program-wide basis and be manipulated with extreme care. > + > +This option has no effects for targets whose trampolines are made up of > +data only, for example IA-64 targets, AIX or VMS platforms. > + > @item -fvisibility=@r{[}default@r{|}internal@r{|}hidden@r{|}protected@r{]} > @opindex fvisibility > Set the default ELF image symbol visibility to the specified option---all > Index: doc/tm.texi > =================================================================== > --- doc/tm.texi (revision 237789) > +++ doc/tm.texi (working copy) > @@ -5181,6 +5181,25 @@ be returned; otherwise @var{addr} should > If this hook is not defined, @var{addr} will be used for function calls. > @end deftypefn > > +@deftypevr {Target Hook} int TARGET_CUSTOM_FUNCTION_DESCRIPTORS > +This hook should be defined to a power of 2 if the target will benefit > +from the use of custom descriptors for nested functions instead of the > +standard trampolines. Such descriptors are created at run time on the > +stack and made up of data only, but they are non-standard so the generated > +code must be prepared to deal with them. This hook should be defined to 0 > +if the target uses function descriptors for its standard calling sequence, > +like for example HP-PA or IA-64. 
> Using descriptors for nested functions
> +eliminates the need for trampolines that reside on the stack and require
> +it to be made executable.
> +
> +The value of the macro is used to parameterize the run-time identification
> +scheme implemented to distinguish descriptors from function addresses: it
> +gives the number of bytes by which their address is shifted in comparison
> +with function addresses.  The value of 1 will generally work, unless it is
> +already used by the target for a similar purpose, like for example on ARM
> +where it is used to distinguish Thumb functions from ARM ones.
> +@end deftypevr
> +
> Implementing trampolines is difficult on many machines because they have
> separate instruction and data caches.  Writing into a stack location
> fails to clear the memory in the instruction cache, so when the program
> Index: doc/tm.texi.in
> ===================================================================
> --- doc/tm.texi.in (revision 237789)
> +++ doc/tm.texi.in (working copy)
> @@ -3947,6 +3947,8 @@ is used for aligning trampolines.
>
> @hook TARGET_TRAMPOLINE_ADJUST_ADDRESS
>
> +@hook TARGET_CUSTOM_FUNCTION_DESCRIPTORS
> +
> Implementing trampolines is difficult on many machines because they have
> separate instruction and data caches.  Writing into a stack location
> fails to clear the memory in the instruction cache, so when the program
> Index: gimple.c
> ===================================================================
> --- gimple.c (revision 237789)
> +++ gimple.c (working copy)
> @@ -373,6 +373,7 @@ gimple_build_call_from_tree (tree t)
> gimple_call_set_from_thunk (call, CALL_FROM_THUNK_P (t));
> gimple_call_set_va_arg_pack (call, CALL_EXPR_VA_ARG_PACK (t));
> gimple_call_set_nothrow (call, TREE_NOTHROW (t));
> + gimple_call_set_by_descriptor (call, CALL_EXPR_BY_DESCRIPTOR (t));
> gimple_set_no_warning (call, TREE_NO_WARNING (t));
> gimple_call_set_with_bounds (call, CALL_WITH_BOUNDS_P (t));
>
> @@ -1386,6 +1387,9 @@ gimple_call_flags (const gimple *stmt)
> if (stmt->subcode & GF_CALL_NOTHROW)
> flags |= ECF_NOTHROW;
>
> + if (stmt->subcode & GF_CALL_BY_DESCRIPTOR)
> + flags |= ECF_BY_DESCRIPTOR;
> +
> return flags;
> }
>
> Index: gimple.h
> ===================================================================
> --- gimple.h (revision 237789)
> +++ gimple.h (working copy)
> @@ -146,6 +146,7 @@ enum gf_mask {
> GF_CALL_CTRL_ALTERING = 1 << 7,
> GF_CALL_WITH_BOUNDS = 1 << 8,
> GF_CALL_MUST_TAIL_CALL = 1 << 9,
> + GF_CALL_BY_DESCRIPTOR = 1 << 10,
> GF_OMP_PARALLEL_COMBINED = 1 << 0,
> GF_OMP_PARALLEL_GRID_PHONY = 1 << 1,
> GF_OMP_TASK_TASKLOOP = 1 << 0,
> @@ -3357,6 +3358,26 @@ gimple_call_alloca_for_var_p (gcall *s)
> return (s->subcode & GF_CALL_ALLOCA_FOR_VAR) != 0;
> }
>
> +/* If BY_DESCRIPTOR_P is true, GIMPLE_CALL S is an indirect call for which
> + pointers to nested function are descriptors instead of trampolines. */
> +
> +static inline void
> +gimple_call_set_by_descriptor (gcall *s, bool by_descriptor_p)
> +{
> + if (by_descriptor_p)
> + s->subcode |= GF_CALL_BY_DESCRIPTOR;
> + else
> + s->subcode &= ~GF_CALL_BY_DESCRIPTOR;
> +}
> +
> +/* Return true if S is a by-descriptor call.  */
> +
> +static inline bool
> +gimple_call_by_descriptor_p (gcall *s)
> +{
> + return (s->subcode & GF_CALL_BY_DESCRIPTOR) != 0;
> +}
> +
> /* Copy all the GF_CALL_* flags from ORIG_CALL to DEST_CALL. */
>
> static inline void
> Index: langhooks-def.h
> ===================================================================
> --- langhooks-def.h (revision 237789)
> +++ langhooks-def.h (working copy)
> @@ -120,6 +120,7 @@ extern bool lhd_omp_mappable_type (tree)
> #define LANG_HOOKS_BLOCK_MAY_FALLTHRU hook_bool_const_tree_true
> #define LANG_HOOKS_EH_USE_CXA_END_CLEANUP false
> #define LANG_HOOKS_DEEP_UNSHARING false
> +#define LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS false
>
> /* Attribute hooks. */
> #define LANG_HOOKS_ATTRIBUTE_TABLE NULL
> @@ -319,7 +320,8 @@ extern void lhd_end_section (void);
> LANG_HOOKS_EH_PROTECT_CLEANUP_ACTIONS, \
> LANG_HOOKS_BLOCK_MAY_FALLTHRU, \
> LANG_HOOKS_EH_USE_CXA_END_CLEANUP, \
> - LANG_HOOKS_DEEP_UNSHARING \
> + LANG_HOOKS_DEEP_UNSHARING, \
> + LANG_HOOKS_CUSTOM_FUNCTION_DESCRIPTORS \
> }
>
> #endif /* GCC_LANG_HOOKS_DEF_H */
> Index: langhooks.h
> ===================================================================
> --- langhooks.h (revision 237789)
> +++ langhooks.h (working copy)
> @@ -505,6 +505,10 @@ struct lang_hooks
> gimplification. */
> bool deep_unsharing;
>
> + /* True if this language may use custom descriptors for nested functions
> + instead of trampolines. */
> + bool custom_function_descriptors;
> +
> /* Whenever you add entries here, make sure you adjust langhooks-def.h
> and langhooks.c accordingly. */
> };
> Index: rtl.h
> ===================================================================
> --- rtl.h (revision 237789)
> +++ rtl.h (working copy)
> @@ -317,6 +317,7 @@ struct GTY((desc("0"), tag("0"),
> 1 in a CONCAT is VAL_EXPR_IS_COPIED in var-tracking.c.
> 1 in a VALUE is SP_BASED_VALUE_P in cselib.c.
> 1 in a SUBREG generated by LRA for reload insns.
> + 1 in a REG if this is a static chain register.
> 1 in a CALL for calls instrumented by Pointer Bounds Checker. */
> unsigned int jump : 1;
> /* In a CODE_LABEL, part of the two-bit alternate entry field.
> @@ -2264,6 +2265,10 @@ do { \
> : (SIGN) == SRP_SIGNED ? SUBREG_PROMOTED_SIGNED_P (RTX) \
> : SUBREG_PROMOTED_UNSIGNED_P (RTX))
>
> +/* True if the REG is the static chain register for some CALL_INSN. */
> +#define STATIC_CHAIN_REG_P(RTX) \
> + (RTL_FLAG_CHECK1 ("STATIC_CHAIN_REG_P", (RTX), REG)->jump)
> +
> /* True if the subreg was generated by LRA for reload insns. Such
> subregs are valid only during LRA. */
> #define LRA_SUBREG_P(RTX) \
> Index: rtlanal.c
> ===================================================================
> --- rtlanal.c (revision 237789)
> +++ rtlanal.c (working copy)
> @@ -3914,7 +3914,8 @@ find_first_parameter_load (rtx_insn *cal
> parm.nregs = 0;
> for (p = CALL_INSN_FUNCTION_USAGE (call_insn); p; p = XEXP (p, 1))
> if (GET_CODE (XEXP (p, 0)) == USE
> - && REG_P (XEXP (XEXP (p, 0), 0)))
> + && REG_P (XEXP (XEXP (p, 0), 0))
> + && !STATIC_CHAIN_REG_P (XEXP (XEXP (p, 0), 0)))
> {
> gcc_assert (REGNO (XEXP (XEXP (p, 0), 0)) < FIRST_PSEUDO_REGISTER);
>
> Index: target.def
> ===================================================================
> --- target.def (revision 237789)
> +++ target.def (working copy)
> @@ -4723,6 +4723,26 @@ be returned; otherwise @var{addr} should
> If this hook is not defined, @var{addr} will be used for function calls.",
> rtx, (rtx addr), NULL)
>
> +DEFHOOKPOD
> +(custom_function_descriptors,
> + "This hook should be defined to a power of 2 if the target will benefit\n\
> +from the use of custom descriptors for nested functions instead of the\n\
> +standard trampolines.  Such descriptors are created at run time on the\n\
> +stack and made up of data only, but they are non-standard so the generated\n\
> +code must be prepared to deal with them.  This hook should be defined to 0\n\
> +if the target uses function descriptors for its standard calling sequence,\n\
> +like for example HP-PA or IA-64.  Using descriptors for nested functions\n\
> +eliminates the need for trampolines that reside on the stack and require\n\
> +it to be made executable.\n\
> +\n\
> +The value of the macro is used to parameterize the run-time identification\n\
> +scheme implemented to distinguish descriptors from function addresses: it\n\
> +gives the number of bytes by which their address is shifted in comparison\n\
> +with function addresses.  The value of 1 will generally work, unless it is\n\
> +already used by the target for a similar purpose, like for example on ARM\n\
> +where it is used to distinguish Thumb functions from ARM ones.",
> + int, -1)
> +
> /* Return the number of bytes of its own arguments that a function
> pops on returning, or 0 if the function pops no arguments and the
> caller must therefore pop them all after the function returns. */
> Index: testsuite/gnat.dg/trampoline3.adb
> ===================================================================
> --- testsuite/gnat.dg/trampoline3.adb (revision 0)
> +++ testsuite/gnat.dg/trampoline3.adb (working copy)
> @@ -0,0 +1,22 @@
> +-- { dg-do compile { target *-*-linux* } }
> +-- { dg-options "-gnatws" }
> +
> +procedure Trampoline3 is
> +
> + A : Integer;
> +
> + type FuncPtr is access function (I : Integer) return Integer;
> +
> + function F (I : Integer) return Integer is
> + begin
> + return A + I;
> + end F;
> +
> + P : FuncPtr := F'Access;
> + I : Integer;
> +
> +begin
> + I := P(0);
> +end;
> +
> +-- { dg-final { scan-assembler-not "GNU-stack.*x" } }
> Index: testsuite/gnat.dg/trampoline4.adb
> ===================================================================
> --- testsuite/gnat.dg/trampoline4.adb (revision 0)
> +++ testsuite/gnat.dg/trampoline4.adb (working copy)
> @@ -0,0 +1,22 @@
> +-- { dg-do compile { target *-*-linux* } }
> +-- { dg-options "-ftrampolines -gnatws" }
> +
> +procedure Trampoline4 is
> +
> + A : Integer;
> +
> + type FuncPtr is access function (I : Integer) return Integer;
> +
> + function F (I : Integer) return Integer is
> + begin
> + return A + I;
> + end F;
> +
> + P : FuncPtr := F'Access;
> + I : Integer;
> +
> +begin
> + I := P(0);
> +end;
> +
> +-- { dg-final { scan-assembler "GNU-stack.*x" } }
> Index: tree-core.h
> ===================================================================
> --- tree-core.h (revision 237789)
> +++ tree-core.h (working copy)
> @@ -90,6 +90,9 @@ struct die_struct;
> /* Nonzero if this call is into the transaction runtime library. */
> #define ECF_TM_BUILTIN (1 << 13)
>
> +/* Nonzero if this is an indirect call by descriptor. */
> +#define ECF_BY_DESCRIPTOR (1 << 14)
> +
> /* Call argument flags. */
> /* Nonzero if the argument is not dereferenced recursively, thus only
> directly reachable memory is read or written. */
> @@ -1216,6 +1219,12 @@ struct GTY(()) tree_base {
>
> REF_REVERSE_STORAGE_ORDER in
> BIT_FIELD_REF, MEM_REF
> +
> + FUNC_ADDR_BY_DESCRIPTOR in
> + ADDR_EXPR
> +
> + CALL_EXPR_BY_DESCRIPTOR in
> + CALL_EXPR
> */
>
> struct GTY(()) tree_typed {
> Index: tree-nested.c
> ===================================================================
> --- tree-nested.c (revision 237789)
> +++ tree-nested.c (working copy)
> @@ -21,6 +21,7 @@
> #include "system.h"
> #include "coretypes.h"
> #include "backend.h"
> +#include "target.h"
> #include "rtl.h"
> #include "tree.h"
> #include "gimple.h"
> @@ -103,6 +104,7 @@ struct nesting_info
>
> bool any_parm_remapped;
> bool any_tramp_created;
> + bool any_descr_created;
> char static_chain_added;
> };
>
> @@ -486,12 +488,40 @@ get_trampoline_type (struct nesting_info
> return trampoline_type;
> }
>
> -/* Given DECL, a nested function, find or create a field in the non-local
> - frame structure for a trampoline for this function. */
> +/* Build or return the type used to represent a nested function descriptor.  */
> +
> +static GTY(()) tree descriptor_type;
>
> static tree
> -lookup_tramp_for_decl (struct nesting_info *info, tree decl,
> - enum insert_option insert)
> +get_descriptor_type (struct nesting_info *info)
> +{
> + tree t;
> +
> + if (descriptor_type)
> + return descriptor_type;
> +
> + t = build_index_type (build_int_cst (NULL_TREE, 2 * UNITS_PER_WORD - 1));
> + t = build_array_type (char_type_node, t);
> + t = build_decl (DECL_SOURCE_LOCATION (info->context),
> + FIELD_DECL, get_identifier ("__data"), t);
> + SET_DECL_ALIGN (t, BITS_PER_WORD);
> + DECL_USER_ALIGN (t) = 1;
> +
> + descriptor_type = make_node (RECORD_TYPE);
> + TYPE_NAME (descriptor_type) = get_identifier ("__builtin_descriptor");
> + TYPE_FIELDS (descriptor_type) = t;
> + layout_type (descriptor_type);
> + DECL_CONTEXT (t) = descriptor_type;
> +
> + return descriptor_type;
> +}
> +
> +/* Given DECL, a nested function, find or create an element in the
> + var map for this function. */
> +
> +static tree
> +lookup_element_for_decl (struct nesting_info *info, tree decl,
> + enum insert_option insert)
> {
> if (insert == NO_INSERT)
> {
> @@ -501,19 +531,73 @@ lookup_tramp_for_decl (struct nesting_in
>
> tree *slot = &info->var_map->get_or_insert (decl);
> if (!*slot)
> - {
> - tree field = make_node (FIELD_DECL);
> - DECL_NAME (field) = DECL_NAME (decl);
> - TREE_TYPE (field) = get_trampoline_type (info);
> - TREE_ADDRESSABLE (field) = 1;
> + *slot = build_tree_list (NULL_TREE, NULL_TREE);
>
> - insert_field_into_struct (get_frame_type (info), field);
> - *slot = field;
> + return (tree) *slot;
> +}
> +
> +/* Given DECL, a nested function, create a field in the non-local
> + frame structure for this function.  */
> +
> +static tree
> +create_field_for_decl (struct nesting_info *info, tree decl, tree type)
> +{
> + tree field = make_node (FIELD_DECL);
> + DECL_NAME (field) = DECL_NAME (decl);
> + TREE_TYPE (field) = type;
> + TREE_ADDRESSABLE (field) = 1;
> + insert_field_into_struct (get_frame_type (info), field);
> + return field;
> +}
> +
> +/* Given DECL, a nested function, find or create a field in the non-local
> + frame structure for a trampoline for this function. */
> +
> +static tree
> +lookup_tramp_for_decl (struct nesting_info *info, tree decl,
> + enum insert_option insert)
> +{
> + tree elt, field;
> +
> + elt = lookup_element_for_decl (info, decl, insert);
> + if (!elt)
> + return NULL_TREE;
> +
> + field = TREE_PURPOSE (elt);
>
> + if (!field && insert == INSERT)
> + {
> + field = create_field_for_decl (info, decl, get_trampoline_type (info));
> + TREE_PURPOSE (elt) = field;
> info->any_tramp_created = true;
> }
>
> - return *slot;
> + return field;
> +}
> +
> +/* Given DECL, a nested function, find or create a field in the non-local
> + frame structure for a descriptor for this function.  */
> +
> +static tree
> +lookup_descr_for_decl (struct nesting_info *info, tree decl,
> + enum insert_option insert)
> +{
> + tree elt, field;
> +
> + elt = lookup_element_for_decl (info, decl, insert);
> + if (!elt)
> + return NULL_TREE;
> +
> + field = TREE_VALUE (elt);
> +
> + if (!field && insert == INSERT)
> + {
> + field = create_field_for_decl (info, decl, get_descriptor_type (info));
> + TREE_VALUE (elt) = field;
> + info->any_descr_created = true;
> + }
> +
> + return field;
> }
>
> /* Build or return the field within the non-local frame state that holds
> @@ -2303,6 +2387,7 @@ convert_tramp_reference_op (tree *tp, in
> struct walk_stmt_info *wi = (struct walk_stmt_info *) data;
> struct nesting_info *const info = (struct nesting_info *) wi->info, *i;
> tree t = *tp, decl, target_context, x, builtin;
> + bool descr;
> gcall *call;
>
> *walk_subtrees = 0;
> @@ -2337,7 +2422,14 @@ convert_tramp_reference_op (tree *tp, in
> we need to insert the trampoline. */
> for (i = info; i->context != target_context; i = i->outer)
> continue;
> - x = lookup_tramp_for_decl (i, decl, INSERT);
> +
> + /* Decide whether to generate a descriptor or a trampoline. */
> + descr = FUNC_ADDR_BY_DESCRIPTOR (t) && !flag_trampolines;
> +
> + if (descr)
> + x = lookup_descr_for_decl (i, decl, INSERT);
> + else
> + x = lookup_tramp_for_decl (i, decl, INSERT);
>
> /* Compute the address of the field holding the trampoline. */
> x = get_frame_field (info, target_context, x, &wi->gsi);
>
> /* Do machine-specific ugliness. Normally this will involve
> computing extra alignment, but it can really be anything.  */
> - builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
> + if (descr)
> + builtin = builtin_decl_implicit (BUILT_IN_ADJUST_DESCRIPTOR);
> + else
> + builtin = builtin_decl_implicit (BUILT_IN_ADJUST_TRAMPOLINE);
> call = gimple_build_call (builtin, 1, x);
> x = init_tmp_var_with_call (info, &wi->gsi, call);
>
> @@ -2820,6 +2915,27 @@ fold_mem_refs (tree *const &e, void *dat
> return true;
> }
>
> +/* Given DECL, a nested function, build an initialization call for FIELD,
> + the trampoline or descriptor for DECL, using FUNC as the function. */
> +
> +static gcall *
> +build_init_call_stmt (struct nesting_info *info, tree decl, tree field,
> + tree func)
> +{
> + tree arg1, arg2, arg3, x;
> +
> + gcc_assert (DECL_STATIC_CHAIN (decl));
> + arg3 = build_addr (info->frame_decl);
> +
> + arg2 = build_addr (decl);
> +
> + x = build3 (COMPONENT_REF, TREE_TYPE (field),
> + info->frame_decl, field, NULL_TREE);
> + arg1 = build_addr (x);
> +
> + return gimple_build_call (func, 3, arg1, arg2, arg3);
> +}
> +
> /* Do "everything else" to clean up or complete state collected by the various
> walking passes -- create a field to hold the frame base address, lay out the
> types and decls, generate code to initialize the frame decl, store critical
> @@ -2965,23 +3081,32 @@ finalize_nesting_tree_1 (struct nesting_
> struct nesting_info *i;
> for (i = root->inner; i ; i = i->next)
> {
> - tree arg1, arg2, arg3, x, field;
> + tree field, x;
>
> field = lookup_tramp_for_decl (root, i->context, NO_INSERT);
> if (!field)
> continue;
>
> - gcc_assert (DECL_STATIC_CHAIN (i->context));
> - arg3 = build_addr (root->frame_decl);
> + x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
> + stmt = build_init_call_stmt (root, i->context, field, x);
> + gimple_seq_add_stmt (&stmt_list, stmt);
> + }
> + }
>
> - arg2 = build_addr (i->context);
> + /* If descriptors were created, then we need to initialize them.  */
> + if (root->any_descr_created)
> + {
> + struct nesting_info *i;
> + for (i = root->inner; i ; i = i->next)
> + {
> + tree field, x;
>
> - x = build3 (COMPONENT_REF, TREE_TYPE (field),
> - root->frame_decl, field, NULL_TREE);
> - arg1 = build_addr (x);
> + field = lookup_descr_for_decl (root, i->context, NO_INSERT);
> + if (!field)
> + continue;
>
> - x = builtin_decl_implicit (BUILT_IN_INIT_TRAMPOLINE);
> - stmt = gimple_build_call (x, 3, arg1, arg2, arg3);
> + x = builtin_decl_implicit (BUILT_IN_INIT_DESCRIPTOR);
> + stmt = build_init_call_stmt (root, i->context, field, x);
> gimple_seq_add_stmt (&stmt_list, stmt);
> }
> }
> Index: tree.c
> ===================================================================
> --- tree.c (revision 237789)
> +++ tree.c (working copy)
> @@ -1019,7 +1019,7 @@ make_node_stat (enum tree_code code MEM_
> {
> if (code == FUNCTION_DECL)
> {
> - SET_DECL_ALIGN (t, FUNCTION_BOUNDARY);
> + SET_DECL_ALIGN (t, DEFAULT_FUNCTION_ALIGNMENT);
> DECL_MODE (t) = FUNCTION_MODE;
> }
> else
> @@ -10567,12 +10567,19 @@ build_common_builtin_nodes (void)
> BUILT_IN_INIT_HEAP_TRAMPOLINE,
> "__builtin_init_heap_trampoline",
> ECF_NOTHROW | ECF_LEAF);
> + local_define_builtin ("__builtin_init_descriptor", ftype,
> + BUILT_IN_INIT_DESCRIPTOR,
> + "__builtin_init_descriptor", ECF_NOTHROW | ECF_LEAF);
>
> ftype = build_function_type_list (ptr_type_node, ptr_type_node, NULL_TREE);
> local_define_builtin ("__builtin_adjust_trampoline", ftype,
> BUILT_IN_ADJUST_TRAMPOLINE,
> "__builtin_adjust_trampoline",
> ECF_CONST | ECF_NOTHROW);
> + local_define_builtin ("__builtin_adjust_descriptor", ftype,
> + BUILT_IN_ADJUST_DESCRIPTOR,
> + "__builtin_adjust_descriptor",
> + ECF_CONST | ECF_NOTHROW);
>
> ftype = build_function_type_list (void_type_node,
> ptr_type_node, ptr_type_node, NULL_TREE);
> Index: tree.h
> ===================================================================
> --- tree.h (revision 237789)
> +++ tree.h (working copy)
> @@ -967,6 +967,16 @@ extern void omp_clause_range_check_faile
> #define REF_REVERSE_STORAGE_ORDER(NODE) \
> (TREE_CHECK2 (NODE, BIT_FIELD_REF, MEM_REF)->base.default_def_flag)
>
> + /* In an ADDR_EXPR, indicates that this is a pointer to nested function
> + represented by a descriptor instead of a trampoline. */
> +#define FUNC_ADDR_BY_DESCRIPTOR(NODE) \
> + (TREE_CHECK (NODE, ADDR_EXPR)->base.default_def_flag)
> +
> +/* In a CALL_EXPR, indicates that this is an indirect call for which
> + pointers to nested function are descriptors instead of trampolines. */
> +#define CALL_EXPR_BY_DESCRIPTOR(NODE) \
> + (TREE_CHECK (NODE, CALL_EXPR)->base.default_def_flag)
> +
> /* These flags are available for each language front end to use internally. */
> #define TREE_LANG_FLAG_0(NODE) \
> (TREE_NOT_CHECK2 (NODE, TREE_VEC, SSA_NAME)->base.u.bits.lang_flag_0)
>