On Thu, Aug 18, 2011 at 1:03 AM, Richard Guenther <richard.guent...@gmail.com> wrote:
> On Wed, Aug 17, 2011 at 7:54 PM, Sriraman Tallam <tmsri...@google.com> wrote:
>> On Wed, Aug 17, 2011 at 12:37 AM, Richard Guenther
>> <richard.guent...@gmail.com> wrote:
>>> On Tue, Aug 16, 2011 at 10:50 PM, Sriraman Tallam <tmsri...@google.com>
>>> wrote:
>>>> Support for getting CPU type and feature information at run-time.
>>>>
>>>> The following patch provides support for finding the platform type at
>>>> run-time, like cpu type and features supported. The multi-versioning
>>>> framework will use the builtins added to dispatch the right function
>>>> version. Please refer to http://gcc.gnu.org/ml/gcc/2011-08/msg00298.html
>>>> for details on function multi-versioning usability.
>>>
>>> Please provide an overview why you need the new builtins,
>>
>> For multi-versioning, the compiler can call the appropriate builtin
>> to dispatch the right version. The builtin call will later get folded.
>>
>> For example,
>>
>> int __attribute__ version ("sse4_1")
>> compute ()
>> {
>>   // Do sse4_1 specific implementation.
>> }
>>
>> int
>> compute ()
>> {
>>   // Generic implementation
>> }
>>
>> The compiler will check if the target supports the attribute and then
>> convert a call to compute () into this:
>>
>> if (__builtin_target_supports_sse4_1 ())
>>   compute_sse4_1 (); // Call to the SSE4_1 implementation
>> else
>>   compute_generic (); // Call to the generic implementation
>>
>> Further, having it as builtin function allows it to be overridden by
>> the programmer. For instance, the programmer can override it to
>> identify newer CPU types not yet supported. Having these builtins
>> makes it convenient to identify platform type and features in general.
>>
>> why you need
>>> a separate pass to fold them (instead of just expanding them) and why
>>
>> I can move it into builtins.c along with where other builtins are
>> folded and remove the separate pass.
>> My intention originally was to
>> fold them as early as possible, in this case after multi-versioning
>> but I guess this is not a requirement.
>
> Yes, they should be folded by targetm.fold_builtin instead. The Frontend
> should simply fold the tests at the time it creates them, that's as early
> as possible (gimplification will also re-fold all builtin function calls).
>
>>> you are creating
>>> vars behind the back of GCC:
>>
>> The flow I had in mind was to have functions in libgcc which will use
>> CPUID to get target features and set global vars corresponding to the
>> features. So, the builtin should be folded into the appropriate
>> variable in libgcc.
>
> Hm, but then the variable should reside in libgcc and you'd only need
> an extern variant in the varpool. I'm not sure separate constructors
> (possibly in each module ...) would be better than a single one in
> libgcc that would get run unconditionally.
>
>>>
>>> + /* Set finalized to 1, otherwise it asserts in function "write_symbol" in
>>> + lto-streamer-out.c. */
>>> + vnode->finalized = 1;
>>>
>>> where I think you miss a varpool_finalize_node call somewhere. Why
>>> isn't this all done at target init time
>>
>> I wanted to do this on demand. If none of the new builtins are called
>> in the program, I do not need to do this at all. In summary, libgcc
>> has a function called __cpu_indicator_init which does the work of
>> determining target features and setting the appropriate globals. If
>> the new builtins are called, gcc will call __cpu_indicator_init in a
>> constructor so that it is called exactly once. Then, gcc will fold the
>> builtin to the appropriate global variable.
>
> I see, but this sounds like premature optimization to me, no? Considering
> you'd do this in each module and our inability to merge those constructors
> at link time.
> If we put __cpu_indicator, the constructor and the assorted
> support into a separate module inside libgcc.a could we arrange it in a way
> that if __cpu_indicator is not referenced from the program that piece isn't
> linked in? (not sure if that is possible with constructors)
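
To make the question above concrete, here is a minimal sketch (not the patch itself) of what a single constructor living in one library module could look like: it runs CPUID once before main and fills in feature bits that the dispatch code then reads. The names sketch_cpu_features, sketch_init_calls, sketch_cpu_indicator_init and the compute_* helpers are hypothetical and only for illustration; __get_cpuid, bit_SSE2 and bit_SSE4_1 are the ones from cpuid.h. On non-x86 targets the sketch simply leaves all bits at zero.

```c
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#endif

/* Hypothetical stand-in for the patch's __processor_features struct. */
struct sketch_cpu_features
{
  unsigned int sse2 : 1;
  unsigned int sse4_1 : 1;
};

/* Zero-initialized globals; the constructor below fills them in. */
struct sketch_cpu_features sketch_cpu_features;
int sketch_init_calls;   /* Counts constructor runs, for illustration only. */

/* Runs once at program start-up, whether or not anything reads the bits. */
static void __attribute__ ((constructor))
sketch_cpu_indicator_init (void)
{
  sketch_init_calls++;
#ifdef SKETCH_HAVE_CPUID
  {
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 1 reports the feature bits in ECX and EDX. */
    if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
      return;
    sketch_cpu_features.sse2 = (edx & bit_SSE2) ? 1 : 0;
    sketch_cpu_features.sse4_1 = (ecx & bit_SSE4_1) ? 1 : 0;
  }
#endif
}

/* Placeholder versions; a real SSE4.1 version would use SSE4.1 code. */
static int compute_generic (void) { return 0; }
static int compute_sse4_1 (void) { return 1; }

/* The dispatch the compiler would generate, written by hand here. */
int
compute (void)
{
  if (sketch_cpu_features.sse4_1)
    return compute_sse4_1 ();
  return compute_generic ();
}
```

With this arrangement the dispatch in compute () is just a load and test of a global bit. The trade-off is the one raised above: the constructor runs unconditionally, even in programs that never look at the result.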

Ok, so two things.

First, I create the constructor as a comdat. It is emitted by gcc in every module that calls the builtins, but at link time only one copy is kept, so it is called only once and that is not a problem.

Second, I can eliminate all of this code generation in gcc and instead mark __cpu_indicator_init as a constructor in libgcc, which means it will always be linked in and always be called once at run-time. There is no easy way right now to garbage collect unreferenced ctors at link time. I do not have a strong opinion on this and I can do the latter.

>
> Richard.
>
>>
>> ? If you don't mark the
>>> variable as to be preserved
>>> like you do cgraph will optimize it all away if it isn't needed.
>>
>>>
>>> Richard.
>>>
>>>> * tree-pass.h (pass_tree_fold_builtin_target): New pass.
>>>> * builtins.def (BUILT_IN_TARGET_SUPPORTS_CMOV): New builtin.
>>>> (BUILT_IN_TARGET_SUPPORTS_MMX): New builtin.
>>>> (BUILT_IN_TARGET_SUPPORTS_POPCOUNT): New builtin.
>>>> (BUILT_IN_TARGET_SUPPORTS_SSE): New builtin.
>>>> (BUILT_IN_TARGET_SUPPORTS_SSE2): New builtin.
>>>> (BUILT_IN_TARGET_SUPPORTS_SSE3): New builtin.
>>>> (BUILT_IN_TARGET_SUPPORTS_SSSE3): New builtin.
>>>> (BUILT_IN_TARGET_SUPPORTS_SSE4_1): New builtin.
>>>> (BUILT_IN_TARGET_SUPPORTS_SSE4_2): New builtin.
>>>> (BUILT_IN_TARGET_IS_AMD): New builtin.
>>>> (BUILT_IN_TARGET_IS_INTEL): New builtin.
>>>> (BUILT_IN_TARGET_IS_COREI7_NEHALEM): New builtin.
>>>> (BUILT_IN_TARGET_IS_COREI7_WESTMERE): New builtin.
>>>> (BUILT_IN_TARGET_IS_COREI7_SANDYBRIDGE): New builtin.
>>>> (BUILT_IN_TARGET_IS_AMDFAM10_BARCELONA): New builtin.
>>>> (BUILT_IN_TARGET_IS_AMDFAM10_SHANGHAI): New builtin.
>>>> (BUILT_IN_TARGET_IS_AMDFAM10_ISTANBUL): New builtin.
>>>> * mversn-dispatch.c (do_fold_builtin_target): New function.
>>>> (gate_fold_builtin_target): New function.
>>>> (pass_tree_fold_builtin_target): New pass.
>>>> * timevar.def (TV_FOLD_BUILTIN_TARGET): New var.
>>>> * passes.c (init_optimization_passes): Add new pass to pass list.
>>>> * config/i386/i386.c (build_struct_with_one_bit_fields): New >>>> function. >>>> (make_var_decl): New function. >>>> (get_field_from_struct): New function. >>>> (make_constructor_to_get_target_type): New function. >>>> (fold_builtin_target): New function. >>>> (ix86_fold_builtin): New function. >>>> (TARGET_FOLD_BUILTIN): New macro. >>>> >>>> * gcc.dg/builtin_target.c: New test. >>>> >>>> * config/i386/i386-cpuinfo.c: New file. >>>> * config/i386/t-cpuinfo: New file. >>>> * config.host: Add t-cpuinfo to link i386-cpuinfo.o with libgcc >>>> >>>> Index: libgcc/config.host >>>> =================================================================== >>>> --- libgcc/config.host (revision 177767) >>>> +++ libgcc/config.host (working copy) >>>> @@ -609,7 +609,7 @@ case ${host} in >>>> i[34567]86-*-linux* | x86_64-*-linux* | \ >>>> i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | \ >>>> i[34567]86-*-gnu*) >>>> - tmake_file="${tmake_file} t-tls" >>>> + tmake_file="${tmake_file} t-tls i386/t-cpuinfo" >>>> if test "$libgcc_cv_cfi" = "yes"; then >>>> tmake_file="${tmake_file} t-stack i386/t-stack-i386" >>>> fi >>>> Index: libgcc/config/i386/t-cpuinfo >>>> =================================================================== >>>> --- libgcc/config/i386/t-cpuinfo (revision 0) >>>> +++ libgcc/config/i386/t-cpuinfo (revision 0) >>>> @@ -0,0 +1,2 @@ >>>> +# This is an endfile >>>> +LIB2ADD += $(srcdir)/config/i386/i386-cpuinfo.c >>>> Index: libgcc/config/i386/i386-cpuinfo.c >>>> =================================================================== >>>> --- libgcc/config/i386/i386-cpuinfo.c (revision 0) >>>> +++ libgcc/config/i386/i386-cpuinfo.c (revision 0) >>>> @@ -0,0 +1,275 @@ >>>> +/* Copyright (C) 2011 Free Software Foundation, Inc. >>>> + * Contributed by Sriraman Tallam <tmsri...@google.com>. 
>>>> + * >>>> + * This file is free software; you can redistribute it and/or modify it >>>> + * under the terms of the GNU General Public License as published by the >>>> + * Free Software Foundation; either version 3, or (at your option) any >>>> + * later version. >>>> + * >>>> + * This file is distributed in the hope that it will be useful, but >>>> + * WITHOUT ANY WARRANTY; without even the implied warranty of >>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >>>> + * General Public License for more details. >>>> + * >>>> + * Under Section 7 of GPL version 3, you are granted additional >>>> + * permissions described in the GCC Runtime Library Exception, version >>>> + * 3.1, as published by the Free Software Foundation. >>>> + * >>>> + * You should have received a copy of the GNU General Public License and >>>> + * a copy of the GCC Runtime Library Exception along with this program; >>>> + * see the files COPYING3 and COPYING.RUNTIME respectively. If not, see >>>> + * <http://www.gnu.org/licenses/>. >>>> + * >>>> + * >>>> + * This code is adapted from gcc/config/i386/driver-i386.c. The CPUID >>>> + * instruction is used to figure out the cpu type and supported features. >>>> + * GCC runs __cpu_indicator_init from a constructor which sets the members >>>> + * of __cpu_model and __cpu_features. 
>>>> + */ >>>> + >>>> +#include <string.h> >>>> + >>>> +#ifdef __GNUC__ >>>> +#include "cpuid.h" >>>> + >>>> +enum processor_type >>>> +{ >>>> + PROCESSOR_PENTIUM = 0, >>>> + PROCESSOR_CORE2, >>>> + PROCESSOR_COREI7_NEHALEM, >>>> + PROCESSOR_COREI7_WESTMERE, >>>> + PROCESSOR_COREI7_SANDYBRIDGE, >>>> + PROCESSOR_INTEL_GENERIC, >>>> + PROCESSOR_AMDFAM10_BARCELONA, >>>> + PROCESSOR_AMDFAM10_SHANGHAI, >>>> + PROCESSOR_AMDFAM10_ISTANBUL, >>>> + PROCESSOR_AMDFAM10_GENERIC, >>>> + PROCESSOR_AMD_GENERIC, >>>> + PROCESSOR_GENERIC, >>>> + PROCESSOR_max >>>> +}; >>>> + >>>> +enum vendor_signatures >>>> +{ >>>> + SIG_INTEL = 0x756e6547 /* Genu */, >>>> + SIG_AMD = 0x68747541 /* Auth */ >>>> +}; >>>> + >>>> + >>>> +/* Features supported. */ >>>> + >>>> +struct __processor_features >>>> +{ >>>> + unsigned int __cpu_cmov : 1; >>>> + unsigned int __cpu_mmx : 1; >>>> + unsigned int __cpu_popcnt : 1; >>>> + unsigned int __cpu_sse : 1; >>>> + unsigned int __cpu_sse2 : 1; >>>> + unsigned int __cpu_sse3 : 1; >>>> + unsigned int __cpu_ssse3 : 1; >>>> + unsigned int __cpu_sse4_1 : 1; >>>> + unsigned int __cpu_sse4_2 : 1; >>>> +}; >>>> + >>>> +/* Flags exported. 
*/ >>>> + >>>> +struct __processor_model >>>> +{ >>>> + unsigned int __cpu_is_amd : 1; >>>> + unsigned int __cpu_is_intel : 1; >>>> + unsigned int __cpu_is_corei7_nehalem : 1; >>>> + unsigned int __cpu_is_corei7_westmere : 1; >>>> + unsigned int __cpu_is_corei7_sandybridge : 1; >>>> + unsigned int __cpu_is_amdfam10_barcelona : 1; >>>> + unsigned int __cpu_is_amdfam10_shanghai : 1; >>>> + unsigned int __cpu_is_amdfam10_istanbul : 1; >>>> +}; >>>> + >>>> +enum processor_type __cpu_type = PROCESSOR_GENERIC; >>>> +struct __processor_features __cpu_features; >>>> +struct __processor_model __cpu_model; >>>> + >>>> +static void >>>> +get_amd_cpu (unsigned int family, unsigned int model) >>>> +{ >>>> + switch (family) >>>> + { >>>> + case 0x10: >>>> + switch (model) >>>> + { >>>> + case 0x2: >>>> + __cpu_type = PROCESSOR_AMDFAM10_BARCELONA; >>>> + __cpu_model.__cpu_is_amdfam10_barcelona = 1; >>>> + break; >>>> + case 0x4: >>>> + __cpu_type = PROCESSOR_AMDFAM10_SHANGHAI; >>>> + __cpu_model.__cpu_is_amdfam10_shanghai = 1; >>>> + break; >>>> + case 0x8: >>>> + __cpu_type = PROCESSOR_AMDFAM10_ISTANBUL; >>>> + __cpu_model.__cpu_is_amdfam10_istanbul = 1; >>>> + break; >>>> + default: >>>> + __cpu_type = PROCESSOR_AMDFAM10_GENERIC; >>>> + break; >>>> + } >>>> + break; >>>> + default: >>>> + __cpu_type = PROCESSOR_AMD_GENERIC; >>>> + } >>>> +} >>>> + >>>> +static void >>>> +get_intel_cpu (unsigned int family, unsigned int model, unsigned int >>>> brand_id) >>>> +{ >>>> + /* Parse family and model only if brand ID is 0. */ >>>> + if (brand_id == 0) >>>> + { >>>> + switch (family) >>>> + { >>>> + case 0x5: >>>> + __cpu_type = PROCESSOR_PENTIUM; >>>> + break; >>>> + case 0x6: >>>> + switch (model) >>>> + { >>>> + case 0x1a: >>>> + case 0x1e: >>>> + case 0x1f: >>>> + case 0x2e: >>>> + /* Nehalem. */ >>>> + __cpu_type = PROCESSOR_COREI7_NEHALEM; >>>> + __cpu_model.__cpu_is_corei7_nehalem = 1; >>>> + break; >>>> + case 0x25: >>>> + case 0x2c: >>>> + case 0x2f: >>>> + /* Westmere. 
*/ >>>> + __cpu_type = PROCESSOR_COREI7_WESTMERE; >>>> + __cpu_model.__cpu_is_corei7_westmere = 1; >>>> + break; >>>> + case 0x2a: >>>> + /* Sandy Bridge. */ >>>> + __cpu_type = PROCESSOR_COREI7_SANDYBRIDGE; >>>> + __cpu_model.__cpu_is_corei7_sandybridge = 1; >>>> + break; >>>> + case 0x17: >>>> + case 0x1d: >>>> + /* Penryn. */ >>>> + case 0x0f: >>>> + /* Merom. */ >>>> + __cpu_type = PROCESSOR_CORE2; >>>> + break; >>>> + default: >>>> + __cpu_type = PROCESSOR_INTEL_GENERIC; >>>> + break; >>>> + } >>>> + break; >>>> + default: >>>> + /* We have no idea. */ >>>> + __cpu_type = PROCESSOR_INTEL_GENERIC; >>>> + break; >>>> + } >>>> + } >>>> +} >>>> + >>>> +static void >>>> +get_available_features (unsigned int ecx, unsigned int edx) >>>> +{ >>>> + __cpu_features.__cpu_cmov = (edx & bit_CMOV) ? 1 : 0; >>>> + __cpu_features.__cpu_mmx = (edx & bit_MMX) ? 1 : 0; >>>> + __cpu_features.__cpu_sse = (edx & bit_SSE) ? 1 : 0; >>>> + __cpu_features.__cpu_sse2 = (edx & bit_SSE2) ? 1 : 0; >>>> + __cpu_features.__cpu_popcnt = (ecx & bit_POPCNT) ? 1 : 0; >>>> + __cpu_features.__cpu_sse3 = (ecx & bit_SSE3) ? 1 : 0; >>>> + __cpu_features.__cpu_ssse3 = (ecx & bit_SSSE3) ? 1 : 0; >>>> + __cpu_features.__cpu_sse4_1 = (ecx & bit_SSE4_1) ? 1 : 0; >>>> + __cpu_features.__cpu_sse4_2 = (ecx & bit_SSE4_2) ? 1 : 0; >>>> +} >>>> + >>>> +/* A noinline function calling __get_cpuid. Having many calls to >>>> + cpuid in one function in 32-bit mode causes GCC to complain: >>>> + "can’t find a register in class ‘CLOBBERED_REGS’". This is >>>> + related to PR rtl-optimization 44174. */ >>>> + >>>> +static int __attribute__ ((noinline)) >>>> +__get_cpuid_output (unsigned int __level, >>>> + unsigned int *__eax, unsigned int *__ebx, >>>> + unsigned int *__ecx, unsigned int *__edx) >>>> +{ >>>> + return __get_cpuid (__level, __eax, __ebx, __ecx, __edx); >>>> +} >>>> + >>>> +/* This function will be linked in to binaries that need to look up >>>> + CPU information. 
*/ >>>> + >>>> +void >>>> +__cpu_indicator_init(void) >>>> +{ >>>> + unsigned int eax, ebx, ecx, edx; >>>> + >>>> + int max_level = 5; >>>> + unsigned int vendor; >>>> + unsigned int model, family, brand_id; >>>> + >>>> + memset (&__cpu_features, 0, sizeof (struct __processor_features)); >>>> + memset (&__cpu_model, 0, sizeof (struct __processor_model)); >>>> + >>>> + /* Assume cpuid insn present. Run in level 0 to get vendor id. */ >>>> + if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx)) >>>> + return; >>>> + >>>> + vendor = ebx; >>>> + max_level = eax; >>>> + >>>> + if (max_level < 1) >>>> + return; >>>> + >>>> + if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx)) >>>> + return; >>>> + >>>> + model = (eax >> 4) & 0x0f; >>>> + family = (eax >> 8) & 0x0f; >>>> + brand_id = ebx & 0xff; >>>> + >>>> + /* Adjust model and family for Intel CPUS. */ >>>> + if (vendor == SIG_INTEL) >>>> + { >>>> + unsigned int extended_model, extended_family; >>>> + >>>> + extended_model = (eax >> 12) & 0xf0; >>>> + extended_family = (eax >> 20) & 0xff; >>>> + if (family == 0x0f) >>>> + { >>>> + family += extended_family; >>>> + model += extended_model; >>>> + } >>>> + else if (family == 0x06) >>>> + model += extended_model; >>>> + } >>>> + >>>> + /* Find CPU model. */ >>>> + >>>> + if (vendor == SIG_AMD) >>>> + { >>>> + __cpu_model.__cpu_is_amd = 1; >>>> + get_amd_cpu (family, model); >>>> + } >>>> + else if (vendor == SIG_INTEL) >>>> + { >>>> + __cpu_model.__cpu_is_intel = 1; >>>> + get_intel_cpu (family, model, brand_id); >>>> + } >>>> + >>>> + /* Find available features. 
*/ >>>> + get_available_features (ecx, edx); >>>> +} >>>> + >>>> +#else >>>> + >>>> +void >>>> +__cpu_indicator_init(void) >>>> +{ >>>> +} >>>> + >>>> +#endif /* __GNUC__ */ >>>> Index: gcc/tree-pass.h >>>> =================================================================== >>>> --- gcc/tree-pass.h (revision 177767) >>>> +++ gcc/tree-pass.h (working copy) >>>> @@ -449,6 +449,7 @@ extern struct gimple_opt_pass pass_split_functions >>>> extern struct gimple_opt_pass pass_feedback_split_functions; >>>> extern struct gimple_opt_pass pass_threadsafe_analyze; >>>> extern struct gimple_opt_pass pass_tree_convert_builtin_dispatch; >>>> +extern struct gimple_opt_pass pass_tree_fold_builtin_target; >>>> >>>> /* IPA Passes */ >>>> extern struct simple_ipa_opt_pass pass_ipa_lower_emutls; >>>> Index: gcc/testsuite/gcc.dg/builtin_target.c >>>> =================================================================== >>>> --- gcc/testsuite/gcc.dg/builtin_target.c (revision 0) >>>> +++ gcc/testsuite/gcc.dg/builtin_target.c (revision 0) >>>> @@ -0,0 +1,49 @@ >>>> +/* This test checks if the __builtin_target_* calls are recognized. 
*/ >>>> + >>>> +/* { dg-do run } */ >>>> + >>>> +int >>>> +fn1 () >>>> +{ >>>> + if (__builtin_target_supports_cmov () < 0) >>>> + return -1; >>>> + if (__builtin_target_supports_mmx () < 0) >>>> + return -1; >>>> + if (__builtin_target_supports_popcount () < 0) >>>> + return -1; >>>> + if (__builtin_target_supports_sse () < 0) >>>> + return -1; >>>> + if (__builtin_target_supports_sse2 () < 0) >>>> + return -1; >>>> + if (__builtin_target_supports_sse3 () < 0) >>>> + return -1; >>>> + if (__builtin_target_supports_ssse3 () < 0) >>>> + return -1; >>>> + if (__builtin_target_supports_sse4_1 () < 0) >>>> + return -1; >>>> + if (__builtin_target_supports_sse4_2 () < 0) >>>> + return -1; >>>> + if (__builtin_target_is_amd () < 0) >>>> + return -1; >>>> + if (__builtin_target_is_intel () < 0) >>>> + return -1; >>>> + if (__builtin_target_is_corei7_nehalem () < 0) >>>> + return -1; >>>> + if (__builtin_target_is_corei7_westmere () < 0) >>>> + return -1; >>>> + if (__builtin_target_is_corei7_sandybridge () < 0) >>>> + return -1; >>>> + if (__builtin_target_is_amdfam10_barcelona () < 0) >>>> + return -1; >>>> + if (__builtin_target_is_amdfam10_shanghai () < 0) >>>> + return -1; >>>> + if (__builtin_target_is_amdfam10_istanbul () < 0) >>>> + return -1; >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +int main () >>>> +{ >>>> + return fn1 (); >>>> +} >>>> Index: gcc/builtins.def >>>> =================================================================== >>>> --- gcc/builtins.def (revision 177767) >>>> +++ gcc/builtins.def (working copy) >>>> @@ -763,6 +763,25 @@ DEF_BUILTIN (BUILT_IN_EMUTLS_REGISTER_COMMON, >>>> /* Multiversioning builtin dispatch hook. */ >>>> DEF_GCC_BUILTIN (BUILT_IN_DISPATCH, "dispatch", >>>> BT_FN_INT_PTR_FN_INT_PTR_PTR_VAR, ATTR_NULL) >>>> >>>> +/* Builtins to determine target type and features at run-time. 
*/ >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_CMOV, "target_supports_cmov", >>>> BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_MMX, "target_supports_mmx", >>>> BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_POPCOUNT, >>>> "target_supports_popcount", BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_SSE, "target_supports_sse", >>>> BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_SSE2, "target_supports_sse2", >>>> BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_SSE3, "target_supports_sse3", >>>> BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_SSSE3, "target_supports_ssse3", >>>> BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_SSE4_1, >>>> "target_supports_sse4_1", BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_SUPPORTS_SSE4_2, >>>> "target_supports_sse4_2", BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_IS_AMD, "target_is_amd", BT_FN_INT, >>>> ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_IS_INTEL, "target_is_intel", BT_FN_INT, >>>> ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_IS_COREI7_NEHALEM, >>>> "target_is_corei7_nehalem", BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_IS_COREI7_WESTMERE, >>>> "target_is_corei7_westmere", BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_IS_COREI7_SANDYBRIDGE, >>>> "target_is_corei7_sandybridge", BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_IS_AMDFAM10_BARCELONA, >>>> "target_is_amdfam10_barcelona", BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_IS_AMDFAM10_SHANGHAI, >>>> "target_is_amdfam10_shanghai", BT_FN_INT, ATTR_NULL) >>>> +DEF_GCC_BUILTIN (BUILT_IN_TARGET_IS_AMDFAM10_ISTANBUL, >>>> "target_is_amdfam10_istanbul", BT_FN_INT, ATTR_NULL) >>>> + >>>> /* Exception support. 
*/ >>>> DEF_BUILTIN_STUB (BUILT_IN_UNWIND_RESUME, "__builtin_unwind_resume") >>>> DEF_BUILTIN_STUB (BUILT_IN_CXA_END_CLEANUP, "__builtin_cxa_end_cleanup") >>>> Index: gcc/mversn-dispatch.c >>>> =================================================================== >>>> --- gcc/mversn-dispatch.c (revision 177767) >>>> +++ gcc/mversn-dispatch.c (working copy) >>>> @@ -135,6 +135,7 @@ along with GCC; see the file COPYING3. If not see >>>> #include "output.h" >>>> #include "vecprim.h" >>>> #include "gimple-pretty-print.h" >>>> +#include "target.h" >>>> >>>> typedef struct cgraph_node* NODEPTR; >>>> DEF_VEC_P (NODEPTR); >>>> @@ -1764,3 +1765,103 @@ struct gimple_opt_pass pass_tree_convert_builtin_d >>>> TODO_update_ssa | TODO_verify_ssa >>>> } >>>> }; >>>> + >>>> +/* Fold calls to __builtin_target_* */ >>>> + >>>> +static unsigned int >>>> +do_fold_builtin_target (void) >>>> +{ >>>> + basic_block bb; >>>> + gimple_stmt_iterator gsi; >>>> + >>>> + /* Go through each stmt looking for __builtin_target_* calls */ >>>> + FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (current_function_decl)) >>>> + { >>>> + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) >>>> + { >>>> + gimple stmt = gsi_stmt (gsi); >>>> + gimple assign_stmt; >>>> + tree call_decl; >>>> + tree lhs_retval; >>>> + tree folded_val; >>>> + >>>> + tree ssa_var, tmp_var; >>>> + gimple init_stmt; >>>> + >>>> + if (!is_gimple_call (stmt)) >>>> + continue; >>>> + >>>> + call_decl = gimple_call_fndecl (stmt); >>>> + >>>> + /* Check if it is a __builtin_target_* call. */ >>>> + >>>> + if (call_decl == NULL >>>> + || DECL_NAME (call_decl) == NULL_TREE >>>> + || DECL_BUILT_IN_CLASS (call_decl) != BUILT_IN_NORMAL >>>> + || strstr (IDENTIFIER_POINTER (DECL_NAME (call_decl)), >>>> + "__builtin_target") == NULL) >>>> + continue; >>>> + >>>> + /* If the lhs is NULL there is no need to fold the call. 
*/ >>>> + lhs_retval = gimple_call_lhs(stmt); >>>> + if (lhs_retval == NULL) >>>> + continue; >>>> + >>>> + /* Call the target hook to fold the builtin */ >>>> + folded_val = targetm.fold_builtin(call_decl, 0, NULL, false); >>>> + >>>> + /* If the target does not support the builtin then fold it to >>>> zero. */ >>>> + if (folded_val == NULL_TREE) >>>> + folded_val = build_zero_cst (unsigned_type_node); >>>> + >>>> + /* Type cast unsigned value to integer */ >>>> + tmp_var = create_tmp_var (unsigned_type_node, NULL); >>>> + init_stmt = gimple_build_assign (tmp_var, folded_val); >>>> + ssa_var = make_ssa_name (tmp_var, init_stmt); >>>> + gimple_assign_set_lhs (init_stmt, ssa_var); >>>> + mark_symbols_for_renaming (init_stmt); >>>> + >>>> + assign_stmt = gimple_build_assign_with_ops (NOP_EXPR, >>>> lhs_retval, ssa_var, 0); >>>> + mark_symbols_for_renaming(assign_stmt); >>>> + >>>> + gsi_insert_after_without_update (&gsi, assign_stmt, >>>> GSI_SAME_STMT); >>>> + gsi_insert_after_without_update (&gsi, init_stmt, GSI_SAME_STMT); >>>> + /* Delete the original call. 
*/ >>>> + gsi_remove(&gsi, true); >>>> + } >>>> + } >>>> + >>>> + return 0; >>>> +} >>>> + >>>> +static bool >>>> +gate_fold_builtin_target (void) >>>> +{ >>>> + return true; >>>> +} >>>> + >>>> +/* Pass to fold __builtin_target_* functions */ >>>> + >>>> +struct gimple_opt_pass pass_tree_fold_builtin_target = >>>> +{ >>>> + { >>>> + GIMPLE_PASS, >>>> + "fold_builtin_target", /* name */ >>>> + gate_fold_builtin_target, /* gate */ >>>> + do_fold_builtin_target, /* execute */ >>>> + NULL, /* sub */ >>>> + NULL, /* next */ >>>> + 0, /* static_pass_number */ >>>> + TV_FOLD_BUILTIN_TARGET, /* tv_id */ >>>> + PROP_cfg, /* properties_required */ >>>> + PROP_cfg, /* properties_provided */ >>>> + 0, /* properties_destroyed */ >>>> + 0, /* todo_flags_start */ >>>> + TODO_dump_func | /* todo_flags_finish */ >>>> + TODO_cleanup_cfg | >>>> + TODO_update_ssa | >>>> + TODO_verify_ssa >>>> + } >>>> +}; >>>> + >>>> + >>>> Index: gcc/timevar.def >>>> =================================================================== >>>> --- gcc/timevar.def (revision 177767) >>>> +++ gcc/timevar.def (working copy) >>>> @@ -124,6 +124,7 @@ DEFTIMEVAR (TV_PARSE_INMETH , "parser inl >>>> DEFTIMEVAR (TV_TEMPLATE_INST , "template instantiation") >>>> DEFTIMEVAR (TV_INLINE_HEURISTICS , "inline heuristics") >>>> DEFTIMEVAR (TV_MVERSN_DISPATCH , "multiversion dispatch") >>>> +DEFTIMEVAR (TV_FOLD_BUILTIN_TARGET , "fold __builtin_target calls") >>>> DEFTIMEVAR (TV_INTEGRATION , "integration") >>>> DEFTIMEVAR (TV_TREE_GIMPLIFY , "tree gimplify") >>>> DEFTIMEVAR (TV_TREE_EH , "tree eh") >>>> Index: gcc/passes.c >>>> =================================================================== >>>> --- gcc/passes.c (revision 177767) >>>> +++ gcc/passes.c (working copy) >>>> @@ -1249,6 +1249,8 @@ init_optimization_passes (void) >>>> { >>>> struct opt_pass **p = &pass_ipa_multiversion_dispatch.pass.sub; >>>> NEXT_PASS (pass_tree_convert_builtin_dispatch); >>>> + /* Fold calls to __builtin_target_*. 
*/ >>>> + NEXT_PASS (pass_tree_fold_builtin_target); >>>> /* Rebuilding cgraph edges is necessary as the above passes change >>>> the call graph. Otherwise, future optimizations use the old >>>> call graph and make wrong decisions sometimes.*/ >>>> Index: gcc/config/i386/i386.c >>>> =================================================================== >>>> --- gcc/config/i386/i386.c (revision 177767) >>>> +++ gcc/config/i386/i386.c (working copy) >>>> @@ -58,6 +58,8 @@ along with GCC; see the file COPYING3. If not see >>>> #include "sched-int.h" >>>> #include "sbitmap.h" >>>> #include "fibheap.h" >>>> +#include "tree-flow.h" >>>> +#include "tree-pass.h" >>>> >>>> enum upper_128bits_state >>>> { >>>> @@ -7867,6 +7869,338 @@ ix86_build_builtin_va_list (void) >>>> return ret; >>>> } >>>> >>>> +/* Returns a struct type with name NAME and number of fields equal to >>>> + NUM_FIELDS. Each field is a unsigned int bit field of length 1 bit. */ >>>> + >>>> +static tree >>>> +build_struct_with_one_bit_fields (int num_fields, const char *name) >>>> +{ >>>> + int i; >>>> + char field_name [10]; >>>> + tree field = NULL_TREE, field_chain = NULL_TREE; >>>> + tree type = make_node (RECORD_TYPE); >>>> + >>>> + strcpy (field_name, "k_field"); >>>> + >>>> + for (i = 0; i < num_fields; i++) >>>> + { >>>> + /* Name the fields, 0_field, 1_field, ... */ >>>> + field_name [0] = '0' + i; >>>> + field = build_decl (UNKNOWN_LOCATION, FIELD_DECL, >>>> + get_identifier (field_name), unsigned_type_node); >>>> + DECL_BIT_FIELD (field) = 1; >>>> + DECL_SIZE (field) = bitsize_one_node; >>>> + if (field_chain != NULL_TREE) >>>> + DECL_CHAIN (field) = field_chain; >>>> + field_chain = field; >>>> + } >>>> + finish_builtin_struct (type, name, field_chain, NULL_TREE); >>>> + return type; >>>> +} >>>> + >>>> +/* Returns a VAR_DECL of type TYPE and name NAME. 
*/ >>>> + >>>> +static tree >>>> +make_var_decl (tree type, const char *name) >>>> +{ >>>> + tree new_decl; >>>> + struct varpool_node *vnode; >>>> + >>>> + new_decl = build_decl (UNKNOWN_LOCATION, >>>> + VAR_DECL, >>>> + get_identifier(name), >>>> + type); >>>> + >>>> + DECL_EXTERNAL (new_decl) = 1; >>>> + TREE_STATIC (new_decl) = 1; >>>> + TREE_PUBLIC (new_decl) = 1; >>>> + DECL_INITIAL (new_decl) = 0; >>>> + DECL_ARTIFICIAL (new_decl) = 0; >>>> + DECL_PRESERVE_P (new_decl) = 1; >>>> + >>>> + make_decl_one_only (new_decl, DECL_ASSEMBLER_NAME (new_decl)); >>>> + assemble_variable (new_decl, 0, 0, 0); >>>> + >>>> + vnode = varpool_node (new_decl); >>>> + gcc_assert (vnode != NULL); >>>> + /* Set finalized to 1, otherwise it asserts in function "write_symbol" >>>> in >>>> + lto-streamer-out.c. */ >>>> + vnode->finalized = 1; >>>> + >>>> + return new_decl; >>>> +} >>>> + >>>> +/* Traverses the chain of fields in STRUCT_TYPE and returns the FIELD_NUM >>>> + numbered field. */ >>>> + >>>> +static tree >>>> +get_field_from_struct (tree struct_type, int field_num) >>>> +{ >>>> + int i; >>>> + tree field = TYPE_FIELDS (struct_type); >>>> + >>>> + for (i = 0; i < field_num; i++, field = DECL_CHAIN(field)) >>>> + { >>>> + gcc_assert (field != NULL_TREE); >>>> + } >>>> + >>>> + return field; >>>> +} >>>> + >>>> +/* Create a new static constructor that calls __cpu_indicator_init () >>>> + function defined in libgcc/config/i386-cpuinfo.c which runs cpuid >>>> + to figure out the type of the target. 
*/ >>>> + >>>> +static tree >>>> +make_constructor_to_get_target_type (const char *name) >>>> +{ >>>> + tree decl, type, t; >>>> + gimple_seq seq; >>>> + basic_block new_bb; >>>> + tree old_current_function_decl; >>>> + >>>> + tree __cpu_indicator_int_decl; >>>> + gimple constructor_body; >>>> + >>>> + >>>> + type = build_function_type_list (void_type_node, NULL_TREE); >>>> + >>>> + /* Make a call stmt to __cpu_indicator_init */ >>>> + __cpu_indicator_int_decl = build_fn_decl ("__cpu_indicator_init", type); >>>> + constructor_body = gimple_build_call (__cpu_indicator_int_decl, 0); >>>> + DECL_EXTERNAL (__cpu_indicator_int_decl) = 1; >>>> + >>>> + decl = build_fn_decl (name, type); >>>> + >>>> + DECL_NAME (decl) = get_identifier (name); >>>> + SET_DECL_ASSEMBLER_NAME (decl, DECL_NAME (decl)); >>>> + gcc_assert (cgraph_node (decl) != NULL); >>>> + >>>> + TREE_USED (decl) = 1; >>>> + DECL_ARTIFICIAL (decl) = 1; >>>> + DECL_IGNORED_P (decl) = 0; >>>> + TREE_PUBLIC (decl) = 0; >>>> + DECL_UNINLINABLE (decl) = 1; >>>> + DECL_EXTERNAL (decl) = 0; >>>> + DECL_CONTEXT (decl) = NULL_TREE; >>>> + DECL_INITIAL (decl) = make_node (BLOCK); >>>> + DECL_STATIC_CONSTRUCTOR (decl) = 1; >>>> + TREE_READONLY (decl) = 0; >>>> + DECL_PURE_P (decl) = 0; >>>> + >>>> + /* This is a comdat. */ >>>> + make_decl_one_only (decl, DECL_ASSEMBLER_NAME (decl)); >>>> + >>>> + /* Build result decl and add to function_decl. */ >>>> + t = build_decl (UNKNOWN_LOCATION, RESULT_DECL, NULL_TREE, >>>> void_type_node); >>>> + DECL_ARTIFICIAL (t) = 1; >>>> + DECL_IGNORED_P (t) = 1; >>>> + DECL_RESULT (decl) = t; >>>> + >>>> + gimplify_function_tree (decl); >>>> + >>>> + /* Build CFG for this function. 
*/ >>>> + >>>> + old_current_function_decl = current_function_decl; >>>> + push_cfun (DECL_STRUCT_FUNCTION (decl)); >>>> + current_function_decl = decl; >>>> + init_empty_tree_cfg_for_function (DECL_STRUCT_FUNCTION (decl)); >>>> + cfun->curr_properties |= >>>> + (PROP_gimple_lcf | PROP_gimple_leh | PROP_cfg | PROP_referenced_vars | >>>> + PROP_ssa); >>>> + new_bb = create_empty_bb (ENTRY_BLOCK_PTR); >>>> + make_edge (ENTRY_BLOCK_PTR, new_bb, EDGE_FALLTHRU); >>>> + >>>> + /* XXX: Not sure if the edge commented below is necessary. If I add >>>> this >>>> + edge, it fails in gimple_verify_flow_info in tree-cfg.c in condition >>>> : >>>> + " if (e->flags & EDGE_FALLTHRU)" >>>> + during -fprofile-generate. >>>> + Otherwise, it is fine. Deleting this edge does not break anything. >>>> + Commenting this so that it is clear I am intentionally not doing >>>> this.*/ >>>> + /* make_edge (new_bb, EXIT_BLOCK_PTR, EDGE_FALLTHRU); */ >>>> + >>>> + seq = gimple_seq_alloc_with_stmt (constructor_body); >>>> + >>>> + set_bb_seq (new_bb, seq); >>>> + gimple_set_bb (constructor_body, new_bb); >>>> + >>>> + /* Set the lexical block of the constructor body. Fails the inliner >>>> + other wise. */ >>>> + gimple_set_block (constructor_body, DECL_INITIAL (decl)); >>>> + >>>> + /* This call is very important if this pass runs when the IR is in >>>> + SSA form. It breaks things in strange ways otherwise. 
*/ >>>> + init_tree_ssa (DECL_STRUCT_FUNCTION (decl)); >>>> + /* add_referenced_var (version_selector_var); */ >>>> + >>>> + cgraph_add_new_function (decl, true); >>>> + cgraph_call_function_insertion_hooks (cgraph_node (decl)); >>>> + cgraph_mark_needed_node (cgraph_node (decl)); >>>> + >>>> + pop_cfun (); >>>> + current_function_decl = old_current_function_decl; >>>> + return decl; >>>> +} >>>> + >>>> +/* FNDECL is a __builtin_target_* call that is folded into an integer defined >>>> + in libgcc/config/i386/i386-cpuinfo.c */ >>>> + >>>> +static tree >>>> +fold_builtin_target (tree fndecl) >>>> +{ >>>> + /* This is the order of bit-fields in __processor_features in >>>> + i386-cpuinfo.c */ >>>> + enum processor_features >>>> + { >>>> + F_CMOV = 0, >>>> + F_MMX, >>>> + F_POPCNT, >>>> + F_SSE, >>>> + F_SSE2, >>>> + F_SSE3, >>>> + F_SSSE3, >>>> + F_SSE4_1, >>>> + F_SSE4_2, >>>> + F_MAX >>>> + }; >>>> + >>>> + /* This is the order of bit-fields in __processor_model in >>>> + i386-cpuinfo.c */ >>>> + enum processor_model >>>> + { >>>> + M_AMD = 0, >>>> + M_INTEL, >>>> + M_COREI7_NEHALEM, >>>> + M_COREI7_WESTMERE, >>>> + M_COREI7_SANDYBRIDGE, >>>> + M_AMDFAM10_BARCELONA, >>>> + M_AMDFAM10_SHANGHAI, >>>> + M_AMDFAM10_ISTANBUL, >>>> + M_MAX >>>> + }; >>>> + >>>> + static tree __processor_features_type = NULL_TREE; >>>> + static tree __cpu_features_var = NULL_TREE; >>>> + static tree __processor_model_type = NULL_TREE; >>>> + static tree __cpu_model_var = NULL_TREE; >>>> + static tree ctor_decl = NULL_TREE; >>>> + static tree field; >>>> + static tree which_struct; >>>> + >>>> + /* Make a call to __cpu_indicator_init in a constructor. >>>> + Function __cpu_indicator_init is defined in i386-cpuinfo.c.
*/ >>>> + if (ctor_decl == NULL_TREE) >>>> + ctor_decl = make_constructor_to_get_target_type >>>> + ("__cpu_indicator_init_ctor"); >>>> + >>>> + if (__processor_features_type == NULL_TREE) >>>> + __processor_features_type = build_struct_with_one_bit_fields (F_MAX, >>>> + "__processor_features"); >>>> + >>>> + if (__processor_model_type == NULL_TREE) >>>> + __processor_model_type = build_struct_with_one_bit_fields (M_MAX, >>>> + "__processor_model"); >>>> + >>>> + if (__cpu_features_var == NULL_TREE) >>>> + __cpu_features_var = make_var_decl (__processor_features_type, >>>> + "__cpu_features"); >>>> + >>>> + if (__cpu_model_var == NULL_TREE) >>>> + __cpu_model_var = make_var_decl (__processor_model_type, >>>> + "__cpu_model"); >>>> + >>>> + /* Look at fndecl code to identify the field requested. */ >>>> + switch (DECL_FUNCTION_CODE (fndecl)) >>>> + { >>>> + case BUILT_IN_TARGET_SUPPORTS_CMOV: >>>> + field = get_field_from_struct (__processor_features_type, F_CMOV); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_SUPPORTS_MMX: >>>> + field = get_field_from_struct (__processor_features_type, F_MMX); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_SUPPORTS_POPCOUNT: >>>> + field = get_field_from_struct (__processor_features_type, F_POPCNT); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_SUPPORTS_SSE: >>>> + field = get_field_from_struct (__processor_features_type, F_SSE); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_SUPPORTS_SSE2: >>>> + field = get_field_from_struct (__processor_features_type, F_SSE2); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_SUPPORTS_SSE3: >>>> + field = get_field_from_struct (__processor_features_type, F_SSE3); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_SUPPORTS_SSSE3: >>>> + field = get_field_from_struct 
(__processor_features_type, F_SSSE3); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_SUPPORTS_SSE4_1: >>>> + field = get_field_from_struct (__processor_features_type, F_SSE4_1); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_SUPPORTS_SSE4_2: >>>> + field = get_field_from_struct (__processor_features_type, F_SSE4_2); >>>> + which_struct = __cpu_features_var; >>>> + break; >>>> + case BUILT_IN_TARGET_IS_AMD: >>>> + field = get_field_from_struct (__processor_model_type, M_AMD); >>>> + which_struct = __cpu_model_var; >>>> + break; >>>> + case BUILT_IN_TARGET_IS_INTEL: >>>> + field = get_field_from_struct (__processor_model_type, M_INTEL); >>>> + which_struct = __cpu_model_var; >>>> + break; >>>> + case BUILT_IN_TARGET_IS_COREI7_NEHALEM: >>>> + field = get_field_from_struct (__processor_model_type, M_COREI7_NEHALEM); >>>> + which_struct = __cpu_model_var; >>>> + break; >>>> + case BUILT_IN_TARGET_IS_COREI7_WESTMERE: >>>> + field = get_field_from_struct (__processor_model_type, M_COREI7_WESTMERE); >>>> + which_struct = __cpu_model_var; >>>> + break; >>>> + case BUILT_IN_TARGET_IS_COREI7_SANDYBRIDGE: >>>> + field = get_field_from_struct (__processor_model_type, M_COREI7_SANDYBRIDGE); >>>> + which_struct = __cpu_model_var; >>>> + break; >>>> + case BUILT_IN_TARGET_IS_AMDFAM10_BARCELONA: >>>> + field = get_field_from_struct (__processor_model_type, M_AMDFAM10_BARCELONA); >>>> + which_struct = __cpu_model_var; >>>> + break; >>>> + case BUILT_IN_TARGET_IS_AMDFAM10_SHANGHAI: >>>> + field = get_field_from_struct (__processor_model_type, M_AMDFAM10_SHANGHAI); >>>> + which_struct = __cpu_model_var; >>>> + break; >>>> + case BUILT_IN_TARGET_IS_AMDFAM10_ISTANBUL: >>>> + field = get_field_from_struct (__processor_model_type, M_AMDFAM10_ISTANBUL); >>>> + which_struct = __cpu_model_var; >>>> + break; >>>> + default: >>>> + return NULL_TREE; >>>> + } >>>> + >>>> + return
build3 (COMPONENT_REF, TREE_TYPE (field), which_struct, field, NULL_TREE); >>>> +} >>>> + >>>> +/* Folds __builtin_target_* builtins. */ >>>> + >>>> +static tree >>>> +ix86_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, >>>> + tree *args ATTRIBUTE_UNUSED, bool ignore ATTRIBUTE_UNUSED) >>>> +{ >>>> + const char *decl_name = IDENTIFIER_POINTER (DECL_NAME (fndecl)); >>>> + if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL >>>> + && strstr (decl_name, "__builtin_target") != NULL) >>>> + return fold_builtin_target (fndecl); >>>> + >>>> + return NULL_TREE; >>>> +} >>>> + >>>> /* Worker function for TARGET_SETUP_INCOMING_VARARGS. */ >>>> >>>> static void >>>> @@ -35097,6 +35431,9 @@ ix86_autovectorize_vector_sizes (void) >>>> #undef TARGET_BUILD_BUILTIN_VA_LIST >>>> #define TARGET_BUILD_BUILTIN_VA_LIST ix86_build_builtin_va_list >>>> >>>> +#undef TARGET_FOLD_BUILTIN >>>> +#define TARGET_FOLD_BUILTIN ix86_fold_builtin >>>> + >>>> #undef TARGET_ENUM_VA_LIST_P >>>> #define TARGET_ENUM_VA_LIST_P ix86_enum_va_list >>>> >>>> >>>> -- >>>> This patch is available for review at http://codereview.appspot.com/4893046 >>>> >>> >> >
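To make the effect of the folding concrete, here is a minimal, portable sketch in plain C of what a folded __builtin_target_supports_* call amounts to: a read of a one-bit field in a global struct. The struct mimics the field order of __processor_features in the patch, but the field names, cpu_features, cpu_indicator_init_stub, run_demo and the compute_* return values are made up for illustration. The real __cpu_indicator_init in libgcc would query CPUID instead of taking arguments, and would run from a constructor rather than being called by hand.

```c
#include <assert.h>

/* One-bit fields in the same order as enum processor_features in the
   patch.  The field names are illustrative, not taken from libgcc.  */
struct processor_features
{
  unsigned int cmov : 1;
  unsigned int mmx : 1;
  unsigned int popcnt : 1;
  unsigned int sse : 1;
  unsigned int sse2 : 1;
  unsigned int sse3 : 1;
  unsigned int ssse3 : 1;
  unsigned int sse4_1 : 1;
  unsigned int sse4_2 : 1;
};

/* Stand-in for the __cpu_features global that libgcc would define.  */
static struct processor_features cpu_features;

/* Hypothetical stub for __cpu_indicator_init.  It takes the feature
   bits as arguments so the sketch runs on any host; the real function
   would fill them in from CPUID.  */
static void
cpu_indicator_init_stub (int has_ssse3, int has_sse4_1)
{
  cpu_features.ssse3 = has_ssse3 ? 1 : 0;
  cpu_features.sse4_1 = has_sse4_1 ? 1 : 0;
}

/* Placeholder versions; the return values only identify which one ran.  */
static int compute_sse4_1 (void) { return 41; }
static int compute_generic (void) { return 0; }

/* Roughly what a call to compute () looks like once the dispatch is
   generated and the builtin is folded into a bit-field read.  */
static int
compute (void)
{
  if (cpu_features.sse4_1)
    return compute_sse4_1 ();
  return compute_generic ();
}

/* Small self-check; returns 0 when both dispatch paths behave as
   expected.  */
static int
run_demo (void)
{
  cpu_indicator_init_stub (1, 0);
  assert (cpu_features.ssse3 == 1 && cpu_features.sse4_1 == 0);
  assert (compute () == 0);

  cpu_indicator_init_stub (1, 1);
  assert (compute () == 41);
  return 0;
}
```

Note that the SSSE3 and SSE3 bits are distinct fields here, which is why each case in the switch has to pick the enumerator that matches its own builtin.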