Re: [PATCH] Add infrastructure to merge standard builtin enums with backend builtins
On Sat, Aug 27, 2011 at 12:12 AM, Mike Stump wrote:
> On Aug 26, 2011, at 7:19 AM, Michael Meissner wrote:
>>>> The alternative is something like what Kenney and Mike are doing in
>>>> their private port, where they have new syntax in the MD file for
>>>> builtins.
>>>
>>> But are those user-exposed builtins?  Certainly interesting to combine
>>> builtin definition and the instruction it expands to.
>>
>> Yes, these are user exposed builtins.  Massive amounts of user exposed
>> builtins (Mike said he needs 13 bits for the builtin index).  I think it
>> would be better if Mike comments on this.
>
> I gave the quick intro yesterday.  You wind up specifying the built-ins
> that you have, and the generator does things like assign enum values,
> create a file that makes the builtins appear in the user name space from
> the __builtin_ namespace, and generate compilation test cases for all the
> built-ins with all the different types they support.  It also generates
> executable testcases to ensure everything works flawlessly.  We have mods
> to the overload builtin mechanism so that one can do things like:
>
> template <class T>
> T foo(T x, T y) {
>   x = add(x, y);
>   return x;
> }
>
> Or, if you prefer the C version:
>
> int fooi(int x, int y) {
>   return add(x, y);
> }
>
> short foos(short x, short y) {
>   return add(x, y);
> }
>
> and have it work out just fine when T is instantiated with all the various
> types that are supported by the hardware, and it works in C.  This permits
> a nice api for the machine builtins, as you don't have to mangle types
> into the builtin name.  The system is complete enough to handle the needs
> of anything coming down the pike in the next decade.  It can handle
> input/output parameters that have register assignments.  It can handle
> reference parameters (like the input/output parameters, but these are done
> as values in memory).  The generator builds up _all_ the types one needs,
> handles all the registration and all the wiring up for codegen.  There is
> a mechanism to remap arguments going to the rtl generators, so the operand
> ordering of the builtin doesn't have to match the operand ordering of the
> md pattern for the semantics that back the builtin.  There is a beefy
> macro system built into the generator so that you can have nice simple
> patterns, and it is beefier than the iterators one can use today.  So, for
> example, we have:
>
> (define_special_iterator imath3 [add sub mul])
>
> to define some built-ins that are regular with respect to the operation,
> but this isn't a code nor mode iterator; it just iterates the pattern with
> the string substituted.  For machines with any regularity, the patterns
> wind up being smaller and easier to maintain.  I'd be happy to answer
> questions about it.

Maybe you can even post the code somewhere?

Richard.
[PATCH, i386]: Fix PR50202, ICE: in final_scan_insn, at final.c:2709 (could not split insn) with __builtin_ia32_pcmpistri128
Hello! Attached patch fixes corner case with -fno-dse, where the insn has all outputs unused. Do not bother with the insn in this case and simply delete it from splitter. 2011-08-27 Uros Bizjak PR target/50202 * config/i386/sse.md (sse4_2_pcmpestr): Emit NOTE_INSN_DELETED note when all outputs are unused. (sse4_2_pcmpistr): Ditto. testsuite/ChangeLog: 2011-08-27 Uros Bizjak PR target/50202 * gcc.target/i386/pr50202.c: New test. Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. Committed to mainline SVN and to 4.6. Uros. Index: config/i386/sse.md === --- config/i386/sse.md (revision 178129) +++ config/i386/sse.md (working copy) @@ -9734,6 +9734,9 @@ operands[2], operands[3], operands[4], operands[5], operands[6])); + if (!(flags || ecx || xmm0)) +emit_note (NOTE_INSN_DELETED); + DONE; } [(set_attr "type" "sselog") @@ -9861,6 +9864,9 @@ emit_insn (gen_sse4_2_pcmpistr_cconly (NULL, NULL, operands[2], operands[3], operands[4])); + if (!(flags || ecx || xmm0)) +emit_note (NOTE_INSN_DELETED); + DONE; } [(set_attr "type" "sselog") Index: testsuite/gcc.target/i386/pr50202.c === --- testsuite/gcc.target/i386/pr50202.c (revision 0) +++ testsuite/gcc.target/i386/pr50202.c (revision 0) @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O -fno-tree-dse -fno-dce -msse4" } */ +/* { dg-require-effective-target sse4 } */ + +typedef char __v16qi __attribute__ ((__vector_size__ (16))); + +__v16qi v; +int i; + +void +foo (void) +{ + i = __builtin_ia32_pcmpistri128 (v, v, 255); + i = 255; +}
[PATCH, MELT] fixing upgrade-warmelt target
Hello,

I am trying to fix the upgrade-warmelt target in the latest revision of MELT.

We use move-if-change on the meltdesc file (in melt-stage3, for example) to
save a backup (as meltdesc~), but we still need the original meltdesc file
for the generated files. So I replaced move-if-change with cp, and the build
now gets further (but there are still issues).

Pierre Vittet

Index: melt-build.tpl
===
--- melt-build.tpl (revision 178131)
+++ melt-build.tpl (working copy)
@@ -579,7 +579,7 @@ ENDFOR melt_translator_file+]
 [+FOR melt_translator_file+]
  @echo upgrading MELT translator [+base+]
 ## dont indent the [+base+]+meltdesc.c
- $(melt_make_move) $(MELT_LAST_STAGE)/[+base+]+meltdesc.c $(MELT_LAST_STAGE)/[+base+]+meltdesc.c~; \
+ cp $(MELT_LAST_STAGE)/[+base+]+meltdesc.c $(MELT_LAST_STAGE)/[+base+]+meltdesc.c~; \
  sed s/$(MELT_LAST_STAGE)/MELT-STAGE-ZERO/g $(MELT_LAST_STAGE)/[+base+]+meltdesc.c > $(srcdir)/melt/generated/[+base+]+meltdesc.c
  for f in $(MELT_LAST_STAGE)/[+base+].c $(MELT_LAST_STAGE)/[+base+]+[0-9]*.c ; do \
    bf=`basename $$f`; \

2011-08-27  Pierre Vittet

 * melt-build.tpl (warmelt-upgrade-translator): replace move-if-change by a cp.

Index: Makefile.in
===
--- Makefile.in (revision 178131)
+++ Makefile.in (working copy)
@@ -5516,7 +5516,7 @@ upgrade-warmelt: $(WARMELT_LAST)
  for f in $(wildcard meltrunsup*.[ch]); do \
    cp $$f $$f-tmp; \
    cp $(srcdir)/melt/generated/$$f $$f-old; \
-   $(SHELL) $(srcdir)/../move-if-change $$f-tmp $(srcdir)/melt/generated/$$f; \
+   $(SHELL) cp $$f-tmp $(srcdir)/melt/generated/$$f; \
  done
  $(RM) melt-runtime.o melt-runtime.i s-gtype */warmelt*.o
  $(MAKE) s-gtype

2011-08-27  Pierre Vittet

 * Makefile.in (upgrade-warmelt): replace move-if-change by a cp.
Re: [libcpp,lto,fortran PATCH] Fix linemap_add use and remove unnecessary kludge
Tom Tromey writes:

> Dodji>    * line-map.c (linemap_add): Assert that reason must not be
> Dodji>    LC_RENAME when called for the first time on a "main input file".
>
> This is ok.  I can't approve the rest but it seems reasonable.

Tobias Burnus writes:

> The Fortran part is OK. Thanks for the patch!

Thank you Tom and Tobias.

Diego, are the PCH and LTO changes OK?

Thanks.

--
Dodji
Re: [v3] Handle different versions of Solaris 8 ,
Hi,

> -PASS: abi/header_cxxabi.c (test for excess errors)
> +FAIL: abi/header_cxxabi.c (test for excess errors)
>
> FAIL: abi/header_cxxabi.c (test for excess errors)
> Excess errors:
> /var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/i386-pc-solaris2.11/bits/c++config.h:167:1: error: unknown type name 'namespace'
> /var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/i386-pc-solaris2.11/bits/c++config.h:168:1: error: expected '=', ',', ';', 'asm' or '__attribute__' before '{' token
>
> which is pretty obvious given that this test is supposed to be compiled
> as C :-)

To be honest, I'm not at all sure I understand what's going on here; maybe
we should return to this failure later.

> I guess the patch is ok now?

Yes. Nice that Jon replied in the meantime and clarified the undefined
behavior issue: for the time being I think we can keep the __cplusplus
checks, which should also help during this testing period. We can certainly
clean things up later. Maybe you could add a comment somewhere summarizing
what Jon wrote.

By the way, I noticed only today (sorry, I'm using a phone ;) that all the
new configury work impacts only the Solaris configuration, thus in general
we are on pretty safe ground.

Paolo
[PATCH, i386]: Add REG_EQUAL notes to SSE mult sequences (+ a fix in legitimize_tls_address)
Hello! 2011-08-27 Uros Bizjak * config/i386/sse.md (mulv16qi3): Attach REG_EQUAL note. (*sse2_mulv4si3): Ditto. (mulv2di3): Ditto. * config/i386/i386.c (legitimize_tls_address): Change REG_EQIV notes to REG_EQUAL. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: sse.md === --- sse.md (revision 178130) +++ sse.md (working copy) @@ -4726,6 +4726,9 @@ /* Extract the even bytes and merge them back together. */ ix86_expand_vec_extract_even_odd (operands[0], t[5], t[4], 0); + + set_unique_reg_note (get_last_insn (), REG_EQUAL, + gen_rtx_MULT (V16QImode, operands[1], operands[2])); DONE; }) @@ -5179,6 +5182,9 @@ /* Merge the parts back together. */ emit_insn (gen_vec_interleave_lowv4si (op0, t5, t6)); + + set_unique_reg_note (get_last_insn (), REG_EQUAL, + gen_rtx_MULT (V4SImode, operands[1], operands[2])); DONE; }) @@ -5261,6 +5267,9 @@ emit_insn (gen_addv2di3 (t6, t1, t4)); emit_insn (gen_addv2di3 (op0, t6, t5)); } + + set_unique_reg_note (get_last_insn (), REG_EQUAL, + gen_rtx_MULT (V2DImode, operands[1], operands[2])); DONE; }) Index: i386.c === --- i386.c (revision 178129) +++ i386.c (working copy) @@ -12268,7 +12268,7 @@ legitimize_tls_address (rtx x, enum tls_model mode tp = get_thread_pointer (true); dest = force_reg (Pmode, gen_rtx_PLUS (Pmode, tp, dest)); - set_unique_reg_note (get_last_insn (), REG_EQUIV, x); + set_unique_reg_note (get_last_insn (), REG_EQUAL, x); } else { @@ -12315,7 +12315,7 @@ legitimize_tls_address (rtx x, enum tls_model mode emit_insn (gen_tls_dynamic_gnu2_32 (base, tmp, pic)); tp = get_thread_pointer (true); - set_unique_reg_note (get_last_insn (), REG_EQUIV, + set_unique_reg_note (get_last_insn (), REG_EQUAL, gen_rtx_MINUS (Pmode, tmp, tp)); } else @@ -12331,7 +12331,7 @@ legitimize_tls_address (rtx x, enum tls_model mode insns = get_insns (); end_sequence (); - /* Attach a unique REG_EQUIV, to allow the RTL optimizers to + /* Attach a unique REG_EQUAL, to allow the RTL optimizers to share the LD_BASE result 
with other LD model accesses. */ eqv = gen_rtx_UNSPEC (Pmode, gen_rtvec (1, const0_rtx), UNSPEC_TLS_LD_BASE); @@ -12352,7 +12352,7 @@ legitimize_tls_address (rtx x, enum tls_model mode { dest = force_reg (Pmode, gen_rtx_PLUS (Pmode, dest, tp)); - set_unique_reg_note (get_last_insn (), REG_EQUIV, x); + set_unique_reg_note (get_last_insn (), REG_EQUAL, x); } break;
[PATCH, i386]: A couple of small fixes
Hello! 2011-08-27 Uros Bizjak * config/i386/sse.md (*absneg2): Fix split condition. (vec_extract_lo_): Prevent both operands in memory. (vec_extract_lo_v16hi): Ditto. (*vec_extract_v4sf_mem): Add TARGET_SSE insn constraint. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN and 4.6 branch. Uros. Index: sse.md === --- sse.md (revision 178130) +++ sse.md (working copy) @@ -648,7 +648,7 @@ (use (match_operand:VF 2 "nonimmediate_operand""xm,0, xm,x"))] "TARGET_SSE" "#" - "reload_completed" + "&& reload_completed" [(const_int 0)] { enum rtx_code absneg_op; @@ -3708,7 +3708,7 @@ (vec_select: (match_operand:VI8F_256 1 "nonimmediate_operand" "xm,x") (parallel [(const_int 0) (const_int 1)])))] - "TARGET_AVX" + "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "#" "&& reload_completed" [(const_int 0)] @@ -3742,7 +3742,7 @@ (match_operand:VI4F_256 1 "nonimmediate_operand" "xm,x") (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)])))] - "TARGET_AVX" + "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "#" "&& reload_completed" [(const_int 0)] @@ -3779,7 +3779,7 @@ (const_int 2) (const_int 3) (const_int 4) (const_int 5) (const_int 6) (const_int 7)])))] - "TARGET_AVX" + "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "#" "&& reload_completed" [(const_int 0)] @@ -3822,7 +3822,7 @@ (const_int 10) (const_int 11) (const_int 12) (const_int 13) (const_int 14) (const_int 15)])))] - "TARGET_AVX" + "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "#" "&& reload_completed" [(const_int 0)] @@ -3876,9 +3876,9 @@ (vec_select:SF (match_operand:V4SF 1 "memory_operand" "o") (parallel [(match_operand 2 "const_0_to_3_operand" "n")])))] - "" + "TARGET_SSE" "#" - "reload_completed" + "&& reload_completed" [(const_int 0)] { int i = INTVAL (operands[2]);
Re: [libcpp,lto,fortran PATCH] Fix linemap_add use and remove unnecessary kludge
On Sat, Aug 27, 2011 at 12:18 PM, Dodji Seketeli wrote: > Tom Tromey writes: > >> Dodji> * line-map.c (linemap_add): Assert that reason must not be >> Dodji> LC_RENAME when called for the first time on a "main input >> file". >> >> This is ok. I can't approve the rest but it seems reasonable. > > Tobias Burnus writes: > >> The Fortran part is OK. Thanks for the patch! > > Thank you Tom and Tobias. > > Diego, are the PCH and LTO changes OK? In the LTO FE the two linemap_add calls were to advance the location counter to cover the builtin special locations. You exchange these with only one - that doesn't look correct without more explanation. The PCH change is ok. Thanks, Richard. > Thanks. > > -- > Dodji >
Re: [x86] Use match_test for .md attributes
On Mon, Aug 15, 2011 at 11:57 AM, Richard Sandiford wrote:
>>> Following on from the two patches I've just posted, this one makes
>>> config/i386/*.md use match_test for .md attributes.  Tested as
>>> described here:
>>>
>>>     http://gcc.gnu.org/ml/gcc-patches/2011-08/msg01182.html
>>>
>>>     * config/i386/i386.md: Use (match_test ...) for attribute tests.
>>>     * config/i386/mmx.md: Likewise.
>>>     * config/i386/sse.md: Likewise.
>>
>> -       (eq (symbol_ref "TARGET_SSE2") (const_int 0)))
>> +       (not (match_test "TARGET_SSE2")))
>>
>> Just a question - in predicates.md, i.e. (match_test "!TARGET_SSE2") is
>> used. Do we want to standardize on (not (match_test "...")) form
>> everywhere?
>
> Yeah, good question.  I'd used (not (match_test ...)) so that genattrtab
> could better optimise combinations of expressions.  I suppose we don't
> yet combine predicate expressions in the same way, so it probably makes
> no difference there.  We might use predicate expressions more in future
> though.
>
> I'm happy to convert predicate match_tests at the same time.

That would be much appreciated.

Thanks,
Uros.
Re: [PATCH][ARM] -m{cpu,tune,arch}=native
On 26/08/11 17:16, Joseph S. Myers wrote: arm-tables.opt is a generated file. You need to modify the source files and regenerate it, not modify the generated file. Fixed; the "native" option value is now defined in arm.opt. Thanks for spotting this. OK? Andrew 2011-08-27 Andrew Stubbs gcc/ * config.host (arm*-*-linux*): Add driver-arm.o and x-arm. * config/arm/arm.opt: Add 'native' processor_type and arm_arch enum values. * config/arm/arm.h (host_detect_local_cpu): New prototype. (EXTRA_SPEC_FUNCTIONS): New define. (MCPU_MTUNE_NATIVE_SPECS): New define. (DRIVER_SELF_SPECS): New define. * config/arm/driver-arm.c: New file. * config/arm/x-arm: New file. * doc/invoke.texi (ARM Options): Document -mcpu=native, -mtune=native and -march=native. --- a/gcc/config.host +++ b/gcc/config.host @@ -100,6 +100,14 @@ case ${host} in esac case ${host} in + arm*-*-linux*) +case ${target} in + arm*-*-*) + host_extra_gcc_objs="driver-arm.o" + host_xmake_file="${host_xmake_file} arm/x-arm" + ;; +esac +;; alpha*-*-linux* | alpha*-dec-osf*) case ${target} in alpha*-*-linux* | alpha*-dec-osf*) --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -2223,4 +2223,21 @@ extern int making_const_table; instruction. */ #define MAX_LDM_STM_OPS 4 +/* -mcpu=native handling only makes sense with compiler running on + an ARM chip. */ +#if defined(__arm__) +extern const char *host_detect_local_cpu (int argc, const char **argv); +# define EXTRA_SPEC_FUNCTIONS \ + { "local_cpu_detect", host_detect_local_cpu }, + +# define MCPU_MTUNE_NATIVE_SPECS \ + " %{march=native:%http://www.gnu.org/licenses/>. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" + +static struct { + const char *part_no; + const char *arch_name; + const char *cpu_name; +} cpu_table[] = { +{"0xc08", "armv7-a", "cortex-a8"}, +{"0xc09", "armv7-a", "cortex-a9"}, +{NULL, NULL, NULL} +}; + +/* This will be called by the spec parser in gcc.c when it sees + a %:local_cpu_detect(args) construct. 
Currently it will be called + with either "arch", "cpu" or "tune" as argument depending on if + -march=native, -mcpu=native or -mtune=native is to be substituted. + + It returns a string containing new command line parameters to be + put at the place of the above two options, depending on what CPU + this is executed. E.g. "-march=armv7-a" on a Cortex-A8 for + -march=native. If the routine can't detect a known processor, + the -march or -mtune option is discarded. + + ARGC and ARGV are set depending on the actual arguments given + in the spec. */ +const char * +host_detect_local_cpu (int argc, const char **argv) +{ + const char *val = NULL; + char buf[128]; + FILE *f; + bool arch; + + if (argc < 1) +return NULL; + + arch = strcmp (argv[0], "arch") == 0; + if (!arch && strcmp (argv[0], "cpu") != 0 && strcmp (argv[0], "tune")) +return NULL; + + f = fopen ("/proc/cpuinfo", "r"); + if (f == NULL) +return NULL; + + while (fgets (buf, sizeof (buf), f) != NULL) +if (strncmp (buf, "CPU part", sizeof ("CPU part") - 1) == 0) + { + int i; + for (i = 0; cpu_table[i].part_no != NULL; i++) + if (strstr (buf, cpu_table[i].part_no) != NULL) + { + val = arch ? cpu_table[i].arch_name : cpu_table[i].cpu_name; + break; + } + break; + } + + fclose (f); + + if (val == NULL) +return NULL; + + return concat ("-m", argv[0], "=", val, NULL); +} --- /dev/null +++ b/gcc/config/arm/x-arm @@ -0,0 +1,3 @@ +driver-arm.o: $(srcdir)/config/arm/driver-arm.c \ + $(CONFIG_H) $(SYSTEM_H) + $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) $< --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -10318,6 +10318,11 @@ assembly code. Permissible names are: @samp{arm2}, @samp{arm250}, @samp{fa526}, @samp{fa626}, @samp{fa606te}, @samp{fa626te}, @samp{fmp626}, @samp{fa726te}. +@option{-mcpu=native} causes the compiler to auto-detect the CPU +of the build computer. At present, this feature is only supported on +Linux, and not all architectures are recognised. 
If the auto-detect is +unsuccessful the option has no effect. + @item -mtune=@var{name} @opindex mtune This option is very similar to the @option{-mcpu=} option, except that @@ -10329,6 +10334,11 @@ will generate based on the CPU specified by a @option{-mcpu=} option. For some ARM implementations better performance can be obtained by using this option. +@option{-mtune=native} causes the compiler to auto-detect the CPU +of the build computer. At present, this feature is only supported on +Linux, and not all architectures are recognised. If the auto-detect is +unsuccessful the option has no effect. + @item -march=@var{name} @opindex march This specifies the name of the target ARM architecture. GCC uses this @@ -10342,6 +10352,11 @@ of the @option{-mcpu=} option. Permissible names are: @samp{armv2}, @samp{armv7}, @samp{arm
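
The table lookup that host_detect_local_cpu performs on each line of /proc/cpuinfo can be tried out in isolation. The following is only a minimal sketch, not the code from the patch: the function name lookup_cpu_line and its want_arch parameter are made up for illustration, and it works on a string passed in by the caller rather than on a file, so it can be exercised on any host. The two table entries are the ones shown in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Same layout as the table in the patch: the "CPU part" value printed in
   /proc/cpuinfo, and the names to use for -march and for -mcpu/-mtune.  */
struct cpu_entry
{
  const char *part_no;
  const char *arch_name;
  const char *cpu_name;
};

static const struct cpu_entry cpu_table[] = {
  {"0xc08", "armv7-a", "cortex-a8"},
  {"0xc09", "armv7-a", "cortex-a9"},
  {NULL, NULL, NULL}
};

/* Hypothetical helper (not part of the patch).  LINE is one line of
   /proc/cpuinfo text.  Return the architecture name if WANT_ARCH is
   nonzero, otherwise the CPU name.  Return NULL if LINE is not a
   "CPU part" line or names a part that is not in the table.  */
const char *
lookup_cpu_line (const char *line, int want_arch)
{
  size_t i;

  if (strncmp (line, "CPU part", sizeof ("CPU part") - 1) != 0)
    return NULL;

  for (i = 0; cpu_table[i].part_no != NULL; i++)
    if (strstr (line, cpu_table[i].part_no) != NULL)
      return want_arch ? cpu_table[i].arch_name : cpu_table[i].cpu_name;

  return NULL;
}
```

Called with a line such as "CPU part\t: 0xc09" and want_arch set to 0, the helper returns "cortex-a9"; an unknown part number gives NULL, which corresponds to the case where the patch drops the option.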
Re: [PATCH][ARM] Generic tuning
On 26/08/11 17:18, Joseph S. Myers wrote: Again, arm-tables.opt is generated - so the log entry should just be * config/arm/arm-tables.opt: Regenerate. and the file should be what you get from regeneration. Changelog entry updated. The file was already correct. OK? Andrew 2011-08-27 Andrew Stubbs gcc/ * config/arm/arm-cores.def (generic-armv7-a): New architecture. * config/arm/arm-tables.opt: Regenerate. * config/arm/arm-tune.md: Regenerate. * config/arm/arm.c (arm_file_start): Output .arch directive when user passes -mcpu=generic-*. (arm_issue_rate): Add genericv7a support. * config/arm/arm.h (EXTRA_SPECS): Add asm_cpu_spec. (ASM_CPU_SPEC): New define. * config/arm/elf.h (ASM_SPEC): Use %(asm_cpu_spec). * config/arm/semi.h (ASM_SPEC): Likewise. * doc/invoke.texi (ARM Options): Document -mcpu=generic-* and -mtune=generic-*. --- a/gcc/config/arm/arm-cores.def +++ b/gcc/config/arm/arm-cores.def @@ -124,6 +124,7 @@ ARM_CORE("mpcorenovfp", mpcorenovfp, 6K, FL_LDSCHED, 9e) ARM_CORE("mpcore", mpcore, 6K, FL_LDSCHED | FL_VFPV2, 9e) ARM_CORE("arm1156t2-s", arm1156t2s, 6T2, FL_LDSCHED, v6t2) ARM_CORE("arm1156t2f-s", arm1156t2fs, 6T2, FL_LDSCHED | FL_VFPV2, v6t2) +ARM_CORE("generic-armv7-a", genericv7a, 7A, FL_LDSCHED, cortex) ARM_CORE("cortex-a5", cortexa5, 7A, FL_LDSCHED, cortex_a5) ARM_CORE("cortex-a8", cortexa8, 7A, FL_LDSCHED, cortex) ARM_CORE("cortex-a9", cortexa9, 7A, FL_LDSCHED, cortex_a9) @@ -135,3 +136,4 @@ ARM_CORE("cortex-m4", cortexm4, 7EM, FL_LDSCHED, cortex) ARM_CORE("cortex-m3", cortexm3, 7M, FL_LDSCHED, cortex) ARM_CORE("cortex-m1", cortexm1, 6M, FL_LDSCHED, cortex) ARM_CORE("cortex-m0", cortexm0, 6M, FL_LDSCHED, cortex) + --- a/gcc/config/arm/arm-tables.opt +++ b/gcc/config/arm/arm-tables.opt @@ -232,6 +232,9 @@ EnumValue Enum(processor_type) String(arm1156t2f-s) Value(arm1156t2fs) EnumValue +Enum(processor_type) String(generic-armv7-a) Value(genericv7a) + +EnumValue Enum(processor_type) String(cortex-a5) Value(cortexa5) EnumValue --- 
a/gcc/config/arm/arm-tune.md +++ b/gcc/config/arm/arm-tune.md @@ -1,5 +1,5 @@ ;; -*- buffer-read-only: t -*- ;; Generated automatically by gentune.sh from arm-cores.def (define_attr "tune" - "arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,cortexa5,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexm4,cortexm3,cortexm1,cortexm0" + "arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7d,arm7di,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7m,arm7dm,arm7dmi,arm8,arm810,strongarm,strongarm110,strongarm1100,strongarm1110,fa526,fa626,arm7tdmi,arm7tdmis,arm710t,arm720t,arm740t,arm9,arm9tdmi,arm920,arm920t,arm922t,arm940t,ep9312,arm10tdmi,arm1020t,arm9e,arm946es,arm966es,arm968es,arm10e,arm1020e,arm1022e,xscale,iwmmxt,iwmmxt2,fa606te,fa626te,fmp626,fa726te,arm926ejs,arm1026ejs,arm1136js,arm1136jfs,arm1176jzs,arm1176jzfs,mpcorenovfp,mpcore,arm1156t2s,arm1156t2fs,genericv7a,cortexa5,cortexa8,cortexa9,cortexa15,cortexr4,cortexr4f,cortexr5,cortexm4,cortexm3,cortexm1,cortexm0" (const (symbol_ref "((enum attr_tune) arm_tune)"))) --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -22195,6 +22195,8 @@ arm_file_start (void) const char *fpu_name; if (arm_selected_arch) asm_fprintf (asm_out_file, "\t.arch %s\n", arm_selected_arch->name); + else if (strncmp (arm_selected_cpu->name, "generic", 7) == 0) + asm_fprintf (asm_out_file, "\t.arch %s\n", arm_selected_cpu->name + 8); else asm_fprintf (asm_out_file, "\t.cpu %s\n", 
arm_selected_cpu->name); @@ -23719,6 +23721,7 @@ arm_issue_rate (void) case cortexr4: case cortexr4f: case cortexr5: +case genericv7a: case cortexa5: case cortexa8: case cortexa9: --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -189,6 +189,7 @@ extern void (*arm_lang_output_object_attributes_hook)(void); Do not define this macro if it does not need to do anything. */ #define EXTRA_SPECS \ { "subtarget_cpp_spec", SUBTARGET_CPP_SPEC }, \ + { "asm_cpu_spec", ASM_CPU_SPEC }, \ SUBTARGET_EXTRA_SPECS #ifndef SUBTARGET_EXTRA_SPECS @@ -2240,4 +2241,8 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); #define DRIVER_SELF_SPECS MCPU_MTUNE_NATIVE_SPECS +#define ASM_CPU_SPEC \ + " %{mcpu=generic-*:-march=%*;"\ + " :%{mcpu=*:-mcpu=%*} %{march=*:-march=%*}}" + #
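
The arm_file_start hunk above turns a generic-* CPU name into an .arch directive by skipping the "generic-" prefix. A minimal sketch of that choice follows. It is an illustration only: format_cpu_directive is a hypothetical name, it writes into a caller-supplied buffer instead of calling asm_fprintf, it leaves out the arm_selected_arch case, and it compares against the full 8-character prefix "generic-" where the patch compares 7 characters and then skips 8.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the patch).  Format into BUF, of SIZE
   bytes, the assembler directive for CPU_NAME: ".arch" for names that
   start with "generic-", ".cpu" for everything else.  Return the value
   returned by snprintf.  */
int
format_cpu_directive (char *buf, size_t size, const char *cpu_name)
{
  static const char prefix[] = "generic-";
  const size_t prefix_len = sizeof (prefix) - 1;

  /* "generic-armv7-a" names an architecture, so emit what follows the
     prefix as the argument of .arch.  */
  if (strncmp (cpu_name, prefix, prefix_len) == 0)
    return snprintf (buf, size, "\t.arch %s\n", cpu_name + prefix_len);

  return snprintf (buf, size, "\t.cpu %s\n", cpu_name);
}
```

With "generic-armv7-a" this produces the line "\t.arch armv7-a\n"; a real core name such as "cortex-a9" still produces a .cpu directive.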
Re: [SPARC] Fix bugs with setjmp/longjmp + alloca
> I guess when using setjmp/longjmp for exceptions the requirements
> increase above and beyond what is normally sufficient, and that's why
> you have to update the buffer?

Absolutely, you needed to update the buffer in the EH case, e.g. in Ada.
VLAs aren't first-class citizens so this wasn't really visible in C++, but
the issue was present too.

--
Eric Botcazou
Re: [PATCH 4/6] Shrink-wrapping
On Wed, Aug 24, 2011 at 9:46 AM, Bernd Schmidt wrote: > On 08/03/11 17:38, Richard Sandiford wrote: >> Bernd Schmidt writes: >>> +@findex simple_return >>> +@item (simple_return) >>> +Like @code{(return)}, but truly represents only a function return, while >>> +@code{(return)} may represent an insn that also performs other functions >>> +of the function epilogue. Like @code{(return)}, this may also occur in >>> +conditional jumps. >> >> Sorry, I've forgotton the outcome of the discussion about what happens >> on targets whose return expands to the same code as their simple_return. >> Do the targets still need both "return" and "simple_return" rtxes? > > It's important to distinguish between these names as rtxes that can > occur in instruction patterns, and a use as a standard pattern name. > When a "return" pattern is generated, it should either fail or expand to > something that performs both the epilogue and the return. A > "simple_return" expands to something that performs only the return. > > Most targets allow "return" patterns only if the epilogue is empty. In > that case, "return" and "simple_return" can expand to the same insn; it > does not matter if that insn uses "simple_return" or "return", as they > are equivalent in the absence of an epilogue. It would be slightly nicer > to use "simple_return" in the patterns everywhere except ARM, but ports > don't need to be changed. > >> Do they need both md patterns (but potentially using the same rtx >> underneath)? > > The "return" standard pattern is needed for the existing optimizations > (inserting returns in-line rather than jumping to the end of the > function). Typically, it always fails if the function needs an epilogue, > except in the ARM case. > For shrink-wrapping to work, a port needs a "simple_return" pattern, > which the compiler can use even if parts of the function need an > epilogue. So yes, they have different purposes. 
> >> I ask because the rtl.def comment implies that those targets still >> need both expanders and both rtxes. If that's so, I think it needs >> to be mentioned here too. E.g. something like: >> >> Like @code{(return)}, but truly represents only a function return, while >> @code{(return)} may represent an insn that also performs other functions >> of the function epilogue. @code{(return)} only occurs on paths that >> pass through the function prologue, while @code{(simple_return)} >> only occurs on paths that do not pass through the prologue. > > This is not accurate for the rtx code. It is mostly accurate for the > standard pattern name. A simple_return rtx may occur just after an > epilogue, i.e. on a path that has passed through the prologue. > > Even for the simple_return pattern, I'm not sure reorg.c couldn't > introduce new expansions in a location after both prologue and epilogue. > >> Like @code{(return)}, @code{(simple_return)} may also occur in >> conditional jumps. >> >> You need to document the simple_return pattern in md.texi too. > > I was trying to update the documentation to only the current state after > the patch. The thinking was that without shrink-wrapping, nothing > generates this pattern, so documenting it would be misleading. > However, with the mips changes in this version of the patch, reorg.c > does make use of this pattern, so I've added documentation > >>> @@ -3498,6 +3506,8 @@ relax_delay_slots (rtx first) >>> continue; >>> >>> target_label = JUMP_LABEL (delay_insn); >>> + if (target_label && ANY_RETURN_P (target_label)) >>> + continue; >>> >>> if (!ANY_RETURN_P (target_label)) >>> { >> >> This doesn't look like a pure "handle return as well as simple return" >> change. Is the idea that every following test only makes sense for >> labels, and that things like: >> >> && prev_active_insn (target_label) == insn >> >> (to pick just one example) are actively dangerous for returns? > > That probably was the idea. 
Looking at it again, there's one case at the > bottom of the loop which may be safe, but given that there were no code > generation differences with the patch on three targets with > define_delay, I've done: > >> If so, I think you should remove the immediately-following. >> "if (!ANY_RETURN_P (target_label))" condition and reindent the body. > > this. > >> Given what you said about JUMP_LABEL sometimes being null, >> I think we need either (a) to check whether each *_return_label >> is null before comparing it with JUMP_LABEL, or (b) to ensure that >> we're dealing with a jump to a label. (b) seems neater IMO >> (as a call to jump_to_label_p). > > Done. > >> >>> +#if defined HAVE_return || defined HAVE_simple_return >>> + if ( >>> #ifdef HAVE_return >>> - if (HAVE_return && end_of_function_label != 0) >>> + (HAVE_return && function_return_label != 0) >>> +#else >>> + 0 >>> +#endif >>> +#ifdef HAVE_simple_return >>> + || (HAVE_simple_return && function_simple_return_label != 0) >>> +#endif >>> + ) >>>
Re: [libcpp,lto,fortran PATCH] Fix linemap_add use and remove unnecessary kludge
Hello Richard,

Richard Guenther writes:

> In the LTO FE the two linemap_add calls were to advance the location
> counter to cover the builtin special locations.  You exchange these
> with only one - that doesn't look correct without more explanation.

It seems to me that you don't need to worry about advancing the location
counter to cover builtin special locations, because the lowest possible
location that could be handed out by the line map is set to
RESERVED_LOCATION_COUNT.

You can see that by looking at linemap_init, which sets the initial highest
location to RESERVED_LOCATION_COUNT - 1, and at linemap_add, which sets the
starting location of the added map to the highest location + 1. And
RESERVED_LOCATION_COUNT is set to 2 in line-map.h.

> The PCH change is ok.

Thank you.

--
Dodji
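
The arithmetic in the argument above can be checked with a toy model. This is not libcpp code: the struct and the toy_ functions are invented for illustration and keep only the one field the argument relies on. The assumption, taken from the description in the message, is that the init routine sets the highest location to RESERVED_LOCATION_COUNT - 1 and that each added map starts at the highest location + 1.

```c
/* Value quoted in the message for line-map.h.  */
#define RESERVED_LOCATION_COUNT 2

/* Toy stand-in for the line map set: only the highest location handed
   out so far is tracked.  */
struct toy_line_maps
{
  unsigned int highest_location;
};

/* Models the initialisation step: nothing has been handed out yet, so
   the highest location sits just below the first usable one.  */
void
toy_linemap_init (struct toy_line_maps *set)
{
  set->highest_location = RESERVED_LOCATION_COUNT - 1;
}

/* Models adding a map: the new map starts one past the highest location
   seen so far.  Return that start location.  */
unsigned int
toy_linemap_add (struct toy_line_maps *set)
{
  unsigned int start = set->highest_location + 1;

  set->highest_location = start;
  return start;
}
```

In this model the first map added after initialisation starts at location 2, so locations 0 and 1 are never handed out, without any extra call to advance the counter.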