Re: [PATCH] toplev.c: Process the failure when read fails for random_seed
On 1/20/15 10:11, Joseph Myers wrote:
> On Mon, 19 Jan 2015, Chen Gang S wrote:
>
>> On 12/31/2014 06:26 AM, Joseph Myers wrote:
>>> On Mon, 29 Dec 2014, Chen Gang S wrote:
>>> And in honest, this year what I have done is really not quite well, next year I should be improved: should scanning Bugzilla and try to fix the existing issues (just like another members' suggestions to me).
>>>
>>> Note that for any substantial patches you'll need to complete the
>>> copyright assignment paperwork (I don't see you listed in copyright.list
>>> at present).
>>>
>>
>> Excuse me, I am not quite familiar with the related working flow, at
>> present, I finished assignment paperwork for binutils and gdb, and I
>> am one of write after approval member for binutils and gdb.
>>
>> Do you mean I need follow the same working flow for gcc, just as for the
>> binutils and gdb? (or only post my assignment is OK?).
>
> You need to complete the same assignment form. Start with
>
> http://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-assign.future
>
> but this time name GCC as the package.
>

OK, thanks, I will follow the related working flow. And I shall try to post my assignment paper within this month.

Thanks.
--
Chen Gang

Open, share, and attitude like air, water, and life which God blessed
Re: [AArch64] Add a new scheduling description for the ARM Cortex-A57 processor
On 19/01/15 21:05, James Greenhalgh wrote: On Mon, Jan 19, 2015 at 08:57:31PM +, Gerald Pfeifer wrote: On Monday 2015-01-19 17:52, James Greenhalgh wrote: OK after the Cortex-A57 scheduling description goes in to the ARM port? Yes, thanks, except that once will be sufficient. ;-) (The current patch features the same hunk twice?) Once under AArch64 and once under ARM. I'm happy to drop one or the other hunk. Neither is incorrect, but I agree it is odd to say the same thing twice. Ramana, Marcus, Richard, any opinions on how you would like this resolved? Perhaps an ARM/AArch64 common changes section? Though I'm not sure which other changes would go in to it. I'm not sure if a "common" section improves readability. I'd rather this remained as it is today. My 10 paise. Ramana Cheers, James
Re: [PATCH] add includes in config/tilepro/mul-tables.c and config/tilegx/mul-tables.c
On Tue, 20 Jan 2015, Prathamesh Kulkarni wrote:

> Hi,
> When I committed r219655, I didn't check in tilepro/mul-tables.c since it
> was auto-generated from config/tilepro/gen-mul-tables.cc.
> Andrew pointed out to me that this causes the files
> config/tilepro/mul-tables.c and config/tilegx/mul-tables.c to show up
> in svn diff. The attached patch fixes that by
> putting includes in config/tilepro/mul-tables.c in the order they are
> present in
> config/tilepro/gen-mul-tables.cc.
>
> The patch won't affect the build since they are generated from
> gen-mul-tables.cc, the sole point of this patch is to avoid noise in
> svn diff, so could this be
> considered similar to a doc-fix ?
> OK to commit ?

Ok.

Thanks,
Richard.
Re: [PATCH 7/8] Model cache auto-prefetcher in scheduler
On 19/01/15 18:14, Maxim Kuvyrkov wrote: On Jan 19, 2015, at 6:05 PM, Richard Earnshaw wrote: On 16/01/15 15:06, Maxim Kuvyrkov wrote: @@ -1874,7 +1889,8 @@ const struct tune_params arm_cortex_a15_tune = true, true, /* Prefer 32-bit encodings. */ true,/* Prefer Neon for stringops. */ 8, /* Maximum insns to inline memset. */ - ARM_FUSE_NOTHING /* Fuseable pairs of instructions. */ + ARM_FUSE_NOTHING,/* Fuseable pairs of instructions. */ + max_insn_queue_index + 1 /* Sched L2 autopref depth. */ }; Hmm, two issues here: 1) This requires a static constructor for the tuning table entry (since the value of max_insn_queue_index has to be looked up at run time. Are you sure? I didn't check the object files, but, since max_insn_queue_index is a "const int", I would expect a relocation that would be resolved at link time, not a constructor. I did miss this during the original review, sorry about that. In theory yes, but in practice apparently not. Probably something LTO might be able to handle, but in the absence of defaults to LTO I don't see how we can guarantee something like this. I do see a static constructor being put in for this on a cross compiler build. 
00d79338 <_Z41__static_initialization_and_destruction_0ii>:
  d79338: 55                      push   %rbp
  d79339: 48 89 e5                mov    %rsp,%rbp
  d7933c: 89 7d fc                mov    %edi,-0x4(%rbp)
  d7933f: 89 75 f8                mov    %esi,-0x8(%rbp)
  d79342: 83 7d fc 01             cmpl   $0x1,-0x4(%rbp)
  d79346: 75 27                   jne    d7936f <_Z41__static_initialization_and_destruction_0ii+0x37>
  d79348: 81 7d f8 ff ff 00 00    cmpl   $0xffff,-0x8(%rbp)
  d7934f: 75 1e                   jne    d7936f <_Z41__static_initialization_and_destruction_0ii+0x37>
  d79351: 8b 05 f1 43 48 00       mov    0x4843f1(%rip),%eax        # 11fd748
  d79357: 83 c0 01                add    $0x1,%eax
  d7935a: 89 05 b4 27 a9 00       mov    %eax,0xa927b4(%rip)        # 180bb14 <_ZL19arm_cortex_a15_tune+0x54>
  d79360: 8b 05 e2 43 48 00       mov    0x4843e2(%rip),%eax        # 11fd748
  d79366: 83 c0 01                add    $0x1,%eax
  d79369: 89 05 05 28 a9 00       mov    %eax,0xa92805(%rip)        # 180bb74 <_ZL19arm_cortex_a57_tune+0x54>
  d7936f: 5d                      pop    %rbp
  d79370: c3                      retq

Is it a problem to have a static constructor for the tables?

2) Doesn't this mean that the depth of searching will depend on properties of the automata rather than some machine specific values (so that potentially adding or removing unrelated scheduler rules could change the behaviour of the compiler)?

No. The extra queue entries that will appear from extending an unrelated automaton will be empty, so the search will check them, but won't find anything.

In general, how should someone tuning the compiler for this parameter select a value that isn't one of (-1, m_i_q_d+1)?

From my experiments it seems there are 4 reasonable values for the parameter: (-1) autopref turned off, (0) turned on in rank_for_schedule, (m_i_q_d+1) turned on everywhere. If there is a static constructor generated for tune tables and it is a problem to have it -- I can shrink acceptable values to these 3 and call it a day.

There doesn't seem to be anything documented in the coding conventions for this specifically. In the absence of any such documentation, I'd rather not grow static constructors for something like this.
Can we just shrink the acceptable values to 3 and just call it a day here ? regards Ramana -- Maxim Kuvyrkov www.linaro.org
Re: [PATCH][ARM] PR 64149: Remove -mlra/-mno-lra option for ARM.
On Tue, Jan 13, 2015 at 1:32 PM, Matthew Wahab wrote:
> Hello,
>
> The LRA register allocator is enabled by default for the ARM backend and
> -mno-lra should no longer be used. This patch removes the -mlra/-mno-lra
> option from the ARM backend.
>
> arm-none-linux-gnueabihf passes gcc-check with no new failures.
>
> Matthew
>
> 2015-01-13 Matthew Wahab
>
> PR target/64149
> * config/arm/arm.opt: Remove lra option and arm_lra_flag variable.
> * config/arm/arm.h (MODE_BASE_REG_CLASS): Remove use of
> arm_lra_flag,
> replace the conditional with its true branch.
> * config/arm/arm.c (TARGET_LRA_P): Set to hook_bool_void_true.
> (arm_lra_p): Remove.
> * testsuite/gcc.target/arm/thumb1-far-jump-3.c: Remove.

This is OK - Usually we don't remove command line options but given this was always a transitional option, I'd rather folks with broken packages with LRA actually removed the -mno-lra option from their packages and reported issues that we could fix.

An entry for changes.html please for both ARM and AArch64 backends. Also please work on removing some of the dead reload code now from the ARM backend too :) .

Ramana
Re: Compare-elim pass (was: Re: [PATCH] Fix PR 61225)
> > It would be nice to only have to write the set+set version, and do > > some markup to say which of the clobber variants should be generated, > > yes. > > define_subst should be able to do that. The Visium port uses that (but the other way around). -- Eric Botcazou
Re: [PATCH] Fix PR64535 - increase emergency EH buffers via a new allocator
On Mon, 19 Jan 2015, Jonathan Wakely wrote:

> On 19/01/15 11:33 +0100, Richard Biener wrote:
> > On Mon, 12 Jan 2015, Richard Biener wrote:
> >
> > >
> > > This "fixes" PR64535 by changing the fixed object size emergency pool
> > > to a variable EH object size (but fixed arena size) allocator. Via
> > > combining the dependent and non-dependent EH arenas this should allow
> > > around 600 bad_alloc throws in OOM situations on x86_64-linux
> > > compared to the current 64 which should provide some headroom to
> > > the poor souls using EH to communicate OOM in a heavily threaded
> > > environment.
> > >
> > > Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 1
> > > as in the patch below, forcing the use of the allocator).
>
> I see only the #else part is kept now - that was what I was going to
> suggest.
>
> > Unfortunately I couldn't get an answer of whether throwing
> > bad_alloc from a throw where we can't allocate an exception
> > is a) valid or b) even required by the standard ('throw' isn't
> > listed as 'allocation function' - also our EH allocators are
> > marked as throw(), so such change would change the ABI...).
>
> I'll ask the C++ committee.

Thanks. Only needing to deal with std::bad_alloc allocations from the pool would greatly simplify it (I'd do a fixed-object-size one then).

> > > With the cost of some more members I can make the allocator more
> > > generic (use a constructor with an arena and an arena size parameter)
> > > and we may move it somewhere public under __gnu_cxx? But eventually
> > > boost has something like this anyway.
> >
> > Didn't explore this - it doesn't really match the STL allocator interface
> > and imposes overhead even over an implementation class (STL allocators
> > know the size of the objects they free).
>
> Yeah, I don't think there's any advantage complicating this type to
> make it usable as an STL allocator - it does what it's designed to do
> and that's fine.
> > > I'm re-bootstrapping / testing with the cosmetic changes I did and > > with EH allocation not forced to go through the emergency pool > > (I've done that in previous bootstraps / tests to get the pool code > > exercised). > > > Any comments? We have a customer that runs into the issue that 64 > > bad_alloc exceptions are not enough for them (yes, they require bad_alloc > > to work when thrown in a massive quantity from threads). Other > > solutions for this would include to simply wait and re-try (with possibly > > deadlocking if no progress is made) to artificially throttling > > bad_alloc allocations from the EH emergency pool (identify it by > > size, sleep some time inside the lock). > > > > CCing rth who implemented the existing code. > > I don't have any better ideas for fixing the issue, so it's approved > by me. Unless rth comes back with something else please go ahead and > check it in. Testing revealed g++.old-deja/g++.eh/badalloc1.C which has an interesting way of limiting allocation (providing its own malloc). I had to bump its arena size to make the upfront allocation in libstdc++ pass. I also added code to deal with malloc failing there (not sure if it's worth failing in a way to still setup a minimal-size emergency arena). I also replaced all size_t with std::size_t for consistency. Thanks, Richard. 2015-01-12 Richard Biener PR libstdc++/64535 * libsupc++/eh_alloc.cc: Include new. (bitmask_type): Remove. (one_buffer): Likewise. (emergency_buffer): Likewise. (emergency_used): Likewise. (dependents_buffer): Likewise. (dependents_used): Likewise. (class pool): New custom fixed-size arena, variable size object allocator. (emergency_pool): New global. (__cxxabiv1::__cxa_allocate_exception): Use new emergency_pool. (__cxxabiv1::__cxa_free_exception): Likewise. (__cxxabiv1::__cxa_allocate_dependent_exception): Likewise. (__cxxabiv1::__cxa_free_dependent_exception): Likewise. * g++.old-deja/g++.eh/badalloc1.C: Adjust. 
Index: libstdc++-v3/libsupc++/eh_alloc.cc === *** libstdc++-v3/libsupc++/eh_alloc.cc.orig 2015-01-19 13:32:47.559305641 +0100 --- libstdc++-v3/libsupc++/eh_alloc.cc 2015-01-20 10:03:17.255749415 +0100 *** *** 34,39 --- 34,40 #include #include "unwind-cxx.h" #include + #include #if _GLIBCXX_HOSTED using std::free; *** using namespace __cxxabiv1; *** 72,133 # define EMERGENCY_OBJ_COUNT 4 #endif - #if INT_MAX == 32767 || EMERGENCY_OBJ_COUNT <= 32 - typedef unsigned int bitmask_type; - #else - #if defined (_GLIBCXX_LLP64) - typedef unsigned long long bitmask_type; - #else - typedef unsigned long bitmask_type; - #endif - #endif ! typedef char one_buffer[EMERGENCY_OBJ_SIZE] __attribute__((aligned)); ! static one_buffer emergency_buffer[EMERGENCY_OBJ_COUNT]; ! static bitmask_
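Editorial sketch, for readers who want a concrete picture of what "fixed-size arena, variable size object allocator" means in the ChangeLog above. This is not the code from the patch: the names `arena_pool`, `free_entry`, `kHeader` and `kMinBlock` are made up for illustration, the strategy (first fit over an address-ordered free list, with coalescing on free) is an assumption about one reasonable way to do it, and the locking the real emergency pool needs is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical sketch only; names and layout are not taken from eh_alloc.cc.
namespace sketch {

constexpr std::size_t kAlign = alignof(std::max_align_t);

constexpr std::size_t round_up(std::size_t n) {
  return (n + kAlign - 1) / kAlign * kAlign;
}

// A free block. The size field doubles as the header of an allocated block.
struct free_entry {
  std::size_t size;  // size of the whole block, header included
  free_entry *next;  // next free block, kept in address order
};

constexpr std::size_t kHeader = round_up(sizeof(std::size_t));
constexpr std::size_t kMinBlock = round_up(sizeof(free_entry));

class arena_pool {
 public:
  // The arena is obtained once, up front; it never grows afterwards.
  explicit arena_pool(std::size_t arena_size) {
    const std::size_t size = arena_size / kAlign * kAlign;
    if (size < kMinBlock) return;
    arena_ = static_cast<char *>(std::malloc(size));
    if (!arena_) return;  // malloc failed: the pool simply stays empty
    free_list_ = reinterpret_cast<free_entry *>(arena_);
    free_list_->size = size;
    free_list_->next = nullptr;
  }
  ~arena_pool() { std::free(arena_); }
  arena_pool(const arena_pool &) = delete;
  arena_pool &operator=(const arena_pool &) = delete;

  // First fit: take the first free block that is large enough.
  void *allocate(std::size_t n) {
    if (n > SIZE_MAX - kHeader - kAlign) return nullptr;
    std::size_t total = round_up(n + kHeader);
    if (total < kMinBlock) total = kMinBlock;
    for (free_entry **link = &free_list_; *link; link = &(*link)->next) {
      free_entry *e = *link;
      if (e->size < total) continue;
      if (e->size - total >= kMinBlock) {
        // Split: the tail of the block stays on the free list.
        auto *rest = reinterpret_cast<free_entry *>(
            reinterpret_cast<char *>(e) + total);
        rest->size = e->size - total;
        rest->next = e->next;
        *link = rest;
        e->size = total;
      } else {
        *link = e->next;  // hand out the whole block
      }
      return reinterpret_cast<char *>(e) + kHeader;
    }
    return nullptr;  // arena exhausted
  }

  // Put the block back in address order and merge with its neighbours.
  void deallocate(void *p) {
    if (!p) return;
    auto *e = reinterpret_cast<free_entry *>(static_cast<char *>(p) - kHeader);
    free_entry *prev = nullptr;
    free_entry **link = &free_list_;
    while (*link && *link < e) {
      prev = *link;
      link = &(*link)->next;
    }
    e->next = *link;
    *link = e;
    if (e->next &&
        reinterpret_cast<char *>(e) + e->size ==
            reinterpret_cast<char *>(e->next)) {
      e->size += e->next->size;
      e->next = e->next->next;
    }
    if (prev &&
        reinterpret_cast<char *>(prev) + prev->size ==
            reinterpret_cast<char *>(e)) {
      prev->size += e->size;
      prev->next = e->next;
    }
  }

 private:
  char *arena_ = nullptr;
  free_entry *free_list_ = nullptr;
};

}  // namespace sketch
```

Because each block records its own size in a small header, `deallocate` does not need to be told the object size, which is the property the thread contrasts with STL allocators. Whether the real pool splits and merges blocks this way is not something this sketch claims.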
Re: Housekeeping work in backends.html
> For the moxie, nvptx, rl78 and rx ports, maintainers can send me the string
> as Sandra did for the nios2 port and I'll update the document.

DJ kindly sent them for both ports so I have installed them.

--
Eric Botcazou

Index: backends.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/backends.html,v
retrieving revision 1.62
diff -u -r1.62 backends.html
--- backends.html 16 Jan 2015 17:07:57 - 1.62
+++ backends.html 20 Jan 2015 09:15:47 -
@@ -100,7 +100,9 @@
 nvptx | S Q Cqg be
 pa | ? Q CBD qrm i e
 pdp11 |L ICqrc d e
+rl78 |L F l g bds
 rs6000 | Q Cqr p i
+rx | b s
 s390 | ? Qqr g b ia e
 sh | Q CB qr p b i
 sparc | Q CB qr i
Re: [PATCH] pr 64076 - tolerate different definitions of symbols in lto
On Tue, Jan 20, 2015 at 3:31 AM, wrote:
> From: Trevor Saunders
>
> Hi,
>
> when doing an lto link we can have some symbols be ir only and others be
> machine code, which trips the assert here. Just adjust the assert to handle
> that.
>
> bootstrapped + regtested x86_64-linux-gnu, ok?

Testcase? It's hard to understand what "machine code" is otherwise or why this assert would fail.

Thanks,
Richard.

> Trev
>
> gcc/
>
> * ipa-visibility.c (update_visibility_by_resolution_info): Only
> assert when not in lto mode.
> ---
> gcc/ipa-visibility.c | 18 +-
> 1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c
> index 71894af..0791a1c 100644
> --- a/gcc/ipa-visibility.c
> +++ b/gcc/ipa-visibility.c
> @@ -425,11 +425,19 @@ update_visibility_by_resolution_info (symtab_node * node)
>    if (node->same_comdat_group)
>      for (symtab_node *next = node->same_comdat_group;
>           next != node; next = next->same_comdat_group)
> -      gcc_assert (!next->externally_visible
> -                  || define == (next->resolution == LDPR_PREVAILING_DEF_IRONLY
> -                                || next->resolution == LDPR_PREVAILING_DEF
> -                                || next->resolution == LDPR_UNDEF
> -                                || next->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP));
> +      {
> +        if (!next->externally_visible)
> +          continue;
> +
> +        bool same_def
> +          = define == (next->resolution == LDPR_PREVAILING_DEF_IRONLY
> +                       || next->resolution == LDPR_PREVAILING_DEF
> +                       || next->resolution == LDPR_UNDEF
> +                       || next->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP);
> +        gcc_assert (in_lto_p || same_def);
> +        if (!same_def)
> +          return;
> +      }
>
>    if (node->same_comdat_group)
>      for (symtab_node *next = node->same_comdat_group;
> --
> 2.1.4
>
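Editorial sketch of the control flow the quoted patch introduces, reduced to a stand-alone model. Everything here is hypothetical scaffolding: `node_sketch`, `resolution_sketch` and `comdat_group_agrees` are not GCC names, the enumerators only mimic a few of the linker-plugin resolutions, and the way `define` is derived from the node's own resolution is an assumption, since that part of the function is not shown in the quoted hunk.

```cpp
#include <cassert>

// Hypothetical stand-ins for a few linker-plugin resolution kinds.
enum resolution_sketch {
  RES_UNDEF,
  RES_PREVAILING_DEF,
  RES_PREVAILING_DEF_IRONLY,
  RES_PREVAILING_DEF_IRONLY_EXP,
  RES_PREEMPTED_REG
};

struct node_sketch {
  bool externally_visible;
  resolution_sketch resolution;
  node_sketch *same_comdat_group;  // circular list, or nullptr if no group
};

// The four resolutions the patch compares against.
static bool in_checked_set(resolution_sketch r) {
  return r == RES_PREVAILING_DEF_IRONLY || r == RES_PREVAILING_DEF ||
         r == RES_UNDEF || r == RES_PREVAILING_DEF_IRONLY_EXP;
}

// Returns false where the patched function would return early.
bool comdat_group_agrees(const node_sketch *node, bool in_lto_p) {
  // Assumption: 'define' is computed from the node's own resolution.
  const bool define = in_checked_set(node->resolution);
  if (!node->same_comdat_group) return true;
  for (const node_sketch *next = node->same_comdat_group; next != node;
       next = next->same_comdat_group) {
    if (!next->externally_visible) continue;
    const bool same_def = define == in_checked_set(next->resolution);
    // Outside LTO a disagreement is still treated as a bug.
    assert(in_lto_p || same_def);
    if (!same_def) return false;
  }
  return true;
}
```

In this model a two-node group where one member is `RES_PREVAILING_DEF_IRONLY` and the other `RES_PREEMPTED_REG` makes the function bail out under LTO instead of asserting, which is one possible reading of the "IR only versus machine code" situation the review asks a testcase for.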
Re: [AArch64] Add a new scheduling description for the ARM Cortex-A57 processor
On 19/01/15 21:05, James Greenhalgh wrote: On Mon, Jan 19, 2015 at 08:57:31PM +, Gerald Pfeifer wrote: On Monday 2015-01-19 17:52, James Greenhalgh wrote: OK after the Cortex-A57 scheduling description goes in to the ARM port? Yes, thanks, except that once will be sufficient. ;-) (The current patch features the same hunk twice?) Once under AArch64 and once under ARM. I'm happy to drop one or the other hunk. Neither is incorrect, but I agree it is odd to say the same thing twice. Ramana, Marcus, Richard, any opinions on how you would like this resolved? Perhaps an ARM/AArch64 common changes section? Though I'm not sure which other changes would go in to it. I'd prefer separate sections, IMHO that is more useful. /Marcus
Re: libgo patch committed: Update to Go 1.4
Ian Lance Taylor writes: > On Mon, Jan 19, 2015 at 4:17 AM, Rainer Orth > wrote: >> Ian Lance Taylor writes: >> >>> On Thu, Jan 15, 2015 at 8:30 AM, Rainer Orth >>> wrote: Apart from that, bootstrap fails in gotools: due to the use of -static-libgo, all commands there fail to link since the socket functions are missing. It seems like $LIBS from libgo needs to be added somewhere, but I'm unsure how best to handle this. To make any progress at all, I've just manually added -lsocket -lnsl to gotools/Makefile (AM_LDFLAGS). >>> >>> I also don't know what the best way is to handle this. For now I've >>> just added a configure test to check whether the libraries are needed. >>> Based on the libgo build, as far as I can tell, no other libraries >>> should be needed. >> >> While this is true for Solaris 11, Solaris 10 needs librt for nanosleep, >> sched_yield and the sem_* functions. The following patch copies the >> corresponding libgo test and allows gotools to build even on Solaris 10. > > This is OK to commit with a ChangeLog entry (the gotools directory is > not mirrored and lives only in the GCC repository). Done with the following: 2015-01-20 Rainer Orth * configure.ac: Check if sched_yield and/or nanosleep need -lrt. * configure: Regenerate. * Makefile.am (go$(EXEEXT), gofmt$(EXEEXT), cgo$(EXEEXT)): Link with $(LIBS). * Makefile.in: Regenerate. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[visium] Adjust LIB_SPEC
... to match the just submitted newlib port.

Applied on the mainline.

2015-01-20 Eric Botcazou

	* config/visium/visium.h (LIB_SPEC): Adjust in default case.

--
Eric Botcazou

Index: config/visium/visium.h
===
--- config/visium/visium.h (revision 219797)
+++ config/visium/visium.h (working copy)
@@ -29,10 +29,8 @@
 #define CPP_SPEC "%{mcpu=gr6:-D__gr6__; :-D__gr5__}"

 /* Targets of a link */
-#define LIB_SPEC "\
-%{msim : --start-group -lc -lsim --end-group ; \
- mdebug : --start-group -lc -ldebug --end-group ; \
- : -lc -lnosys }"
+#define LIB_SPEC \
+ "--start-group -lc %{msim:-lsim; mdebug:-ldebug; :-lserial} --end-group"

 #define ENDFILE_SPEC "crtend.o%s crtn.o%s"
 #define STARTFILE_SPEC "crti.o%s crtbegin.o%s crt0.o%s"
Re: [PATCH, i386] Remove EBX usage from asm code
Uros Bizjak writes:

> On Sat, Jan 17, 2015 at 7:36 PM, Rainer Orth
> wrote:
>> Uros Bizjak writes:
>>
>>> On Sat, Jan 17, 2015 at 4:18 PM, Rainer Orth
>>> wrote:
>>>
> The patch removes EBX usage from asm code used in libgcc/crtstuff.c
> It is safe now, but potentially buggy when glibc is rebuild with GCC
> 5.0 as EBX is not GOT register any more.
>
> x86 bootstrap, make check passed.
>
> Is it ok?
>
> Evgeny
>
> 2014-12-28 Evgeny Stupachenko
>
> * gnu-user.h (CRT_GET_RFIB_DATA): Remove EBX register usage.
> * config/i386/sysv4.h (CRT_GET_RFIB_DATA): Ditto.

this patch broke Solaris 10/x86 bootstrap: when building amd64 crtbegin.o, gas complains
>>>
>>> Looks like config.gcc error for Solaris x86, amd64 target should not
>>> include i386/gnu-user.h but i386/gnu-user64.h
>>
>> The target is i386-pc-solaris2.10, which includes i386/sysv4.h. Only
>> the amd64 crtbegin.o is affected, the i386 one is fine.
>
> Please split sysv4-common.h out of i386/sysv4.h, similar to how
> i386/gnu-user.h and gnu-user-common.h are split.

This would break Solaris/SPARC (there's no sparc/sysv4-common.h), and only works on Linux/x86 by accident:

* CRT_GET_RFIB_DATA is only defined in i386/gnu-user.h there, but that
  file is only included for 32-bit-only configurations, not
  32-bit-default ones (--enable-targets=all). In the latter case, the
  macro is not defined for the 32-bit multilib, where it should be.

* CRT_GET_RFIB_DATA is only used in libgcc/crtstuff.c and
  libgcc/unwind-dw2-fde-dip.c (which already has a workaround for it
  being incorrectly defined for 64-bit Solaris/x86).

IMO, the definition has no business living in gcc/config/i386 at all, but should move to libgcc/config instead (together with the one in frv/frv.h). That being probably too intrusive at this stage, I think the best workaround for now is to simply wrap the definition in i386/sysv4.h (which is Solaris/x86-only anyway) in __i386__.

Rainer

--
-
Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH, i386] Remove EBX usage from asm code
On Tue, Jan 20, 2015 at 10:42 AM, Rainer Orth wrote:

>>> The target is i386-pc-solaris2.10, which includes i386/sysv4.h. Only
>>> the amd64 crtbegin.o is affected, the i386 one is fine.
>>
>> Please split sysv4-common.h out of i386/sysv4.h, similar to how
>> i386/gnu-user.h and gnu-user-common.h are split.
>
> This would break Solaris/SPARC (there's no sparc/sysv4-common.h), and
> only works on Linux/x86 by accident:
>
> * CRT_GET_RFIB_DATA is only defined in i386/gnu-user.h there, but that
> file is only included for 32-bit-only configurations, not
> 32-bit-default ones (--enable-targets=all). In the latter case, the
> macro is not defined for the 32-bit multilib, where it should be.
>
> * CRT_GET_RFIB_DATA is only used in libgcc/crtstuff.c and
> libgcc/unwind-dw2-fde-dip.c (which already has a workaround for it
> being incorrectly defined for 64-bit Solaris/x86).
>
> IMO, the definition has no business living in gcc/config/i386 at all,
> but should move to libgcc/config instead (together with the one in
> frv/frv.h). That being probably too intrusive at this stage, I think
> the best workaround for now is to simply wrap the definition in
> i386/sysv4.h (which is Solaris/x86-only anyway) in __i386__.

Ugh...

Considering your explanation and the mess in the unwinder, IMO this should be fixed in the correct way even at this stage.

CC RMs for their opinion.

Thanks,
Uros.
[PATCH,wwwdocs] Add news entry for MIPS Release 6 - committed
I committed the following patch to wwwdocs having received approval from Gerald.

Thanks,
Matthew

Index: htdocs/index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.953
diff -r1.953 index.html
54a55,59
> MIPS Release 6 architecture support
> [2015-01-20]
> Support for MIPS Release 6 (r6) has been contributed by Imagination
> Technologies.
>
Re: [PATCH, i386] Remove EBX usage from asm code
On Tue, Jan 20, 2015 at 10:51:03AM +0100, Uros Bizjak wrote:
> On Tue, Jan 20, 2015 at 10:42 AM, Rainer Orth
> wrote:
>
> >>> The target is i386-pc-solaris2.10, which includes i386/sysv4.h. Only
> >>> the amd64 crtbegin.o is affected, the i386 one is fine.
> >>
> >> Please split sysv4-common.h out of i386/sysv4.h, similar to how
> >> i386/gnu-user.h and gnu-user-common.h are split.
> >
> > This would break Solaris/SPARC (there's no sparc/sysv4-common.h), and
> > only works on Linux/x86 by accident:
> >
> > * CRT_GET_RFIB_DATA is only defined in i386/gnu-user.h there, but that
> > file is only included for 32-bit-only configurations, not
> > 32-bit-default ones (--enable-targets=all). In the latter case, the
> > macro is not defined for the 32-bit multilib, where it should be.
> >
> > * CRT_GET_RFIB_DATA is only used in libgcc/crtstuff.c and
> > libgcc/unwind-dw2-fde-dip.c (which already has a workaround for it
> > being incorrectly defined for 64-bit Solaris/x86).
> >
> > IMO, the definition has no business living in gcc/config/i386 at all,
> > but should move to libgcc/config instead (together with the one in
> > frv/frv.h). That being probably too intrusive at this stage, I think
> > the best workaround for now is to simply wrap the definition in
> > i386/sysv4.h (which is Solaris/x86-only anyway) in __i386__.
>
> Ugh...
>
> Considering your explanation and the mess in the unwinder, IMO this
> should be fixed in the correct way even at this stage.
>
> CC RMs for their opinion.

Agreed.

Jakub
Re: [PATCH 7/8] Model cache auto-prefetcher in scheduler
On 19/01/15 18:14, Maxim Kuvyrkov wrote: > On Jan 19, 2015, at 6:05 PM, Richard Earnshaw wrote: > >> On 16/01/15 15:06, Maxim Kuvyrkov wrote: >>> @@ -1874,7 +1889,8 @@ const struct tune_params arm_cortex_a15_tune = >>> true, true, /* Prefer 32-bit encodings. >>> */ >>> true, /* Prefer Neon for >>> stringops. */ >>> 8,/* Maximum insns to >>> inline memset. */ >>> - ARM_FUSE_NOTHING /* Fuseable pairs of >>> instructions. */ >>> + ARM_FUSE_NOTHING,/* Fuseable pairs of >>> instructions. */ >>> + max_insn_queue_index + 1 /* Sched L2 autopref depth. */ >>> }; >> >> >> Hmm, two issues here: >> 1) This requires a static constructor for the tuning table entry (since >> the value of max_insn_queue_index has to be looked up at run time. > > Are you sure? I didn't check the object files, but, since > max_insn_queue_index is a "const int", I would expect a relocation that would > be resolved at link time, not a constructor. > Yes, I'm sure. Relocations can only resolve addresses of objects, not their contents. LTO might eliminate the need for the reloc, but otherwise the compiler will never see the definition and will need to create a static constructor. > Is it a problem to have a static constructor for the tables? Needing constructors means that the compiler can't put the object into read-only sections of the image. It's not a huge problem, but if there are ways by which they can be avoided, that's likely to be preferable; there's a small run-time overhead to running them. > >> >> 2) Doesn't this mean that the depth of searching will depend on >> properties of the automata rather than some machine specific values (so >> that potentially adding or removing unrelated scheduler rules could >> change the behaviour of the compiler)? > > No. The extra queue entries that will appear from extending an unrelated > automaton will be empty, so the search will check them, but won't find > anything. > OK, so there's just a minor performance cost of checking values that never hit. 
>> >> In general, how should someone tuning the compiler for this parameter >> select a value that isn't one of (-1, m_i_q_d+1)? > > From my experiments it seems there are 4 reasonable values for the parameter: > (-1) autopref turned off, (0) turned on in rank_for_schedule, (m_i_q_d+1) > turned on everywhere. If there is a static constructor generated for tune > tables and it is a problem to have it -- I can shrink acceptable values to > these 3 and call it a day. > You only mention 3 values: what was the fourth? It might be better then to define a set of values that represent each of these cases and only allow the tuning parameters to select one of those. The init code then uses that set to select how to set up the various parameters to meet those goals. So something like ARM_SCHED_AUTOPREF_OFF ARM_SCHED_AUTOPREF_RANK ARM_SCHED_AUTOPREF_FULL R. > -- > Maxim Kuvyrkov > www.linaro.org > > > >
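Editorial sketch of the suggestion that closes the message above: keep only a symbolic setting in the tuning table and let init code turn it into a depth. The three enumerator names come from the message; everything else (`tune_params_sketch`, `cortex_a15_sketch`, `autopref_depth`, the value 7) is made up for illustration, and the C++11 `constexpr` is used here only as a convenient way to have the compiler reject an initializer that would need a static constructor.

```cpp
#include <cassert>

// Assumption: in GCC this constant is defined in a different translation
// unit, so a table initializer using it cannot be evaluated at compile time.
// It is defined here, with a made-up value, only to keep the sketch linkable.
extern const int max_insn_queue_index;
const int max_insn_queue_index = 7;

// The three cases named in the message.
enum arm_sched_autopref {
  ARM_SCHED_AUTOPREF_OFF,
  ARM_SCHED_AUTOPREF_RANK,
  ARM_SCHED_AUTOPREF_FULL
};

// Hypothetical, heavily reduced stand-in for the real tuning structure.
struct tune_params_sketch {
  int max_insns_inline_memset;
  arm_sched_autopref sched_autopref;
};

// Every initializer is a compile-time constant, so the table needs no
// static constructor and can live in a read-only section.
constexpr tune_params_sketch cortex_a15_sketch = {8, ARM_SCHED_AUTOPREF_FULL};

// Init-time translation into the depth values quoted in the thread:
// -1 (off), 0 (rank_for_schedule only), max_insn_queue_index + 1 (everywhere).
int autopref_depth(arm_sched_autopref mode) {
  switch (mode) {
    case ARM_SCHED_AUTOPREF_OFF:
      return -1;
    case ARM_SCHED_AUTOPREF_RANK:
      return 0;
    case ARM_SCHED_AUTOPREF_FULL:
      return max_insn_queue_index + 1;
  }
  return -1;
}
```

The run-time lookup of `max_insn_queue_index` still happens, but inside a function called during option handling rather than in the table's initializer.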
Re: Housekeeping work in backends.html
> the attached patch removes obsolete ports (c4x, m68hc11 and ms1), toggles
> the 'p' letter and adjusts accordingly (only avr, fr30, m68k, mcore, rs6000
> and sh still use define_peephole) and removes trailing spaces.

Same treatment for the 'b' letter, the ports that use '"* ..."' notation for output template code are a minority and new ports don't use it. This also puts 4 "blockers" for new ports (c, p, b, f) together.

Applied.

--
Eric Botcazou

Index: backends.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/backends.html,v
retrieving revision 1.63
diff -u -r1.63 backends.html
--- backends.html 20 Jan 2015 09:17:12 - 1.63
+++ backends.html 20 Jan 2015 10:18:04 -
@@ -48,10 +48,10 @@
 (Not necessarily supported by all subtargets.)
 c Port uses cc0.
 p Port uses define_peephole (as opposed to define_peephole2).
+b Port uses '"* ..."' notation for output template code.
 f Port does not define prologue and/or epilogue RTL expanders.
 g Port does not define TARGET_ASM_FUNCTION_(PRO|EPI)LOGUE.
 m Port does not use define_constants.
-b Port does not use '"* ..."' notation for output template code.
 d Port does not use DFA scheduler descriptions.
 i Port generates multiple inheritance thunks using
 TARGET_ASM_OUTPUT_MI(_VCALL)_THUNK.
@@ -65,55 +65,55 @@
 | Characteristics
-Target | HMSLQNFICBD lqrcpfgmbdiates
+Target | HMSLQNFICBD lqrcpbfgmdiates
 ---+
-aarch64| Qqg ia s
-alpha | ?? Q Cqg b i e
-arc| Bg i
-arm| ia s
-avr|L FIl cp g bd
-bfin | F g i
-c6x| S CBg b i
-cr16 |L F C g ds
-cris | F B c g bdi s
-epiphany | C g b i s
-fr30 | ??FI B p gm ds
-frv| ?? Bi s
-h8300 | FI B c g ds
-i386 | ? Qqia
-ia64 | ? Q Cqr i
-iq2000 | ??? FICBg t
-lm32 | F g b
-m32c |L FIl g ds
-m32r | FI s
-m68k | ?cp i
-mcore | ?FIp gm s
-mep| F C g t s
-microblaze | CB b i s
-mips | Q CB qr b ia s
-mmix | HM Q Cq bdi e
-mn10300| ?? c g i s
-moxie | F g ds
-msp430 |L FIl g ds
+aarch64| Qq b g ia s
+alpha | ?? Q Cq g i e
+arc| B b g i
+arm| bia s
+avr|L FIl cp g d
+bfin | Fg i
+c6x| S CB g i
+cr16 |L F C g ds
+cris | F B c g di s
+epiphany | C g i s
+fr30 | ??FI B pb gmds
+frv| ?? B bi s
+h8300 | FI B c g ds
+i386 | ? Qq bia
+ia64 | ? Q Cqr bi
+iq2000 | ??? FICB b gt
+lm32 | Fg
+m32c |L FIlb g ds
+m32r | FI bs
+m68k | ?cpbi
+mcore | ?FIpb gm s
+mep| F Cb gt s
+microblaze | CBi s
+mips | Q CB qr ia s
+mmix | HM Q Cq di e
+mn10300| ?? c g i s
+moxie | Fg d t s
+msp430 |L FIlb g ds
 nds32 | F C ia s
-nios2 | S C b
-nvptx | S Q Cqg be
-pa | ? Q CBD qrm i e
-pdp11 |L ICqrc d e
-rl78 |L F l g bds
-rs6000 | Q Cqr p i
-rx | b s
-s390 | ? Qqr g b ia e
-sh | Q CB qr p b i
-sparc | Q CB qr i
-spu| ? Q *C g b i
-stormy16 | ???L FIC D l m di
-tilegx | S Q Cqg b i e
-tilepro| S F C g b i e
-v850 | ??FI c gm s
-vax| M?I c di e
-visium | Bg b s
-xtensa | C b
+nios2 | S C
+nvptx | S Q Cq g e
+pa | ? Q CBD qr b m i e
+pdp11 |L ICqrc b d e
+rl78 |L F l g ds
+rs6000 | Q Cqr pbi
+rx | s
+s390 | ? Qqrg ia e
+sh | Q CB qr p i
+sparc | Q CB qr bi
+sp
Re: [Ada] Fix bootstrapping on darwin9/10 (PR ada/64349).
On 14 Jan 2015, at 09:03, Tristan Gingold wrote: > >> On 09 Jan 2015, at 00:42, Iain Sandoe wrote: >> >> >> On 8 Jan 2015, at 13:52, Tristan Gingold wrote: >> >>> On 08 Jan 2015, at 13:49, Iain Sandoe wrote: Hi Tristan, On 7 Jan 2015, at 10:15, Arnaud Charlet wrote: > Use _NSGetEnviron to get environment. > > Tested on x86_64-pc-linux-gnu, committed on trunk > > 2015-01-07 Tristan Gingold > > PR ada/64349 > * env.c (__gnat_environ): Adjust for darwin9/darwin10. > > So my original patch assumed that, while it was not legal to use environ from a shlib, it is legal to use _NSGetEnviron () from an application ... .. and, OK fine, I see the point about ! defined (__arm__) .. but a few other comments. ISTM that there's a partial implementation to distinguish between IN_RTS and application? >>> >>> Yes you're right. The added code should have been added after the #endif >>> for IN_RTS. >> >> How about this? >> It uses the interface where needed, avoids it for main exes and gets rid of >> the negative conditional (which IMO makes the code a little more readable). >> >> Iain >> >> P.S. this is not Darwin9/10 - specific the only reason it doesn't fail on >> Darwin >= 11 is because they default to -undefined dynamic_lookup .. and so >> find the symbol from the exe. > > Sorry for the late answer. We did something slightly different: always > #include crt_externs.h on no-arm Darwin. Any news on when this might hit trunk? - it is a bootstrap issue (although on older targets). thanks Iain
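Editorial sketch of the technique under discussion: reaching the environment through `_NSGetEnviron ()` on Darwin instead of referring to `environ` directly. `_NSGetEnviron` and `<crt_externs.h>` are the real Darwin interface; `current_environ` and `lookup_env` are hypothetical helper names, and this is not the code from `env.c`. It does not model the `! defined (__arm__)` distinction or the IN_RTS split mentioned in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

#if defined(__APPLE__)
#include <crt_externs.h>  // declares _NSGetEnviron () on Darwin
// On Darwin, code in a shared library goes through the accessor.
static char **current_environ() { return *_NSGetEnviron(); }
#else
// Elsewhere, the POSIX global is used directly.
extern "C" char **environ;
static char **current_environ() { return environ; }
#endif

// Hypothetical helper: return the value of NAME, or nullptr if it is unset.
const char *lookup_env(const char *name) {
  const std::size_t len = std::strlen(name);
  char **env = current_environ();
  if (!env) return nullptr;
  for (; *env; ++env)
    if (std::strncmp(*env, name, len) == 0 && (*env)[len] == '=')
      return *env + len + 1;
  return nullptr;
}
```

A caller would use `lookup_env ("PATH")` the same way on either platform; only the accessor differs.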
Re: [Ada] Fix bootstrapping on darwin9/10 (PR ada/64349).
> Any news on when this might hit trunk?
> - it is a bootstrap issue (although on older targets).

Right, and you have local patches/a work around.

I have been on paternity leave, so with no time to sync our changes (and with other priorities :-)).

My next sync won't be before next week.

Let us know if you'd like to see Tristan's patch before that, we can send it to you in the meantime.

Arno
Re: [Ada] Fix bootstrapping on darwin9/10 (PR ada/64349).
On 20 Jan 2015, at 10:53, Arnaud Charlet wrote:

>> Any news on when this might hit trunk?
>> - it is a bootstrap issue (although on older targets).
>
> Right, and you have local patches/a work around.
>
> I have been on paternity leave, so with no time to sync our changes (and
> with other priorities :-)).

indeed :-)

> My next sync won't be before next week.
>
> Let us know if you'd like to see Tristan's patch before that, we can send it
> to you in the meantime.

That's fine - as you say we have a work-around in the meantime,
thanks
Iain
[PATCH] Fix PR64410 testcase
Committed. Richard. 2015-01-20 Richard Biener PR tree-optimization/64410 * g++.dg/vect/pr64410.cc: Require vect_double. Index: gcc/testsuite/g++.dg/vect/pr64410.cc === --- gcc/testsuite/g++.dg/vect/pr64410.cc(revision 219883) +++ gcc/testsuite/g++.dg/vect/pr64410.cc(working copy) @@ -1,4 +1,5 @@ // { dg-do compile } +// { dg-require-effective-target vect_double } #include #include
Re: [SH] Introduce treg_set_expr
Oleg Endo wrote: > The updated treg_set_expr patch is attached, which should fix the GBR > issues. Tests here OK. > Kaz, could you please try again? New tests that FAIL: libgomp.fortran/udr14.f90 -O3 -g (internal compiler error) libgomp.fortran/udr14.f90 -O3 -g (test for excess errors) Old tests that passed, that have disappeared: (Eeek!) gcc.target/sh/pr49263-1.c scan-assembler-not bclr gcc.target/sh/pr49263-1.c scan-assembler-times extu 1 gcc.target/sh/pr49263-2.c scan-assembler-times -129 2 gcc.target/sh/pr49263-2.c scan-assembler-times extu 1 For the new ICE, libgomp tests log says: /exp/ldroot/dodes/LOCAL/trunk/libgomp/testsuite/libgomp.fortran/udr14.f90:15:0: internal compiler error: in maybe_record_trace_start, at dwarf2cfi.c:2318 0x8384ad6 maybe_record_trace_start ../../LOCAL/trunk/gcc/dwarf2cfi.c:2318 0x8385023 scan_trace ../../LOCAL/trunk/gcc/dwarf2cfi.c:2496 0x8385b35 create_cfi_notes ../../LOCAL/trunk/gcc/dwarf2cfi.c:2650 0x8385b35 execute_dwarf2_frame ../../LOCAL/trunk/gcc/dwarf2cfi.c:3006 0x8385b35 execute ../../LOCAL/trunk/gcc/dwarf2cfi.c:3486 Please submit a full bug report, ... "./f951 udr14.f90 -g -O3 -fopenmp -o xxx.s" can reproduce this ICE. Regards, kaz
[PATCH] Fix up dwarf2out ICE (PR debug/64663)
Hi! This patch fixes ICE caused by trying to put negative bitposition into an EXPR_LIST node; as mode is 8-bit, if the negative value is e.g. -256 (bitpos % 256 == 0), we'd ICE on the assertion that if mode is 0, then the expression must be CONCAT of the actual bit position (or size) and rtl expression. Only the second hunk is strictly needed, the rest is to avoid silent wrapping. For 4.9/4.8 perhaps just the second hunk would be enough. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2015-01-20 Jakub Jelinek PR debug/64663 * dwarf2out.c (decl_piece_node): Don't put bitsize into mode if bitsize <= 0. (decl_piece_bitsize, adjust_piece_list, add_var_loc_to_decl, dw_sra_loc_expr): Use HOST_WIDE_INT instead of int for bit sizes and positions. * gcc.dg/pr64663.c: New test. --- gcc/dwarf2out.c.jj 2015-01-19 09:31:22.0 +0100 +++ gcc/dwarf2out.c 2015-01-19 12:57:48.104400169 +0100 @@ -5062,7 +5062,7 @@ equate_decl_number_to_die (tree decl, dw /* Return how many bits covers PIECE EXPR_LIST. 
*/ -static int +static HOST_WIDE_INT decl_piece_bitsize (rtx piece) { int ret = (int) GET_MODE (piece); @@ -5090,7 +5090,7 @@ decl_piece_varloc_ptr (rtx piece) static rtx_expr_list * decl_piece_node (rtx loc_note, HOST_WIDE_INT bitsize, rtx next) { - if (bitsize <= (int) MAX_MACHINE_MODE) + if (bitsize > 0 && bitsize <= (int) MAX_MACHINE_MODE) return alloc_EXPR_LIST (bitsize, loc_note, next); else return alloc_EXPR_LIST (0, gen_rtx_CONCAT (VOIDmode, @@ -5129,7 +5129,7 @@ adjust_piece_list (rtx *dest, rtx *src, HOST_WIDE_INT bitpos, HOST_WIDE_INT piece_bitpos, HOST_WIDE_INT bitsize, rtx loc_note) { - int diff; + HOST_WIDE_INT diff; bool copy = inner != NULL; if (copy) @@ -5269,7 +5269,7 @@ add_var_loc_to_decl (tree decl, rtx loc_ { struct var_loc_node *last = temp->last, *unused = NULL; rtx *piece_loc = NULL, last_loc_note; - int piece_bitpos = 0; + HOST_WIDE_INT piece_bitpos = 0; if (last->next) { last = last->next; @@ -5280,7 +5280,7 @@ add_var_loc_to_decl (tree decl, rtx loc_ piece_loc = &last->loc; do { - int cur_bitsize = decl_piece_bitsize (*piece_loc); + HOST_WIDE_INT cur_bitsize = decl_piece_bitsize (*piece_loc); if (piece_bitpos + cur_bitsize > bitpos) break; piece_bitpos += cur_bitsize; @@ -13924,7 +13924,7 @@ static dw_loc_descr_ref dw_sra_loc_expr (tree decl, rtx loc) { rtx p; - unsigned int padsize = 0; + unsigned HOST_WIDE_INT padsize = 0; dw_loc_descr_ref descr, *descr_tail; unsigned HOST_WIDE_INT decl_size; rtx varloc; @@ -13940,11 +13940,11 @@ dw_sra_loc_expr (tree decl, rtx loc) for (p = loc; p; p = XEXP (p, 1)) { - unsigned int bitsize = decl_piece_bitsize (p); + unsigned HOST_WIDE_INT bitsize = decl_piece_bitsize (p); rtx loc_note = *decl_piece_varloc_ptr (p); dw_loc_descr_ref cur_descr; dw_loc_descr_ref *tail, last = NULL; - unsigned int opsize = 0; + unsigned HOST_WIDE_INT opsize = 0; if (loc_note == NULL_RTX || NOTE_VAR_LOCATION_LOC (loc_note) == NULL_RTX) --- gcc/testsuite/gcc.dg/pr64663.c.jj 2015-01-19 12:59:13.032958657 +0100 +++ 
gcc/testsuite/gcc.dg/pr64663.c 2015-01-19 12:59:18.020873996 +0100 @@ -0,0 +1,17 @@ +/* PR debug/64663 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -g -w" } */ + +void +foo (void) +{ + int a[9]; + a[-8] = 0; +} + +void +bar (void) +{ + int a[9]; + a[-9] = 0; +} Jakub
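To make the wrap-around described above concrete, here is a minimal C sketch of the failure mode, not the dwarf2out.c code itself. The type mode_field, the limit MAX_MODE and the three helper functions are hypothetical stand-ins: they model an 8-bit mode field and the old and patched range checks from decl_piece_node.

```c
#include <stdint.h>

/* Hypothetical stand-in for the 8-bit machine-mode field of an rtx.  */
typedef uint8_t mode_field;

/* Hypothetical limit playing the role of MAX_MACHINE_MODE.  */
enum { MAX_MODE = 100 };

/* Old check: any negative size also satisfies "<= MAX_MODE".  */
static int
fits_in_mode_old (long long bitsize)
{
  return bitsize <= MAX_MODE;
}

/* Patched check: only strictly positive sizes may use the mode field.  */
static int
fits_in_mode_new (long long bitsize)
{
  return bitsize > 0 && bitsize <= MAX_MODE;
}

/* The value that survives once the size is squeezed into 8 bits.  */
static mode_field
store_in_mode (long long bitsize)
{
  return (mode_field) bitsize;
}
```

With these definitions, fits_in_mode_old (-256) is true and store_in_mode (-256) yields 0, which is exactly the "mode is 0" value that is supposed to signal a CONCAT. fits_in_mode_new (-256) is false, so such a value would take the CONCAT path instead.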
[4.8] Backport PR61553
I'd like to backport this patch from trunk to 4.9 so as to fix PR63751. It's safe and has been on trunk for several months. Bootstrapped/regtested on x86_64-linux, ok? 2015-01-20 Marek Polacek Backport from mainline 2014-06-23 Marek Polacek PR c/61553 * c-common.c (get_atomic_generic_size): Don't segfault if the type doesn't have a size. * c-c++-common/pr61553.c: New test. diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c index 487fb4e..8856701 100644 --- gcc/c-family/c-common.c +++ gcc/c-family/c-common.c @@ -10402,7 +10402,8 @@ get_atomic_generic_size (location_t loc, tree function, function); return 0; } - size = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (type))); + tree type_size = TYPE_SIZE_UNIT (TREE_TYPE (type)); + size = type_size ? tree_to_uhwi (type_size) : 0; if (size != size_0) { error_at (loc, "size mismatch in argument %d of %qE", x + 1, diff --git gcc/testsuite/c-c++-common/pr61553.c gcc/testsuite/c-c++-common/pr61553.c index e69de29..8a3b699 100644 --- gcc/testsuite/c-c++-common/pr61553.c +++ gcc/testsuite/c-c++-common/pr61553.c @@ -0,0 +1,8 @@ +/* PR c/61553 */ +/* { dg-do compile } */ + +void +foo (char *s) +{ + __atomic_store (s, (void *) 0, __ATOMIC_SEQ_CST); /* { dg-error "size mismatch" } */ +} Marek
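The shape of the fix is a plain null guard before the size is read. A rough sketch in standalone C, assuming a hypothetical struct type_node whose size_unit pointer is NULL when the type has no size (this is not GCC's tree representation):

```c
#include <stddef.h>

/* Hypothetical stand-in for a type node; size_unit is NULL when the
   type has no size.  */
struct type_node
{
  const unsigned long *size_unit;
};

/* Return the size in bytes, or 0 when the type has no size, instead of
   dereferencing a null pointer.  */
static unsigned long
type_size_or_zero (const struct type_node *type)
{
  return type->size_unit ? *type->size_unit : 0;
}
```

Returning 0 here means a later comparison against a non-zero expected size fails in the ordinary way, which matches the "size mismatch" diagnostic the new test expects.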
Re: [4.8] Backport PR61553
On Tue, Jan 20, 2015 at 12:49:07PM +0100, Marek Polacek wrote: > I'd like to backport this patch from trunk to 4.9 so as to > fix PR63751. It's safe and has been on trunk for several months. > > Bootstrapped/regtested on x86_64-linux, ok? To 4.9 or 4.8 (subject says 4.8, above is 4.9). I think it would be ok to both. > > 2015-01-20 Marek Polacek > > Backport from mainline > 2014-06-23 Marek Polacek > > PR c/61553 > * c-common.c (get_atomic_generic_size): Don't segfault if the > type doesn't have a size. > > * c-c++-common/pr61553.c: New test. Jakub
Re: [patch] Add and last pieces of C++11 std::lib
On 19/01/15 13:27 +, Jonathan Wakely wrote: On 16/01/15 23:38 +, Jonathan Wakely wrote: This defines the C++11 header and adds the wstring_convert and wbuffer_convert utilities. I've discovered that wasn't the last piece of the C++11 library, there were new constructors taking std::string added to std::locale and all the std::xxx_byname facets. It would be fixed by the attached patch (tested on x86_64-linux with old and new std::string), but we're in stage4 now so I'm not committing it yet. It's committed to trunk now. commit 977b94ddcf8218efa0318f69b3a2cc5b5d9eb5be Author: Jonathan Wakely Date: Sun Jan 18 16:41:28 2015 + Add C++11 std::string constructors for locales and facets. * config/abi/pre/gnu.ver: Export new constructors. * include/bits/codecvt.h (codecvt_byname): Add string constructor. (codecvt_byname, codecvt_byname): Define explicit specializations and declare explicit instantiations. * include/bits/locale_classes.h (locale, collate_byname): Add string constructors. * include/bits/locale_facets.h (ctype_byname, numpunct_byname): Likewise. * include/bits/locale_facets_nonio.h (time_get_byname, time_put_byname, moneypunct_byname, messages_byname): Likewise. * src/c++11/codecvt.cc (codecvt_byname, codecvt_byname): Define explicit instantiations. * src/c++11/locale-inst.cc (time_put_byname, codecvt_byname): Instantiate string constructors. (ctype_byname): Define string constructor. * testsuite/22_locale/codecvt_byname/1.cc: New. * testsuite/22_locale/collate_byname/1.cc: New. * testsuite/22_locale/ctype_byname/2.cc: New. * testsuite/22_locale/messages_byname/1.cc: New. * testsuite/22_locale/moneypunct_byname/1.cc: New. * testsuite/22_locale/numpunct_byname/1.cc: New.
[PATCH][wwwdocs] Document removal of -mlra/-mnolra from Aarch64 and ARM backends.
Hello, This patch documents in changes.html the removal of the -mlra/-mno-lra options from the AArch64 and ARM backends. Tested by checking the updated webpage in Firefox. Matthew Wahab Index: htdocs/gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.67 diff -u -r1.67 changes.html --- htdocs/gcc-5/changes.html 19 Jan 2015 15:48:46 - 1.67 +++ htdocs/gcc-5/changes.html 20 Jan 2015 11:30:25 - @@ -456,7 +456,10 @@ Support for the Cavium ThunderX processor is now available through the -mcpu=thunderx and -mtune=thunderx options. - + The transitional options -mlra and -mno-lra + have been removed. The AArch64 backend now uses the local register + allocator (LRA) only. + ARM @@ -488,7 +491,10 @@ The options relating to the old ABI -mapcs and -mapcs-frame have been deprecated. - + The transitional options -mlra and -mno-lra + have been removed. The ARM backend now uses the local register allocator + (LRA) only. + IA-32/x86-64
Re: [patch] libstdc++/64658 define std::atomic_init()
On 19/01/15 16:02 +, Jonathan Wakely wrote: We declare atomic_init() but then never define it, I assume that's just an accident. Although the standard says this function is non-atomic, the simplest fix at this stage is just to do an atomic store (when we get to stage 1 again I'd like to make the function a friend of std::__atomic_base<> so it can set the private member variable directly as a simple non-atomic assignment). Tested x86_64-linux, *not* committed to trunk. Now committed to trunk. commit 061dd1a073ef4646727a3f29dfa3169a760757b3 Author: Jonathan Wakely Date: Sun Jan 18 17:40:17 2015 + PR libstdc++/64658 * include/std/atomic (atomic_init): Define. * testsuite/29_atomics/atomic/64658.cc: New.
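The thread concerns C++ std::atomic_init, but the same distinction exists in C11's <stdatomic.h>, which makes for a compact illustration. The sketch below is a C analogue only, not the libstdc++ implementation; the function names and the choice of memory_order_relaxed are assumptions made for the example.

```c
#include <stdatomic.h>

/* Zero-initialized static object, so it starts in a valid state.  */
static atomic_int shared_value;

/* Initialize with atomic_init, which C11 specifies as a non-atomic
   initialization, then read the value back.  */
static int
init_with_atomic_init (int value)
{
  atomic_int obj;
  atomic_init (&obj, value);
  return atomic_load (&obj);
}

/* Set the value with an ordinary atomic store instead, similar in
   spirit to the interim fix described above.  */
static int
set_with_atomic_store (int value)
{
  atomic_store_explicit (&shared_value, value, memory_order_relaxed);
  return atomic_load (&shared_value);
}
```

Both routes leave the object holding the requested value; the difference is only whether the write itself is required to be atomic.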
Re: [patch] [C++14] Implement N3657: heterogeneous lookup in associative containers.
On 19/01/15 17:16 +, Jonathan Wakely wrote: This is the last missing piece of the C++14 library, as proposed in http://www.open-std.org/JTC1/sc22/wg21/docs/papers/2013/n3657.htm Tested x86_64-linux, *not* committed. Now committed to trunk.
[PATCH] Fix PR64684
This fixes PR64684 where a call to is_proper_for_analysis led to adding a variable to all_module_statics before it was disabled via ignore_module_statics. Bootstrap and regtest running on x86_64-unknown-linux-gnu. Richard. 2015-01-20 Richard Biener PR ipa/64684 * ipa-reference.c (add_static_var): Inline ... (analyze_function): ... here after splitting out from ... (is_proper_for_analysis): ... this. * gcc.dg/lto/pr64684_0.c: New testcase. * gcc.dg/lto/pr64684_1.c: Likewise. * gcc.dg/lto/pr64684_2.c: Likewise. * gcc.dg/lto/pr64685_0.c: Likewise. * gcc.dg/lto/pr64685_1.c: Likewise. Index: gcc/ipa-reference.c === --- gcc/ipa-reference.c (revision 219884) +++ gcc/ipa-reference.c (working copy) @@ -236,21 +236,6 @@ ipa_reference_get_not_written_global (st } - -/* Add VAR to all_module_statics and the two - reference_vars_to_consider* sets. */ - -static inline void -add_static_var (tree var) -{ - int uid = DECL_UID (var); - gcc_assert (TREE_CODE (var) == VAR_DECL); - if (dump_file) -splay_tree_insert (reference_vars_to_consider, - uid, (splay_tree_value)var); - bitmap_set_bit (all_module_statics, uid); -} - /* Return true if the variable T is the right kind of static variable to perform compilation unit scope escape analysis. */ @@ -285,12 +270,6 @@ is_proper_for_analysis (tree t) if (bitmap_bit_p (ignore_module_statics, DECL_UID (t))) return false; - /* This is a variable we care about. Check if we have seen it - before, and if not add it the set of variables we care about. */ - if (all_module_statics - && !bitmap_bit_p (all_module_statics, DECL_UID (t))) -add_static_var (t); - return true; } @@ -497,6 +476,15 @@ analyze_function (struct cgraph_node *fn var = ref->referred->decl; if (!is_proper_for_analysis (var)) continue; + /* This is a variable we care about. Check if we have seen it +before, and if not add it the set of variables we care about. 
*/ + if (all_module_statics + && bitmap_set_bit (all_module_statics, DECL_UID (var))) + { + if (dump_file) + splay_tree_insert (reference_vars_to_consider, + DECL_UID (var), (splay_tree_value)var); + } switch (ref->use) { case IPA_REF_LOAD: Index: gcc/testsuite/gcc.dg/lto/pr64684_0.c === --- gcc/testsuite/gcc.dg/lto/pr64684_0.c(revision 0) +++ gcc/testsuite/gcc.dg/lto/pr64684_0.c(working copy) @@ -0,0 +1,13 @@ +/* { dg-lto-do run } */ +/* { dg-lto-options { { -O1 -flto } } } */ + +extern void fn2 (void); +extern int a; + +void +fn1 () +{ + a = -1; + fn2 (); + a &= 1; +} Index: gcc/testsuite/gcc.dg/lto/pr64684_1.c === --- gcc/testsuite/gcc.dg/lto/pr64684_1.c(revision 0) +++ gcc/testsuite/gcc.dg/lto/pr64684_1.c(working copy) @@ -0,0 +1,9 @@ +/* { dg-options "-Os" } */ + +extern int a; + +void +fn2 (void) +{ + a = 0; +} Index: gcc/testsuite/gcc.dg/lto/pr64684_2.c === --- gcc/testsuite/gcc.dg/lto/pr64684_2.c(revision 0) +++ gcc/testsuite/gcc.dg/lto/pr64684_2.c(working copy) @@ -0,0 +1,16 @@ +/* { dg-options "-O0" } */ + +extern void fn1 (void); + +int a; + +int +main () +{ + fn1 (); + + if (a != 0) +__builtin_abort (); + + return 0; +} Index: gcc/testsuite/gcc.dg/lto/pr64685_0.c === --- gcc/testsuite/gcc.dg/lto/pr64685_0.c(revision 0) +++ gcc/testsuite/gcc.dg/lto/pr64685_0.c(working copy) @@ -0,0 +1,10 @@ +/* { dg-lto-do run } */ +/* { dg-lto-options { { -flto } } } */ + +extern int b; + +void +fn1 (void) +{ +b = 0; +} Index: gcc/testsuite/gcc.dg/lto/pr64685_1.c === --- gcc/testsuite/gcc.dg/lto/pr64685_1.c(revision 0) +++ gcc/testsuite/gcc.dg/lto/pr64685_1.c(working copy) @@ -0,0 +1,27 @@ +/* { dg-options "-O1" } */ + +extern void fn1 (void); + +int a[2], b; + +static void +foo (int p) +{ + b = 1 ^ a[(b ^ 1) & 1]; + b = 1 ^ a[b & 1]; + if (p) +__builtin_abort (); +} + +int +main () +{ + foo (0); + b = 0; + foo (0); + + if (b != 1) +__builtin_abort (); + + return 0; +}
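The new code in analyze_function relies on bitmap_set_bit reporting whether the bit actually changed, so the separate bitmap_bit_p test is no longer needed. A small self-contained sketch of that test-and-set idiom follows; struct small_bitmap and both functions are hypothetical and much simpler than GCC's bitmap type.

```c
#include <stdbool.h>
#include <stdint.h>

enum { BITMAP_WORDS = 4 };

/* Hypothetical fixed-size bitmap covering UIDs 0 .. 255.  */
struct small_bitmap
{
  uint64_t words[BITMAP_WORDS];
};

/* Set bit UID and return true only if it was previously clear.
   The caller must keep UID below BITMAP_WORDS * 64.  */
static bool
small_bitmap_set_bit (struct small_bitmap *map, unsigned uid)
{
  uint64_t mask = (uint64_t) 1 << (uid % 64);
  uint64_t *word = &map->words[uid / 64];
  bool changed = (*word & mask) == 0;
  *word |= mask;
  return changed;
}

/* Count how many UIDs are seen for the first time; the body of the
   "if" is where one-time bookkeeping would go.  */
static unsigned
count_first_sightings (struct small_bitmap *seen,
                       const unsigned *uids, unsigned n)
{
  unsigned first = 0;
  for (unsigned i = 0; i < n; i++)
    if (small_bitmap_set_bit (seen, uids[i]))
      first++;
  return first;
}
```

Folding the membership test into the set operation keeps the "seen before?" check and the insertion in one step, which is the pattern the patch uses after the variable has passed is_proper_for_analysis.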
Re: [patch] libstdc++/64650 add bad_optional_access default constructor
On 19/01/15 15:35 +, Jonathan Wakely wrote: The Library Fundamentals TS says std::experimental::bad_optional_access should have a default constructor, but we only support construction from strings. This removes the unused and non-standard std::string constructor and adds the required default constructor. Tested x86_64-linux, *not* committed. Now committed to trunk.
Re: [Ada] Fix bootstrapping on darwin9/10 (PR ada/64349).
> On 20 Jan 2015, at 11:59, Iain Sandoe wrote: > > > On 20 Jan 2015, at 10:53, Arnaud Charlet wrote: > >>> Any news on when this might hit trunk? >>> - it is a bootstrap issue (although on older targets). >> >> Right, and you have local patches/a workaround. >> >> I have been on paternity leave, so with no time to sync our changes (and >> with other priorities :-)). > > indeed :-) > >> My next sync won't be before next week. >> >> Let us know if you'd like to see Tristan's patch before that, we can send it >> to you in the meantime. > > That's fine - as you say we have a work-around in the meantime, > thanks > Iain Could you please post (or mail me) Tristan's patch? I'd like to test it before it is committed (once bitten, twice shy!). TIA Dominique
Re: [patch] Update C++11 status in libstdc++ docs
The latest C++11 and C++14 status updates. Committed to trunk. commit 4f5c0e10664f9230fe836f37bf1f33f252c78fd2 Author: Jonathan Wakely Date: Tue Jan 20 12:30:35 2015 + * doc/xml/manual/status_cxx2011.xml: Remove stray dbhtml tags. * doc/xml/manual/status_cxx2014.xml: Update status. * doc/html/manual/status.html: Regenerate. diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml index 76387c7..742d38d 100644 --- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml +++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml @@ -944,7 +944,6 @@ particular release. - 20.11.5 Class template duration Y @@ -1068,14 +1067,12 @@ particular release. - 21.2.3.1 struct char_traitsY - 21.2.3.2 struct char_traits Y diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml index f716739..fc32995 100644 --- a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml +++ b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml @@ -212,14 +212,13 @@ not in any particular release. - http://www.w3.org/1999/xlink"; xlink:href="http://www.open-std.org/JTC1/sc22/WG21/docs/papers/2013/n3657.htm";> N3657 Adding heterogeneous comparison lookup to associative containers - WIP + Y @@ -242,8 +241,8 @@ not in any particular release. Null Forward Iterators - N - + Partial + Only affects Debug Mode
Re: [C++ PATCH, RFC] PR c++/63959, continued
On 01/19/2015 06:06 PM, Ville Voutilainen wrote: + return (trivial_type_p (type1) + || (scalarish_type_p (type1) && CP_TYPE_VOLATILE_P (type1)) + || type_code1 == REFERENCE_TYPE || (CLASS_TYPE_P (type1) && TYPE_HAS_TRIVIAL_DESTRUCTOR (type1))); I think we can drop the trivial_type_p check now, it's redundant. Jason
Re: Housekeeping work in backends.html
On Wed, Jan 7, 2015 at 9:42 AM, Eric Botcazou wrote: >> the attached patch removes obsolete ports (c4x, m68hc11 and ms1), toggles >> the 'p' letter and adjust accordingly (only avr, fr30, m68k, mcore, rs6000 >> and sh still use define_peephole) and removes trailing spaces. > > Same treatment for the 'd' letter, the ports that do not use DFA scheduler > descriptions are a clear minority (avr, cr16, cris, fr30, h8300, m32c, mmix, > msp430, pdp11, stormy16, vax). Applied. Perhaps 'd' should just go away completely. It was intended to distinguish between ports using the old scheduler description and ports using the DFA model. But support for the old scheduler description was removed some 10 years ago, and AFAIR the targets that don't use the DFA scheduler don't use the scheduler at all. Ciao! Steven
Re: [PATCH] Fix PR64535 - increase emergency EH buffers via a new allocator
On 20/01/15 10:06 +0100, Richard Biener wrote: On Mon, 19 Jan 2015, Jonathan Wakely wrote: On 19/01/15 11:33 +0100, Richard Biener wrote: > On Mon, 12 Jan 2015, Richard Biener wrote: > > > > > This "fixes" PR64535 by changing the fixed object size emergency pool > > to a variable EH object size (but fixed arena size) allocator. Via > > combining the dependent and non-dependent EH arenas this should allow > > around 600 bad_alloc throws in OOM situations on x86_64-linux > > compared to the current 64 which should provide some headroom to > > the poor souls using EH to communicate OOM in a heavily threaded > > environment. > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 1 > > as in the patch below, forcing the use of the allocator). I see only the #else part is kept now - that was what I was going to suggest. > Unfortunately I couldn't get an answer of whether throwing > bad_alloc from a throw where we can't allocate an exception > is a) valid or b) even required by the standard ('throw' isn't > listed as 'allocation function' - also our EH allocators are > marked as throw(), so such change would change the ABI...). I'll ask the C++ committee. Thanks. Only needing to deal with std::bad_alloc allocations from the pool would greatly simplify it (I'd do a fixed-object-size one then). Basically the unspecified way memory is allocated for exceptions cannot be 'operator new()', but malloc or a static buffer (or both) is OK. If that runs out of space (which is the situation we care about here) it's undefined, because the program has exceeded the system's resource limits. So we can do whatever is best for our users. On that basis, I think using a fixed-object-size pool and only using it for bad_alloc exceptions is reasonable.
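For readers unfamiliar with the term, a fixed-object-size pool over a static buffer can be sketched in a few lines of C. This is only an illustration of the idea discussed above, not the libstdc++ emergency pool: the names, POOL_SLOTS and SLOT_SIZE are hypothetical, and the sketch is not thread-safe (a real pool would need locking).

```c
#include <stddef.h>

/* Hypothetical sizes; a real pool would derive them from the size of
   the exception objects it has to hold.  */
enum { POOL_SLOTS = 4, SLOT_SIZE = 128 };

struct emergency_pool
{
  _Alignas (max_align_t) unsigned char storage[POOL_SLOTS][SLOT_SIZE];
  unsigned char used[POOL_SLOTS];
};

/* Hand out one free slot, or NULL when the request is too large or the
   pool is exhausted.  */
static void *
pool_alloc (struct emergency_pool *pool, size_t size)
{
  if (size > SLOT_SIZE)
    return NULL;
  for (int i = 0; i < POOL_SLOTS; i++)
    if (!pool->used[i])
      {
        pool->used[i] = 1;
        return pool->storage[i];
      }
  return NULL;
}

/* Release a slot; return 1 if PTR belonged to the pool, 0 otherwise so
   the caller can fall back to free ().  */
static int
pool_free (struct emergency_pool *pool, void *ptr)
{
  for (int i = 0; i < POOL_SLOTS; i++)
    if (ptr == (void *) pool->storage[i])
      {
        pool->used[i] = 0;
        return 1;
      }
  return 0;
}
```

Because every slot has the same size, allocation is a linear scan with no fragmentation, which is the simplification a bad_alloc-only pool would buy compared with the variable-size arena in the patch.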
[PATCH]: Conditionally include target specific files while building TSAN
Hi all, This patch changes make file and configure under libsanitizer, to separate out X86_64 specific file "tsan_rtl_amd64.S" from getting build for targets other than X86_64. Ok for trunk? Please review. regards, Venkat, ChangeLog 2015-01-19 Venkataramanan Kumar * configure.ac (TSAN_TARGET_DEPENDENT_OBJECTS): Define. * configure: Regenerate. * tsan/Makefile.am (EXTRA_libtsan_la_SOURCES): Define, (libtsan_la_DEPENDENCIES): Likewise. * Makefile.in: Regenerate. * asan/Makefile.in: Regenerate. * interception/Makefile.in: Regenerate. * libbacktrace/Makefile.in: Regenerate. * lsan/Makefile.in: Regenerate. * sanitizer_common/Makefile.in: Regenerate. * tsan/Makefile.in: Regenerate. * ubsan/Makefile.in: Regenerate. diff --git a/libsanitizer/Makefile.in b/libsanitizer/Makefile.in index 0b89245..79a1be6 100644 --- a/libsanitizer/Makefile.in +++ b/libsanitizer/Makefile.in @@ -185,6 +185,7 @@ SED = @SED@ SET_MAKE = @SET_MAKE@ SHELL = @SHELL@ STRIP = @STRIP@ +TSAN_TARGET_DEPENDENT_OBJECTS = @TSAN_TARGET_DEPENDENT_OBJECTS@ VERSION = @VERSION@ VIEW_FILE = @VIEW_FILE@ abs_builddir = @abs_builddir@ diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in index 1a65944..e61ceda 100644 --- a/libsanitizer/asan/Makefile.in +++ b/libsanitizer/asan/Makefile.in @@ -194,6 +194,7 @@ SED = @SED@ SET_MAKE = @SET_MAKE@ SHELL = @SHELL@ STRIP = @STRIP@ +TSAN_TARGET_DEPENDENT_OBJECTS = @TSAN_TARGET_DEPENDENT_OBJECTS@ VERSION = @VERSION@ VIEW_FILE = @VIEW_FILE@ abs_builddir = @abs_builddir@ diff --git a/libsanitizer/configure b/libsanitizer/configure index 108b1fd..4a90acf 100755 --- a/libsanitizer/configure +++ b/libsanitizer/configure @@ -604,6 +604,7 @@ ac_subst_vars='am__EXEEXT_FALSE am__EXEEXT_TRUE LTLIBOBJS LIBOBJS +TSAN_TARGET_DEPENDENT_OBJECTS LIBBACKTRACE_SUPPORTED_FALSE LIBBACKTRACE_SUPPORTED_TRUE BACKTRACE_SUPPORTS_THREADS @@ -12019,7 +12020,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 
12022 "configure" +#line 12023 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -12125,7 +12126,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 12128 "configure" +#line 12129 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -16362,6 +16363,12 @@ if test "x$TSAN_SUPPORTED" = "xyes"; then fi +case "${target}" in + x86_64-*-linux-*) TSAN_TARGET_DEPENDENT_OBJECTS='tsan_rtl_amd64.lo' ;; + *) TSAN_TARGET_DEPENDENT_OBJECTS='' ;; +esac + + cat >confcache <<\_ACEOF # This file is a shell script that caches the results of configure # tests run on this system so they can be shared between configure diff --git a/libsanitizer/configure.ac b/libsanitizer/configure.ac index e672131..03208db 100644 --- a/libsanitizer/configure.ac +++ b/libsanitizer/configure.ac @@ -346,4 +346,10 @@ _EOF ]) fi +case "${target}" in + x86_64-*-linux-*) TSAN_TARGET_DEPENDENT_OBJECTS='tsan_rtl_amd64.lo' ;; + *) TSAN_TARGET_DEPENDENT_OBJECTS='' ;; +esac +AC_SUBST([TSAN_TARGET_DEPENDENT_OBJECTS]) + AC_OUTPUT diff --git a/libsanitizer/interception/Makefile.in b/libsanitizer/interception/Makefile.in index 8ce4fd0..0e261b4 100644 --- a/libsanitizer/interception/Makefile.in +++ b/libsanitizer/interception/Makefile.in @@ -150,6 +150,7 @@ SED = @SED@ SET_MAKE = @SET_MAKE@ SHELL = @SHELL@ STRIP = @STRIP@ +TSAN_TARGET_DEPENDENT_OBJECTS = @TSAN_TARGET_DEPENDENT_OBJECTS@ VERSION = @VERSION@ VIEW_FILE = @VIEW_FILE@ abs_builddir = @abs_builddir@ diff --git a/libsanitizer/libbacktrace/Makefile.in b/libsanitizer/libbacktrace/Makefile.in index a4f9912..7d2e244 100644 --- a/libsanitizer/libbacktrace/Makefile.in +++ b/libsanitizer/libbacktrace/Makefile.in @@ -192,6 +192,7 @@ SED = @SED@ SET_MAKE = @SET_MAKE@ SHELL = @SHELL@ STRIP = @STRIP@ +TSAN_TARGET_DEPENDENT_OBJECTS = @TSAN_TARGET_DEPENDENT_OBJECTS@ VERSION = @VERSION@ VIEW_FILE = @VIEW_FILE@ abs_builddir = @abs_builddir@ diff --git a/libsanitizer/lsan/Makefile.in 
b/libsanitizer/lsan/Makefile.in index bb6f95f..3ad4401 100644 --- a/libsanitizer/lsan/Makefile.in +++ b/libsanitizer/lsan/Makefile.in @@ -185,6 +185,7 @@ SED = @SED@ SET_MAKE = @SET_MAKE@ SHELL = @SHELL@ STRIP = @STRIP@ +TSAN_TARGET_DEPENDENT_OBJECTS = @TSAN_TARGET_DEPENDENT_OBJECTS@ VERSION = @VERSION@ VIEW_FILE = @VIEW_FILE@ abs_builddir = @abs_builddir@ diff --git a/libsanitizer/sanitizer_common/Makefile.in b/libsanitizer/sanitizer_common/Makefile.in index 86bf787..4a0e727 100644 --- a/libsanitizer/sanitizer_common/Makefile.in +++ b/libsanitizer/sanitizer_common/Makefile.in @@ -178,6 +178,7 @@ SED = @SED@ SET_MAKE = @SET_MAKE@ SHELL = @SHELL@ STRIP = @STRIP@ +TSAN_TARGET_DEPENDENT_OBJECTS = @TSAN_TARGET_DEPENDENT_OBJECTS@ VERSION = @VERSION@ VIEW_FILE = @VIEW_FILE@ abs_builddir = @abs_builddir@ diff --git a/libsanitizer/tsan/Makefile.am b/libsanitizer/tsan/Makefile.am index c532590..abfa
Re: [PATCH] Fix PR64535 - increase emergency EH buffers via a new allocator
On Tue, 20 Jan 2015, Jonathan Wakely wrote: > On 20/01/15 10:06 +0100, Richard Biener wrote: > > On Mon, 19 Jan 2015, Jonathan Wakely wrote: > > > > > On 19/01/15 11:33 +0100, Richard Biener wrote: > > > > On Mon, 12 Jan 2015, Richard Biener wrote: > > > > > > > > > > > > > > This "fixes" PR64535 by changing the fixed object size emergency pool > > > > > to a variable EH object size (but fixed arena size) allocator. Via > > > > > combining the dependent and non-dependent EH arenas this should allow > > > > > around 600 bad_alloc throws in OOM situations on x86_64-linux > > > > > compared to the current 64 which should provide some headroom to > > > > > the poor souls using EH to communicate OOM in a heavily threaded > > > > > environment. > > > > > > > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu (with the #if 1 > > > > > as in the patch below, forcing the use of the allocator). > > > > > > I see only the #else part is kept now - that was what I was going to > > > suggest. > > > > > > > Unfortunately I couldn't get an answer of whether throwing > > > > bad_alloc from a throw where we can't allocate an exception > > > > is a) valid or b) even required by the standard ('throw' isn't > > > > listed as 'allocation function' - also our EH allocators are > > > > marked as throw(), so such change would change the ABI...). > > > > > > I'll ask the C++ committee. > > > > Thanks. Only needing to deal with std::bad_alloc allocations from > > the pool would greatly simplify it (I'd do a fixed-object-size one then). > > Basically the unspecified way memory is allocated for exceptions > cannot be 'operator new()', but malloc or a static buffer (or both) is > OK. If that runs out of space (which is the situation we care about > here) it's undefined, because the program has exceeded the system's > resource limits. So we can do whatever is best for our users. 
> > On that basis, I think using a fixed-object-size pool and only using > it for bad_alloc exceptions is reasonable. Though my question was whether 'throw X()' might instead behave as 'throw std::bad_alloc()' if allocating X exceeds the system's resource limits. I guess the answer is yes as it's undefined? Of course we can't really implement that because our EH allocator is marked throw(). Richard.
Re: [PATCH 7/8] Model cache auto-prefetcher in scheduler
On Jan 20, 2015, at 1:24 PM, Richard Earnshaw wrote: ... >>> In general, how should someone tuning the compiler for this parameter >>> select a value that isn't one of (-1, m_i_q_d+1)? >> >> From my experiments it seems there are 4 reasonable values for the >> parameter: (-1) autopref turned off, (0) turned on in rank_for_schedule, >> (m_i_q_d+1) turned on everywhere. If there is a static constructor >> generated for tune tables and it is a problem to have it -- I can shrink >> acceptable values to these 3 and call it a day. >> > > You only mention 3 values: what was the fourth? Typo. No fourth. > It might be better then > to define a set of values that represent each of these cases and only > allow the tuning parameters to select one of those. The init code then > uses that set to select how to set up the various parameters to meet > those goals. > > So something like > > ARM_SCHED_AUTOPREF_OFF > ARM_SCHED_AUTOPREF_RANK > ARM_SCHED_AUTOPREF_FULL A patch is attached. I bootstrapped it on arm-linux-gnueabihf. OK to apply? -- Maxim Kuvyrkov www.linaro.org 0001-Use-enum-for-sched_autopref-tune-settings.patch Description: Binary data
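The enum proposal amounts to replacing a raw integer with three named settings and letting the init code translate them. A hedged sketch of that translation, using the enumerator names suggested in the thread: the function autopref_queue_depth and its queue_limit parameter are hypothetical, with queue_limit standing in for the quantity the thread abbreviates as m_i_q_d.

```c
/* Enumerators as suggested in the thread.  */
enum arm_sched_autopref
{
  ARM_SCHED_AUTOPREF_OFF,
  ARM_SCHED_AUTOPREF_RANK,
  ARM_SCHED_AUTOPREF_FULL
};

/* Hypothetical helper: map a tuning setting to the raw parameter value
   described in the discussion (-1 = off, 0 = rank_for_schedule only,
   queue_limit + 1 = everywhere).  */
static int
autopref_queue_depth (enum arm_sched_autopref setting, int queue_limit)
{
  switch (setting)
    {
    case ARM_SCHED_AUTOPREF_OFF:
      return -1;
    case ARM_SCHED_AUTOPREF_RANK:
      return 0;
    case ARM_SCHED_AUTOPREF_FULL:
      return queue_limit + 1;
    }
  return -1;
}
```

The tune tables then hold only compile-time constants, which sidesteps the static-constructor concern raised earlier, while the one place that knows the queue limit computes the actual value.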
Re: [PATCH 7/8] Model cache auto-prefetcher in scheduler
On 20/01/15 13:26, Maxim Kuvyrkov wrote: > On Jan 20, 2015, at 1:24 PM, Richard Earnshaw wrote: > ... In general, how should someone tuning the compiler for this parameter select a value that isn't one of (-1, m_i_q_d+1)? >>> >>> From my experiments it seems there are 4 reasonable values for the >>> parameter: (-1) autopref turned off, (0) turned on in rank_for_schedule, >>> (m_i_q_d+1) turned on everywhere. If there is a static constructor >>> generated for tune tables and it is a problem to have it -- I can shrink >>> acceptable values to these 3 and call it a day. >>> >> >> You only mention 3 values: what was the fourth? > > Typo. No fourth. > >> It might be better then >> to define a set of values that represent each of these cases and only >> allow the tuning parameters to select one of those. The init code then >> uses that set to select how to set up the various parameters to meet >> those goals. >> >> So something like >> >> ARM_SCHED_AUTOPREF_OFF >> ARM_SCHED_AUTOPREF_RANK >> ARM_SCHED_AUTOPREF_FULL > > A patch is attached. I bootstrapped it on arm-linux-gnueabihf. OK to apply? > OK. Thanks. R. > -- > Maxim Kuvyrkov > www.linaro.org > > > 0001-Use-enum-for-sched_autopref-tune-settings.patch > > > From 9d9ee7c33210960970d0d78ccc7a16a58b392f85 Mon Sep 17 00:00:00 2001 > From: Maxim Kuvyrkov > Date: Tue, 20 Jan 2015 12:30:37 + > Subject: [PATCH 1/3] Use enum for sched_autopref tune settings > > * config/arm/arm-protos.h (enum arm_sched_autopref): New constants. > (struct tune_params): Use the enum. > * arm.c (arm_*_tune): Update. > (arm_option_override): Update. 
> --- > gcc/config/arm/arm-protos.h |9 +++- > gcc/config/arm/arm.c| 51 > +-- > 2 files changed, 38 insertions(+), 22 deletions(-) > > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h > index 3db7e16..307babb 100644 > --- a/gcc/config/arm/arm-protos.h > +++ b/gcc/config/arm/arm-protos.h > @@ -257,6 +257,13 @@ struct cpu_vec_costs { > > struct cpu_cost_table; > > +enum arm_sched_autopref > + { > +ARM_SCHED_AUTOPREF_OFF, > +ARM_SCHED_AUTOPREF_RANK, > +ARM_SCHED_AUTOPREF_FULL > + }; > + > struct tune_params > { >bool (*rtx_costs) (rtx, RTX_CODE, RTX_CODE, int *, bool); > @@ -292,7 +299,7 @@ struct tune_params >/* Bitfield encoding the fuseable pairs of instructions. */ >unsigned int fuseable_ops; >/* Depth of scheduling queue to check for L2 autoprefetcher. */ > - int sched_autopref_queue_depth; > + enum arm_sched_autopref sched_autopref; > }; > > extern const struct tune_params *current_tune; > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index fddd770..34672ce 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -1697,7 +1697,7 @@ const struct tune_params arm_slowmul_tune = >false, /* Prefer Neon for stringops. > */ >8, /* Maximum insns to inline > memset. */ >ARM_FUSE_NOTHING, /* Fuseable pairs of > instructions. */ > - -1 /* Sched L2 autopref depth. */ > + ARM_SCHED_AUTOPREF_OFF /* Sched L2 autopref. */ > }; > > const struct tune_params arm_fastmul_tune = > @@ -1718,7 +1718,7 @@ const struct tune_params arm_fastmul_tune = >false, /* Prefer Neon for stringops. > */ >8, /* Maximum insns to inline > memset. */ >ARM_FUSE_NOTHING, /* Fuseable pairs of > instructions. */ > - -1 /* Sched L2 autopref depth. */ > + ARM_SCHED_AUTOPREF_OFF /* Sched L2 autopref. */ > }; > > /* StrongARM has early execution of branches, so a sequence that is worth > @@ -1742,7 +1742,7 @@ const struct tune_params arm_strongarm_tune = >false, /* Prefer Neon for stringops. > */ >8, /* Maximum insns to inline > memset. 
*/ >ARM_FUSE_NOTHING, /* Fuseable pairs of > instructions. */ > - -1 /* Sched L2 autopref depth. */ > + ARM_SCHED_AUTOPREF_OFF /* Sched L2 autopref. */ > }; > > const struct tune_params arm_xscale_tune = > @@ -1763,7 +1763,7 @@ const struct tune_params arm_xscale_tune = >false, /* Prefer Neon for stringops. > */ >8, /* Maximum insns to inline > memset. */ >ARM_FUSE_NOTHING, /* Fuseable pairs of > instructions. */ > - -1 /* Sched L2 autopref depth. */ > + ARM_SCHED_AUTOPREF_OFF /* Sched L2 autopref. */ > }; >
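For readers following the thread, the idea agreed above (name the three supported modes and let the init code translate the mode into concrete settings) can be sketched in isolation. This is only a rough sketch, not the code from the patch: the enum constants and the three values (-1, 0, m_i_q_d+1) come from the discussion, while the function autopref_queue_depth and its m_i_q_d parameter are hypothetical names made up for illustration.

```c
/* Sketch only: the enum constants match the patch; the function and its
   parameter are hypothetical and do not exist in GCC.  */
enum arm_sched_autopref
{
  ARM_SCHED_AUTOPREF_OFF,
  ARM_SCHED_AUTOPREF_RANK,
  ARM_SCHED_AUTOPREF_FULL
};

/* Translate the tuning choice into the value discussed in the thread:
   -1 turns the autoprefetcher model off, 0 turns it on only in
   rank_for_schedule, and m_i_q_d + 1 turns it on everywhere.
   M_I_Q_D is the abbreviation used in the thread.  */
int
autopref_queue_depth (enum arm_sched_autopref mode, int m_i_q_d)
{
  switch (mode)
    {
    case ARM_SCHED_AUTOPREF_RANK:
      return 0;
    case ARM_SCHED_AUTOPREF_FULL:
      return m_i_q_d + 1;
    case ARM_SCHED_AUTOPREF_OFF:
    default:
      return -1;
    }
}
```

With this shape a tuning table can only name one of the three modes, so it cannot carry an arbitrary integer that nobody has tested.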
RFA: RL78: Scan inside PARALLELs when looking for dead code
Hi DJ,

  Here is a small patch to fix a code-gen problem for the RL78. The bug
  was that the register death pass was not looking inside PARALLELs, and
  thus missing some USE and SET cases.

  I considered adding code to scan all of the elements in the PARALLEL,
  but the only ones that can be generated for the RL78 are the SImode
  shift patterns, and these always consist of a SET as the first element
  and a CLOBBER as the second element. So I decided to keep things simple
  and create the patch below.

  OK to apply?

Cheers
  Nick

2015-01-20  Nick Clifton

        * config/rl78/rl78.c (rl78_calculate_death_notes): Look inside
        PARALLELs.

Index: gcc/config/rl78/rl78.c
===
--- gcc/config/rl78/rl78.c (revision 219891)
+++ gcc/config/rl78/rl78.c (working copy)
@@ -3583,6 +3583,8 @@
        {
        case INSN:
          p = PATTERN (insn);
+         if (GET_CODE (p) == PARALLEL)
+           p = XVECEXP (p, 0, 0);
          switch (GET_CODE (p))
            {
            case SET:
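The "look at element 0 of a PARALLEL" step described above can be illustrated outside GCC. The following is a toy sketch under stated assumptions: toy_code, toy_rtx and toy_pattern_to_scan are invented stand-ins and are not GCC's rtx API. It only mirrors the shape of the check in the patch.

```c
#include <stddef.h>

/* Toy stand-ins for RTL codes and rtx objects; not the real GCC types.  */
enum toy_code { TOY_SET, TOY_USE, TOY_CLOBBER, TOY_PARALLEL };

struct toy_rtx
{
  enum toy_code code;
  size_t n_elems;                /* Only meaningful for TOY_PARALLEL.  */
  const struct toy_rtx **elems;  /* Only meaningful for TOY_PARALLEL.  */
};

/* Return the pattern that a death-note scan should examine: the first
   element of a PARALLEL, or the pattern itself otherwise.  This relies on
   the same assumption as the patch, namely that the interesting SET is
   always element 0.  */
const struct toy_rtx *
toy_pattern_to_scan (const struct toy_rtx *p)
{
  if (p->code == TOY_PARALLEL && p->n_elems > 0)
    return p->elems[0];
  return p;
}
```

The trade-off is the one stated in the message: elements after the first are ignored, which is acceptable only while every PARALLEL the port generates is a SET followed by a CLOBBER.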
[PATCH, committed] Fix chkp-always_inline.c test to not fail with -fpic
Hi,

The chkp-always_inline.c test fails with -fpic
(https://gcc.gnu.org/ml/gcc-regression/2015-01/msg00528.html) because it
declares a non-static function with the 'always_inline' attribute. This
patch fixes it. Committed as obvious.

Thanks,
Ilya
--
2015-01-20  Ilya Enkovich

        * gcc.target/i386/chkp-always_inline.c (f1): Make static to avoid
        errors with -fpic.

diff --git a/gcc/testsuite/gcc.target/i386/chkp-always_inline.c b/gcc/testsuite/gcc.target/i386/chkp-always_inline.c
index 16d2358..26e80fe 100644
--- a/gcc/testsuite/gcc.target/i386/chkp-always_inline.c
+++ b/gcc/testsuite/gcc.target/i386/chkp-always_inline.c
@@ -2,7 +2,7 @@
 /* { dg-require-effective-target mpx } */
 /* { dg-options "-fcheck-pointer-bounds -mmpx -O2 -Wno-attributes" } */

-__attribute__((always_inline)) int f1 (int *p)
+static __attribute__((always_inline)) int f1 (int *p)
 {
   return *p;
 }
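A stand-alone illustration of the pattern used in the fix may help. My understanding, stated as an assumption rather than taken from the message, is that under -fpic an externally visible function can be replaced at link time, so GCC declines to inline it and the always_inline request then fails; giving the function internal linkage avoids that. In the sketch below read_through is a hypothetical caller, and the inline keyword is added (the test itself omits it and passes -Wno-attributes instead).

```c
/* Internal linkage: there is no interposable out-of-line copy, so the
   always_inline request can be honoured even when compiling with -fpic.  */
static inline __attribute__ ((always_inline)) int
f1 (int *p)
{
  return *p;
}

/* Hypothetical caller, present only so that f1 is actually inlined
   somewhere.  */
int
read_through (int *p)
{
  return f1 (p);
}
```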
Re: [PATCH][wwwdocs] Document removal of -mlra/-mnolra from Aarch64 and ARM backends.
On 20/01/15 11:59, Matthew Wahab wrote: > Hello, > > This patch documents in changes.html the removal of the -mlra/-mnolra > from the Aarch64 and ARM backends. > > Tested by checking the updated webpage in Firefox. > > Matthew Wahab > > > htdocs_mlra_changes.patch > OK. R. > > Index: htdocs/gcc-5/changes.html > === > RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v > retrieving revision 1.67 > diff -u -r1.67 changes.html > --- htdocs/gcc-5/changes.html 19 Jan 2015 15:48:46 - 1.67 > +++ htdocs/gcc-5/changes.html 20 Jan 2015 11:30:25 - > @@ -456,7 +456,10 @@ > Support for the Cavium ThunderX processor is now available through > the > -mcpu=thunderx and -mtune=thunderx options. > > - > + The transitional options -mlra and > -mno-lra > + have been removed. The AArch64 backend now uses the local register > + allocator (LRA) only. > + > > > ARM > @@ -488,7 +491,10 @@ > The options relating to the old ABI -mapcs and >-mapcs-frame have been deprecated. > > - > + The transitional options -mlra and > -mno-lra > + have been removed. The ARM backend now uses the local register > allocator > + (LRA) only. > + > > > IA-32/x86-64 >
Re: [PATCH] pr 64076 - tolerate different definitions of symbols in lto
On Tue, Jan 20, 2015 at 10:18:25AM +0100, Richard Biener wrote: > On Tue, Jan 20, 2015 at 3:31 AM, wrote: > > From: Trevor Saunders > > > > Hi, > > > > when doing an lto link we can have some symbols be ir only and others be > > machine code, which trips the assert here. Just adjust the assert to handle > > that. > > > > bootstrapped + regtested x86_64-linux-gnu, ok? > > Testcase? It's hard to understand what "machine code" is otherwise > or why this assert would fail. > > Thanks, the one in the pr, basically one file is compiled without -flto and then linked with -flto that + an odr violation making the two object files different sets of thunks. does the lto test suite support not passing -flto to one compilation? it wasn't clear to me how to put the test case in the testsuite. Trev > Richard. > > > Trev > > > > gcc/ > > > > * ipa-visibility.c (update_visibility_by_resolution_info): Only > > assert when not in lto mode. > > --- > > gcc/ipa-visibility.c | 18 +- > > 1 file changed, 13 insertions(+), 5 deletions(-) > > > > diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c > > index 71894af..0791a1c 100644 > > --- a/gcc/ipa-visibility.c > > +++ b/gcc/ipa-visibility.c > > @@ -425,11 +425,19 @@ update_visibility_by_resolution_info (symtab_node * > > node) > >if (node->same_comdat_group) > > for (symtab_node *next = node->same_comdat_group; > > next != node; next = next->same_comdat_group) > > - gcc_assert (!next->externally_visible > > - || define == (next->resolution == > > LDPR_PREVAILING_DEF_IRONLY > > - || next->resolution == LDPR_PREVAILING_DEF > > - || next->resolution == LDPR_UNDEF > > - || next->resolution == > > LDPR_PREVAILING_DEF_IRONLY_EXP)); > > + { > > + if (!next->externally_visible) > > + continue; > > + > > + bool same_def > > + = define == (next->resolution == LDPR_PREVAILING_DEF_IRONLY > > + || next->resolution == LDPR_PREVAILING_DEF > > + || next->resolution == LDPR_UNDEF > > + || next->resolution == > > LDPR_PREVAILING_DEF_IRONLY_EXP); > > + 
gcc_assert (in_lto_p || same_def); > > + if (!same_def) > > + return; > > + } > > > >if (node->same_comdat_group) > > for (symtab_node *next = node->same_comdat_group; > > -- > > 2.1.4 > >
Re: [PATCH] pr 64076 - tolerate different definitions of symbols in lto
On Tue, Jan 20, 2015 at 3:15 PM, Trevor Saunders wrote: > On Tue, Jan 20, 2015 at 10:18:25AM +0100, Richard Biener wrote: >> On Tue, Jan 20, 2015 at 3:31 AM, wrote: >> > From: Trevor Saunders >> > >> > Hi, >> > >> > when doing an lto link we can have some symbols be ir only and others be >> > machine code, which trips the assert here. Just adjust the assert to >> > handle >> > that. >> > >> > bootstrapped + regtested x86_64-linux-gnu, ok? >> >> Testcase? It's hard to understand what "machine code" is otherwise >> or why this assert would fail. >> >> Thanks, > > the one in the pr, basically one file is compiled without -flto and then > linked with -flto that + an odr violation making the two object files > different sets of thunks. > > does the lto test suite support not passing -flto to one compilation? it > wasn't clear to me how to put the test case in the testsuite. Yes, simply add { dg-options -fno-lto } to one of the secondary files. Richard. > Trev > >> Richard. >> >> > Trev >> > >> > gcc/ >> > >> > * ipa-visibility.c (update_visibility_by_resolution_info): Only >> > assert when not in lto mode. 
>> > --- >> > gcc/ipa-visibility.c | 18 +- >> > 1 file changed, 13 insertions(+), 5 deletions(-) >> > >> > diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c >> > index 71894af..0791a1c 100644 >> > --- a/gcc/ipa-visibility.c >> > +++ b/gcc/ipa-visibility.c >> > @@ -425,11 +425,19 @@ update_visibility_by_resolution_info (symtab_node * >> > node) >> >if (node->same_comdat_group) >> > for (symtab_node *next = node->same_comdat_group; >> > next != node; next = next->same_comdat_group) >> > - gcc_assert (!next->externally_visible >> > - || define == (next->resolution == >> > LDPR_PREVAILING_DEF_IRONLY >> > - || next->resolution == LDPR_PREVAILING_DEF >> > - || next->resolution == LDPR_UNDEF >> > - || next->resolution == >> > LDPR_PREVAILING_DEF_IRONLY_EXP)); >> > + { >> > + if (!next->externally_visible) >> > + continue; >> > + >> > + bool same_def >> > + = define == (next->resolution == LDPR_PREVAILING_DEF_IRONLY >> > + || next->resolution == LDPR_PREVAILING_DEF >> > + || next->resolution == LDPR_UNDEF >> > + || next->resolution == >> > LDPR_PREVAILING_DEF_IRONLY_EXP); >> > + gcc_assert (in_lto_p || same_def); >> > + if (!same_def) >> > + return; >> > + } >> > >> >if (node->same_comdat_group) >> > for (symtab_node *next = node->same_comdat_group; >> > -- >> > 2.1.4 >> >
Re: [4.8] Backport PR61553
On Tue, Jan 20, 2015 at 12:51:45PM +0100, Jakub Jelinek wrote:
> On Tue, Jan 20, 2015 at 12:49:07PM +0100, Marek Polacek wrote:
> > I'd like to backport this patch from trunk to 4.9 so as to
> > fix PR63751. It's safe and has been on trunk for several months.
> >
> > Bootstrapped/regtested on x86_64-linux, ok?
>
> To 4.9 or 4.8 (subject says 4.8, above is 4.9).
> I think it would be ok to both.

Eek, I meant 4.9. But 4.8 has the same issue, so I've committed the
slightly adjusted patch to both 4.9 and 4.8.

        Marek
Re: Fix 59828 - Broken assembly on ppc* with two -mcpu= options
On Tue, Jan 20, 2015 at 12:41 AM, Alan Modra wrote:
> On Mon, Jan 19, 2015 at 10:43:29PM -0500, David Edelsohn wrote:
>> On Fri, Jan 17, 2014 at 10:58 PM, Alan Modra wrote:
>> > This patch cures PR59828 by translating all the -mcpu options at once,
>> > in order, to their equivalent assembler -m options by using a new spec
>> > function. In the process this removes some duplication.
>>
>> ASM_CPU_SPEC is too fragile a mechanism. I would much prefer to
>> expand on the ".machine" directive that I added to
>> rs6000_file_start(). The initial implementation explicitly avoids
>> .machine when -mcpu= or --with-cpu= is present as a conservative
>> start.
>>
>> It seems much better to select a .machine directive based on the
>> actual target ISA flag bits enabled than translating CPU command line
>> options to ASM options. Patches to replace ASM_CPU_SPEC with .machine
>> and expand functionality for AIX are welcome.
>
> This might make sense when looking only at gcc, but when considering
> the whole toolchain I think you'll run into difficulty. gas and other
> powerpc assemblers have always been invoked with -m options to select
> the cpu, so if you do away with ASM_CPU_SPEC and rely on .machine then
> you will be exercising the assembler in a new way. I am sure that
> this will not work for all powerpc assemblers currently in use.

This would stress .machine more than it has been stressed in the past,
but it is functionality that is supposed to work. .machine has already
been stressed more by IFUNC pushing and popping ISAs.

Are you concerned about the fundamental functionality of the pseudo-op
or about a particular GAS release missing support for a particular ISA?

AIX supports .machine, but I think that it expects slightly different
processor names. I am not certain about LLVM-AS, but it normally is not
fed external assembly language files.

Thanks, David
Re: [C++ PATCH, RFC] PR c++/63959, continued
On 20 January 2015 at 15:06, Jason Merrill wrote: > On 01/19/2015 06:06 PM, Ville Voutilainen wrote: >> >> + return (trivial_type_p (type1) >> + || (scalarish_type_p (type1) && CP_TYPE_VOLATILE_P (type1)) >> + || type_code1 == REFERENCE_TYPE >> || (CLASS_TYPE_P (type1) >> && TYPE_HAS_TRIVIAL_DESTRUCTOR (type1))); > > > I think we can drop the trivial_type_p check now, it's redundant. Well, we can do even better, just check for scalarish_type instead of trivial || scalarish && volatile. ;) Patch attached, changelog as before. I re-ran the tests that previously were failing in the first versions of the patch, didn't run the full suite again, as I unfortunately need to run. :) diff --git a/gcc/cp/class.c b/gcc/cp/class.c index edb87fe..529a2bf 100644 --- a/gcc/cp/class.c +++ b/gcc/cp/class.c @@ -3717,6 +3717,16 @@ check_field_decls (tree t, tree *access_decls, if (DECL_INITIAL (x) && cxx_dialect < cxx14) CLASSTYPE_NON_AGGREGATE (t) = true; + /* If any field is volatile, the structure type has complex copy +and move operations. */ + if (CP_TYPE_VOLATILE_P (type)) + { + TYPE_HAS_COMPLEX_COPY_ASSIGN (t) = 1; + TYPE_HAS_COMPLEX_MOVE_ASSIGN (t) = 1; + TYPE_HAS_COMPLEX_COPY_CTOR (t) = 1; + TYPE_HAS_COMPLEX_MOVE_CTOR (t) = 1; + } + /* If any field is const, the structure type is pseudo-const. 
*/ if (CP_TYPE_CONST_P (type)) { diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 15b8d01..5856670 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -7383,7 +7383,7 @@ trait_expr_value (cp_trait_kind kind, tree type1, tree type2) case CPTK_HAS_TRIVIAL_DESTRUCTOR: type1 = strip_array_types (type1); - return (trivial_type_p (type1) || type_code1 == REFERENCE_TYPE + return (scalarish_type_p (type1) || type_code1 == REFERENCE_TYPE || (CLASS_TYPE_P (type1) && TYPE_HAS_TRIVIAL_DESTRUCTOR (type1))); diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c index 80f2ce6..79be556 100644 --- a/gcc/cp/tree.c +++ b/gcc/cp/tree.c @@ -3211,7 +3211,8 @@ trivially_copyable_p (const_tree t) && (!TYPE_HAS_COPY_ASSIGN (t) || !TYPE_HAS_COMPLEX_COPY_ASSIGN (t)) && !TYPE_HAS_COMPLEX_MOVE_ASSIGN (t) - && TYPE_HAS_TRIVIAL_DESTRUCTOR (t)); + && TYPE_HAS_TRIVIAL_DESTRUCTOR (t) + && !CP_TYPE_VOLATILE_P (t)); else return !CP_TYPE_VOLATILE_P (t) && scalarish_type_p (t); } @@ -3228,7 +3229,7 @@ trivial_type_p (const_tree t) return (TYPE_HAS_TRIVIAL_DFLT (t) && trivially_copyable_p (t)); else -return scalarish_type_p (t); +return !CP_TYPE_VOLATILE_P (t) && scalarish_type_p (t); } /* Returns 1 iff type T is a POD type, as defined in [basic.types]. 
*/ diff --git a/gcc/testsuite/g++.dg/cpp0x/trivial1.C b/gcc/testsuite/g++.dg/cpp0x/trivial1.C index 3fed570..d0d7b46 100644 --- a/gcc/testsuite/g++.dg/cpp0x/trivial1.C +++ b/gcc/testsuite/g++.dg/cpp0x/trivial1.C @@ -18,6 +18,9 @@ #define TRY(expr) static_assert (expr, #expr) #define YES(type) TRY(std::is_trivial::value); \ TRY(std::is_trivial::value); \ + TRY(!std::is_trivial::value) +#define YES2(type) TRY(std::is_trivial::value); \ + TRY(std::is_trivial::value); \ TRY(std::is_trivial::value) #define NO(type) TRY(!std::is_trivial::value); \ TRY(!std::is_trivial::value); \ @@ -27,8 +30,8 @@ struct A; YES(int); YES(__complex int); -YES(void *); -YES(int A::*); +YES2(void *); +YES2(int A::*); typedef int (A::*pmf)(); YES(pmf); diff --git a/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C b/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C index a5bac7b..35ef1f1 100644 --- a/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C +++ b/gcc/testsuite/g++.dg/ext/is_trivially_constructible1.C @@ -39,5 +39,16 @@ SA(!__is_trivially_copyable(volatile int)); struct E1 {const int val;}; SA(__is_trivially_copyable(E1)); +SA(!__is_trivially_copyable(volatile E1)); struct E2 {int& val;}; SA(__is_trivially_copyable(E2)); +struct E3 {volatile int val;}; +SA(!__is_trivially_copyable(E3)); +struct E4 {A a;}; +SA(!__is_trivially_copyable(volatile E4)); +struct E5 {volatile A a;}; +SA(!__is_trivially_copyable(E5)); +SA(!__is_trivially_copyable(volatile E5)); +struct E6 : A {}; +SA(__is_trivially_copyable(E6)); +SA(!__is_trivially_copyable(volatile E6)); diff --git a/libstdc++-v3/testsuite/20_util/has_trivial_copy_constructor/value.cc b/libstdc++-v3/testsuite/20_util/has_trivial_copy_constructor/value.cc index ea58a22..9cfe672 100644 --- a/libstdc++-v3/testsuite/20_util/has_trivial_copy_constructor/value.cc +++ b/libstdc++-v3/testsuite/20_util/has_trivial_copy_constructor/value.cc @@ -31,30 +31,31 @@ void test01() using namespace __gnu_test; // Positive tests. 
- static_assert(test_category(true), ""); - static_assert(test_category(true), ""); - st
[PATCH] Skip g++.dg/tls tests on target using status wrapper
Hi all,

This patch adds the "unwrapped" target selector to the g++.dg/tls tests.

When the "needs_status_wrapper" flag is set in a board file, testglue.o
is used to wrap standard functions such as exit, _exit and abort, and to
print "*** EXIT code X" for DejaGnu. This message is printed only once.
Normally, exit() is called first with the return value of main, and this
value is fed back to DejaGnu. The real exit code from _exit(), however,
is not captured.

This patch skips those tests, because DejaGnu does not capture the
intended exit code correctly.

Okay to commit?

Regards,
Renlin Li

gcc/testsuite/ChangeLog:

2015-01-20  Renlin Li

        * g++.dg/tls/thread_local5.C: Skip when dejagnu wrapper is used.
        * g++.dg/tls/thread_local5g.C: Likewise.
        * g++.dg/tls/thread_local6g.C: Likewise.

diff --git a/gcc/testsuite/g++.dg/tls/thread_local5.C b/gcc/testsuite/g++.dg/tls/thread_local5.C
index 8d17584..c4d5ff0 100644
--- a/gcc/testsuite/g++.dg/tls/thread_local5.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local5.C
@@ -2,6 +2,7 @@
 // { dg-do run }
 // { dg-require-effective-target c++11 }
+// { dg-require-effective-target unwrapped }
 // { dg-require-effective-target tls_runtime }
 // { dg-require-effective-target pthread }
 // { dg-options -pthread }
diff --git a/gcc/testsuite/g++.dg/tls/thread_local5g.C b/gcc/testsuite/g++.dg/tls/thread_local5g.C
index f87b038..5ced551 100644
--- a/gcc/testsuite/g++.dg/tls/thread_local5g.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local5g.C
@@ -2,6 +2,7 @@
 // { dg-do run }
 // { dg-require-effective-target c++11 }
+// { dg-require-effective-target unwrapped }
 // { dg-require-effective-target tls_runtime }
 // { dg-require-effective-target pthread }
 // { dg-require-cxa-atexit "" }
diff --git a/gcc/testsuite/g++.dg/tls/thread_local6g.C b/gcc/testsuite/g++.dg/tls/thread_local6g.C
index f261d54..b8f9cdf 100644
--- a/gcc/testsuite/g++.dg/tls/thread_local6g.C
+++ b/gcc/testsuite/g++.dg/tls/thread_local6g.C
@@ -2,6 +2,7 @@
 // { dg-do run { target c++11 } }
 // { dg-add-options tls }
+// { dg-require-effective-target unwrapped }
 // { dg-require-effective-target tls_runtime }
 // { dg-require-cxa-atexit "" }
Re: [PATCH] Fix PR64535 - increase emergency EH buffers via a new allocator
On 20/01/15 14:25 +0100, Richard Biener wrote:
> Though my question was whether 'throw X()' might instead behave as
> 'throw std::bad_alloc()' if allocating X exceeds the system's resource
> limits. I guess the answer is yes as it's undefined?

Right, if we can't allocate memory for an X (plus the EH header) then
it's UB, so we can do anything, including using your pool to get room
for a bad_alloc and throwing that.
[PATCH][AARCH64]Fix TLS local exec model addressing code generation inconsistency.
Hi all,

The following code sequence should be generated for the TLS local exec
model in the aarch64 backend.

  add t0, tp, #:tprel_hi12:x1, lsl #12
  add t0, #:tprel_lo12_nc:x1

However, this is what we currently generate with the -S option.

  add t0, tp, #:tprel_hi12:x1   <-- (1)
  add t0, #:tprel_lo12_nc:x1

At first sight this looks incorrect: the tprel_hi12 value should be
shifted left by 12 bits before it is added to the thread pointer.
However, gas recognises the tprel_hi12 relocation modifier and rewrites
the instruction marked (1) above into the shifted form, so the final
behaviour is correct.

I still think the inconsistency is very confusing, because the assembly
generated by GCC differs from the disassembly of the resulting object
code. This patch should fix this small issue.

Okay to commit?

Regards,
Renlin Li

gcc/ChangeLog:

2015-01-20  Renlin Li

        * config/aarch64/aarch64.c (aarch64_load_symref_appropriately):
        Correct the comment.
        * config/aarch64/aarch64.md (tlsle_small_): Add left shift 12-bit
        for higher part.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ee9a962..cd01b82 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -692,8 +692,8 @@ tls_symbolic_operand_type (rtx addr)
    Local Exec:
      mrs tp, tpidr_el0
-     add t0, tp, #:tprel_hi12:imm
-     add t0, #:tprel_lo12_nc:imm
+     add t0, tp, #:tprel_hi12:imm, lsl #12
+     add t0, t0, #:tprel_lo12_nc:imm
 */
 static void
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 597ff8c..90f7bf4 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4039,7 +4039,7 @@
           (match_operand 2 "aarch64_tls_le_symref" "S")]
          UNSPEC_GOTSMALLTLS))]
   ""
-  "add\\t%0, %1, #%G2\;add\\t%0, %0, #%L2"
+  "add\\t%0, %1, #%G2, lsl #12\;add\\t%0, %0, #%L2"
   [(set_attr "type" "alu_sreg")
    (set_attr "length" "8")]
 )
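To make the arithmetic behind the two add instructions explicit, here is a small sketch of what the sequence computes. It assumes that :tprel_hi12: selects bits [23:12] of the thread-pointer-relative offset and :tprel_lo12_nc: selects bits [11:0]. The helper names (tprel_hi12, tprel_lo12, tlsle_address) are made up for this illustration and are not part of GCC or gas.

```c
#include <stdint.h>

/* Bits [23:12] of the TP-relative offset, as selected by :tprel_hi12:.  */
uint64_t
tprel_hi12 (uint64_t off)
{
  return (off >> 12) & 0xfff;
}

/* Bits [11:0] of the TP-relative offset, as selected by :tprel_lo12_nc:.  */
uint64_t
tprel_lo12 (uint64_t off)
{
  return off & 0xfff;
}

/* Model of the local exec sequence for offsets below 2^24.  */
uint64_t
tlsle_address (uint64_t tp, uint64_t off)
{
  /* add t0, tp, #:tprel_hi12:sym, lsl #12  */
  uint64_t t0 = tp + (tprel_hi12 (off) << 12);
  /* add t0, t0, #:tprel_lo12_nc:sym  */
  t0 = t0 + tprel_lo12 (off);
  return t0;
}
```

Without the shift, the first add would contribute only the 12-bit field itself rather than that field multiplied by 4096, which is why the unshifted text reads as wrong even though gas emits the shifted encoding.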
[PATCH, nios2] Updates to Nios II Linux
The Nios II ports of glibc and Linux kernel are now both upstream. New system conventions now use a non-executable stack. Attached patch committed to support new conventions, applied to both trunk and 4.9 branch. Chung-Lin 2015-01-20 Chung-Lin Tang gcc/ * config/nios2/nios2.c (nios2_asm_file_end): Implement TARGET_ASM_FILE_END hook for adding .note.GNU-stack section when needed. (TARGET_ASM_FILE_END): Define. libgcc/ * config/nios2/linux-unwind.h (nios2_fallback_frame_state): Update rt_sigframe format and address for current Nios II Linux conventions. Index: libgcc/config/nios2/linux-unwind.h === --- libgcc/config/nios2/linux-unwind.h (revision 219897) +++ libgcc/config/nios2/linux-unwind.h (working copy) @@ -67,10 +67,9 @@ nios2_fallback_frame_state (struct _Unwind_Context if (pc[0] == (0x0084 | (__NR_rt_sigreturn << 6))) { struct rt_sigframe { - char retcode[12]; siginfo_t info; struct nios2_ucontext uc; - } *rt_ = context->ra; + } *rt_ = context->cfa; struct nios2_mcontext *regs = &rt_->uc.uc_mcontext; int i; Index: gcc/config/nios2/nios2.c === --- gcc/config/nios2/nios2.c (revision 219897) +++ gcc/config/nios2/nios2.c (working copy) @@ -2223,6 +2223,18 @@ nios2_output_dwarf_dtprel (FILE *file, int size, r fprintf (file, ")"); } +/* Implemet TARGET_ASM_FILE_END. */ + +static void +nios2_asm_file_end (void) +{ + /* The Nios II Linux stack is mapped non-executable by default, so add a + .note.GNU-stack section for switching to executable stacks only when + trampolines are generated. */ + if (TARGET_LINUX_ABI && trampolines_created) +file_end_indicate_exec_stack (); +} + /* Implement TARGET_ASM_FUNCTION_PROLOGUE. 
*/ static void nios2_asm_function_prologue (FILE *file, HOST_WIDE_INT size ATTRIBUTE_UNUSED) @@ -3401,6 +3413,9 @@ nios2_merge_decl_attributes (tree olddecl, tree ne #undef TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA nios2_output_addr_const_extra +#undef TARGET_ASM_FILE_END +#define TARGET_ASM_FILE_END nios2_asm_file_end + #undef TARGET_OPTION_OVERRIDE #define TARGET_OPTION_OVERRIDE nios2_option_override
Re: [[ARM/AArch64][testsuite] 00/36] More Neon intrinsics tests.
On 19 January 2015 at 17:52, Marcus Shawcroft wrote:
> On 13 January 2015 at 15:17, Christophe Lyon wrote:
>> This patch series is a follow-up of the conversion of my existing
>> testsuite into DejaGnu. It does not yet cover all the tests I wrote,
>> but I chose to post this set to have a chance to have it accepted
>> before stage 4. I will have 35 more files to convert after this set.
>
> Christophe, can you respin 9,13,17,21,28,29,30 to address Tejas' comments?

Sure. Here they are.

> Thanks
> /Marcus
[PING] [PATCH] [AArch64, NEON] Improve vmulX intrinsics
Hi, This is a ping for: https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00775.html Regtested with aarch64-linux-gnu on QEMU. This patch has no regressions for aarch64_be-linux-gnu big-endian target too. OK for the trunk? Thanks. Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 219845) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,38 @@ +2014-12-11 Felix Yang + Jiji Jiang + + * config/aarch64/aarch64-simd.md (aarch64_mul_n, + aarch64_mull_n, aarch64_mull, + aarch64_simd_mull2_n, aarch64_mull2_n, + aarch64_mull_lane, aarch64_mull2_lane_internal, + aarch64_mull_laneq, aarch64_mull2_laneq_internal, + aarch64_smull2_lane, aarch64_umull2_lane, + aarch64_smull2_laneq, aarch64_umull2_laneq, + aarch64_fmulx, aarch64_fmulx, aarch64_fmulx_lane, + aarch64_pmull2v16qi, aarch64_pmullv8qi): New patterns. + * config/aarch64/aarch64-simd-builtins.def (vec_widen_smult_hi_, + vec_widen_umult_hi_, umull, smull, smull_n, umull_n, mul_n, smull2_n, + umull2_n, smull_lane, umull_lane, smull_laneq, umull_laneq, pmull, + umull2_lane, smull2_laneq, umull2_laneq, fmulx, fmulx_lane, pmull2, + smull2_lane): New builtins. 
+ * config/aarch64/arm_neon.h (vmul_n_f32, vmul_n_s16, vmul_n_s32, + vmul_n_u16, vmul_n_u32, vmulq_n_f32, vmulq_n_f64, vmulq_n_s16, + vmulq_n_s32, vmulq_n_u16, vmulq_n_u32, vmull_high_lane_s16, + vmull_high_lane_s32, vmull_high_lane_u16, vmull_high_lane_u32, + vmull_high_laneq_s16, vmull_high_laneq_s32, vmull_high_laneq_u16, + vmull_high_laneq_u32, vmull_high_n_s16, vmull_high_n_s32, + vmull_high_n_u16, vmull_high_n_u32, vmull_high_p8, vmull_high_s8, + vmull_high_s16, vmull_high_s32, vmull_high_u8, vmull_high_u16, + vmull_high_u32, vmull_lane_s16, vmull_lane_s32, vmull_lane_u16, + vmull_lane_u32, vmull_laneq_s16, vmull_laneq_s32, vmull_laneq_u16, + vmull_laneq_u32, vmull_n_s16, vmull_n_s32, vmull_n_u16, vmull_n_u32, + vmull_p8, vmull_s8, vmull_s16, vmull_s32, vmull_u8, vmull_u16, + vmull_u32, vmulx_f32, vmulx_lane_f32, vmulxd_f64, vmulxq_f32, + vmulxq_f64, vmulxq_lane_f32, vmulxq_lane_f64, vmulxs_f32): Rewrite + using builtin functions. + * config/aarch64/iterators.md (UNSPEC_FMULX, UNSPEC_FMULX_LANE, + VDQF_Q): New unspec and int iterator. 
+ 2015-01-19 Jiong Wang Andrew Pinski Index: gcc/config/aarch64/arm_neon.h === --- gcc/config/aarch64/arm_neon.h (revision 219845) +++ gcc/config/aarch64/arm_neon.h (working copy) @@ -7580,671 +7580,6 @@ vmovn_u64 (uint64x2_t a) return result; } -__extension__ static __inline float32x2_t __attribute__ ((__always_inline__)) -vmul_n_f32 (float32x2_t a, float32_t b) -{ - float32x2_t result; - __asm__ ("fmul %0.2s,%1.2s,%2.s[0]" - : "=w"(result) - : "w"(a), "w"(b) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) -vmul_n_s16 (int16x4_t a, int16_t b) -{ - int16x4_t result; - __asm__ ("mul %0.4h,%1.4h,%2.h[0]" - : "=w"(result) - : "w"(a), "x"(b) - : /* No clobbers */); - return result; -} - -__extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) -vmul_n_s32 (int32x2_t a, int32_t b) -{ - int32x2_t result; - __asm__ ("mul %0.2s,%1.2s,%2.s[0]" - : "=w"(result) - : "w"(a), "w"(b) - : /* No clobbers */); - return result; -} - -__extension__ static __inline uint16x4_t __attribute__ ((__always_inline__)) -vmul_n_u16 (uint16x4_t a, uint16_t b) -{ - uint16x4_t result; - __asm__ ("mul %0.4h,%1.4h,%2.h[0]" - : "=w"(result) - : "w"(a), "x"(b) - : /* No clobbers */); - return result; -} - -__extension__ static __inline uint32x2_t __attribute__ ((__always_inline__)) -vmul_n_u32 (uint32x2_t a, uint32_t b) -{ - uint32x2_t result; - __asm__ ("mul %0.2s,%1.2s,%2.s[0]" - : "=w"(result) - : "w"(a), "w"(b) - : /* No clobbers */); - return result; -} - -#define vmull_high_lane_s16(a, b, c)\ - __extension__ \ -({ \ - int16x4_t b_ = (b); \ - int16x8_t a_ = (a); \ - int32x4_t result;\ - __asm__ ("smull2 %0.4s, %1.8h, %2.h[%3]" \ -: "=w"(result) \ -: "w"(a_), "x"(b_), "i"(c) \ -: /* No clobbers */);
Re: [[ARM/AArch64][testsuite] 09/36] Add vsubhn, vraddhn and vrsubhn tests. Split vaddhn.c into vXXXhn.inc and vaddhn.c to share code with other new tests.
On 16 January 2015 at 17:30, Christophe Lyon wrote: > On 16 January 2015 at 17:07, Tejas Belagod wrote: >> On 13/01/15 15:18, Christophe Lyon wrote: >>> >>> >>> * gcc.target/aarch64/advsimd-intrinsics/vXXXhn.inc: New file. >>> * gcc.target/aarch64/advsimd-intrinsics/vraddhn.c: New file. >>> * gcc.target/aarch64/advsimd-intrinsics/vrsubhn.c: New file. >>> * gcc.target/aarch64/advsimd-intrinsics/vsubhn.c: New file. >>> * gcc.target/aarch64/advsimd-intrinsics/vaddhn.c: Use code from >>> vXXXhn.inc. >>> >>> diff --git >>> a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn.inc >>> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn.inc >>> new file mode 100644 >>> index 000..0dbcc92 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn.inc >>> @@ -0,0 +1,50 @@ >>> +#define FNNAME1(NAME) exec_ ## NAME >>> +#define FNNAME(NAME) FNNAME1(NAME) >>> + >>> +void FNNAME (INSN_NAME) (void) >>> +{ >>> + /* Basic test: vec64=vaddhn(vec128_a, vec128_b), then store the result. >>> */ >>> +#define TEST_VADDHN1(INSN, T1, T2, W, W2, N) \ >>> + VECT_VAR(vector64, T1, W2, N) = INSN##_##T2##W(VECT_VAR(vector1, T1, W, >>> N), \ >>> +VECT_VAR(vector2, T1, W, >>> N)); \ >>> + vst1_##T2##W2(VECT_VAR(result, T1, W2, N), VECT_VAR(vector64, T1, W2, >>> N)) >>> + >>> +#define TEST_VADDHN(INSN, T1, T2, W, W2, N)\ >>> + TEST_VADDHN1(INSN, T1, T2, W, W2, N) >>> + >> >> >> Minor nit. If this is a template file, maybe you should name this macro >> TEST_ADDHN as TEST_XXHN? Just that a template having an INSN name is >> confusing. > Agreed. > >>> +VECT_VAR_DECL(expected,poly,8,8) [] = { 0x33, 0x33, 0x33, 0x33, >>> + 0x33, 0x33, 0x33, 0x33 }; >>> +VECT_VAR_DECL(expected,poly,16,4) [] = { 0x, 0x, 0x, 0x >>> }; >> >> >> Though never used, poly seems to have sneaked in here too. > Indeed, sorry for that. > I rushed to have as many tests as possible ready before stage 4, but > obviously I missed a few cleanups. 
> Here is an updated version, where I have removed a few more useless variables than you noticed: the [u]int64x1 as well as the 128 bits ones. Christophe. >> Otherwise, LGTM. >> >> Thanks, >> Tejas. >> From 3acafe49a7402b859a88d1ef808b828a0acf96c4 Mon Sep 17 00:00:00 2001 From: Christophe Lyon Date: Tue, 2 Dec 2014 15:05:30 +0100 Subject: [[ARM/AArch64][testsuite] 09/36] Add vsubhn, vraddhn and vrsubhn tests. Split vaddhn.c into vXXXhn.inc and vaddhn.c to share code with other new tests. diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn.inc b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn.inc new file mode 100644 index 000..5aabedd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn.inc @@ -0,0 +1,55 @@ +#define FNNAME1(NAME) exec_ ## NAME +#define FNNAME(NAME) FNNAME1(NAME) + +void FNNAME (INSN_NAME) (void) +{ + /* Basic test: vec64=vXXXhn(vec128_a, vec128_b), then store the result. */ +#define TEST_VXXXHN1(INSN, T1, T2, W, W2, N)\ + VECT_VAR(vector64, T1, W2, N) = INSN##_##T2##W(VECT_VAR(vector1, T1, W, N), \ + VECT_VAR(vector2, T1, W, N)); \ + vst1_##T2##W2(VECT_VAR(result, T1, W2, N), VECT_VAR(vector64, T1, W2, N)) + +#define TEST_VXXXHN(INSN, T1, T2, W, W2, N) \ + TEST_VXXXHN1(INSN, T1, T2, W, W2, N) + + DECL_VARIABLE_64BITS_VARIANTS(vector64); + DECL_VARIABLE_128BITS_VARIANTS(vector1); + DECL_VARIABLE_128BITS_VARIANTS(vector2); + + clean_results (); + + /* Fill input vector1 and vector2 with arbitrary values */ + VDUP(vector1, q, int, s, 16, 8, 50*(UINT8_MAX+1)); + VDUP(vector1, q, int, s, 32, 4, 50*(UINT16_MAX+1)); + VDUP(vector1, q, int, s, 64, 2, 24*((uint64_t)UINT32_MAX+1)); + VDUP(vector1, q, uint, u, 16, 8, 3*(UINT8_MAX+1)); + VDUP(vector1, q, uint, u, 32, 4, 55*(UINT16_MAX+1)); + VDUP(vector1, q, uint, u, 64, 2, 3*((uint64_t)UINT32_MAX+1)); + + VDUP(vector2, q, int, s, 16, 8, (uint16_t)UINT8_MAX); + VDUP(vector2, q, int, s, 32, 4, (uint32_t)UINT16_MAX); + VDUP(vector2, q, int, s, 64, 2, 
(uint64_t)UINT32_MAX); + VDUP(vector2, q, uint, u, 16, 8, (uint16_t)UINT8_MAX); + VDUP(vector2, q, uint, u, 32, 4, (uint32_t)UINT16_MAX); + VDUP(vector2, q, uint, u, 64, 2, (uint64_t)UINT32_MAX); + + TEST_VXXXHN(INSN_NAME, int, s, 16, 8, 8); + TEST_VXXXHN(INSN_NAME, int, s, 32, 16, 4); + TEST_VXXXHN(INSN_NAME, int, s, 64, 32, 2); + TEST_VXXXHN(INSN_NAME, uint, u, 16, 8, 8); + TEST_VXXXHN(INSN_NAME, uint, u, 32, 16, 4); + TEST_VXXXHN(INSN_NAME, uint, u, 64, 32, 2); + + CHECK(TEST_MSG, int, 8, 8, PRIx8, expected, ""); + CHECK(TEST_MSG, int, 16, 4, PRIx16, expected, ""); + CHECK(TEST_MSG, int, 32, 2, PRIx32, expected, ""); + CHECK(TEST_MSG, uint, 8, 8, PRIx8, expected, ""); + CHECK(TEST_MSG, uint, 16, 4, PRIx16, expected, ""); + CHECK(TEST_MSG, uint, 32, 2, PRIx32, expected, ""); +} + +int ma
Re: [[ARM/AArch64][testsuite] 21/36] Add vmovl tests.
On 16 January 2015 at 19:15, Tejas Belagod wrote: > On 13/01/15 15:18, Christophe Lyon wrote: >> >> * gcc.target/aarch64/advsimd-intrinsics/vmovl.c: New file. >> >> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovl.c >> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovl.c >> new file mode 100644 >> index 000..427c9ba >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovl.c >> @@ -0,0 +1,77 @@ >> +#include >> +#include "arm-neon-ref.h" >> +#include "compute-ref-data.h" >> + >> +/* Expected results. */ >> +VECT_VAR_DECL(expected,int,8,8) [] = { 0x33, 0x33, 0x33, 0x33, >> + 0x33, 0x33, 0x33, 0x33 }; >> +VECT_VAR_DECL(expected,int,16,4) [] = { 0x, 0x, 0x, 0x }; >> +VECT_VAR_DECL(expected,int,32,2) [] = { 0x, 0x }; >> +VECT_VAR_DECL(expected,int,64,1) [] = { 0x }; >> +VECT_VAR_DECL(expected,uint,8,8) [] = { 0x33, 0x33, 0x33, 0x33, >> + 0x33, 0x33, 0x33, 0x33 }; >> +VECT_VAR_DECL(expected,uint,16,4) [] = { 0x, 0x, 0x, 0x >> }; >> +VECT_VAR_DECL(expected,uint,32,2) [] = { 0x, 0x }; >> +VECT_VAR_DECL(expected,uint,64,1) [] = { 0x }; >> +VECT_VAR_DECL(expected,poly,8,8) [] = { 0x33, 0x33, 0x33, 0x33, >> + 0x33, 0x33, 0x33, 0x33 }; >> +VECT_VAR_DECL(expected,poly,16,4) [] = { 0x, 0x, 0x, 0x >> }; >> +VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x, 0x }; >> +VECT_VAR_DECL(expected,int,8,16) [] = { 0x33, 0x33, 0x33, 0x33, >> + 0x33, 0x33, 0x33, 0x33, >> + 0x33, 0x33, 0x33, 0x33, >> + 0x33, 0x33, 0x33, 0x33 }; >> +VECT_VAR_DECL(expected,int,16,8) [] = { 0xfff0, 0xfff1, 0xfff2, 0xfff3, >> + 0xfff4, 0xfff5, 0xfff6, 0xfff7 }; >> +VECT_VAR_DECL(expected,int,32,4) [] = { 0xfff0, 0xfff1, >> + 0xfff2, 0xfff3 }; >> +VECT_VAR_DECL(expected,int,64,2) [] = { 0xfff0, >> + 0xfff1 }; >> +VECT_VAR_DECL(expected,uint,8,16) [] = { 0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33 }; >> +VECT_VAR_DECL(expected,uint,16,8) [] = { 0xf0, 0xf1, 0xf2, 0xf3, >> +0xf4, 0xf5, 0xf6, 0xf7 }; 
>> +VECT_VAR_DECL(expected,uint,32,4) [] = { 0xfff0, 0xfff1, 0xfff2, 0xfff3 >> }; >> +VECT_VAR_DECL(expected,uint,64,2) [] = { 0xfff0, 0xfff1 }; >> +VECT_VAR_DECL(expected,poly,8,16) [] = { 0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33 }; >> +VECT_VAR_DECL(expected,poly,16,8) [] = { 0x, 0x, 0x, 0x, >> +0x, 0x, 0x, 0x }; >> +VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x, 0x, >> + 0x, 0x }; >> + > > > No poly or float for vmovl. > Here is a new version, with more cleanup: only 16x8, 32x4 and 64x2 variants are necessary. > Otherwise, LGTM. > > Tejas. > > From 2f56acd54cee2d9b9b62de9e624fb2a64f114101 Mon Sep 17 00:00:00 2001 From: Christophe Lyon Date: Wed, 10 Dec 2014 17:24:48 +0100 Subject: [[ARM/AArch64][testsuite] 21/36] Add vmovl tests. diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovl.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovl.c new file mode 100644 index 000..fd94d72 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovl.c @@ -0,0 +1,52 @@ +#include +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +/* Expected results. */ +VECT_VAR_DECL(expected,int,16,8) [] = { 0xfff0, 0xfff1, 0xfff2, 0xfff3, + 0xfff4, 0xfff5, 0xfff6, 0xfff7 }; +VECT_VAR_DECL(expected,int,32,4) [] = { 0xfff0, 0xfff1, + 0xfff2, 0xfff3 }; +VECT_VAR_DECL(expected,int,64,2) [] = { 0xfff0, + 0xfff1 }; +VECT_VAR_DECL(expected,uint,16,8) [] = { 0xf0, 0xf1, 0xf2, 0xf3, + 0xf4, 0xf5, 0xf6, 0xf7 }; +VECT_VAR_DECL(expected,uint,32,4) [] = { 0xfff0, 0xfff1, 0xfff2, 0xfff3 }; +VECT_VAR_DECL(expected,uint,64,2) [] = { 0xfff0, 0xfff1 }; + +#define TEST_MSG "VMOVL" +void exec_vmovl (void) +{ + /* Basic test: vec128=vmovl(vec64), then store the result. */ +#define TEST_VMOVL(T1, T2, W, W2, N) \ + VECT_VAR(vector128, T1, W2, N) = \ +vmovl_##T2##W(VECT_VAR(vector64, T1, W, N)); \ + vst1q_##T2##W2(VECT_VAR(result, T1, W2, N), VECT_VAR
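The expected values in the revised patch follow from what vmovl does: every lane is widened to twice its width, sign-extending signed lanes and zero-extending unsigned ones. Below is a minimal scalar sketch of that behaviour in plain C++, without NEON intrinsics. The helper name `widen_lanes` is hypothetical, and the input lanes 0xf0, 0xf1, ... are an assumption inferred from the expected results, not taken from compute-ref-data.h.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of vmovl: widen every lane of the input vector
// into the matching lane of a vector with lanes twice as wide.  The
// conversion sign-extends signed lanes and zero-extends unsigned lanes.
template <typename Wide, typename Narrow, std::size_t N>
std::array<Wide, N> widen_lanes(const std::array<Narrow, N>& in)
{
  std::array<Wide, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = static_cast<Wide>(in[i]);
  return out;
}

// Assumed input: lanes 0xf0, 0xf1, ... (inferred from the expected values).
inline void check_vmovl_model()
{
  std::array<std::int8_t, 8> s8{};
  std::array<std::uint8_t, 8> u8{};
  for (std::size_t i = 0; i < 8; ++i) {
    s8[i] = static_cast<std::int8_t>(-16 + static_cast<int>(i));
    u8[i] = static_cast<std::uint8_t>(0xf0 + i);
  }
  const auto s16 = widen_lanes<std::int16_t>(s8);
  const auto u16 = widen_lanes<std::uint16_t>(u8);
  for (std::size_t i = 0; i < 8; ++i) {
    // Signed lanes become 0xfff0..0xfff7, unsigned lanes stay 0xf0..0xf7.
    assert(static_cast<std::uint16_t>(s16[i]) ==
           static_cast<std::uint16_t>(0xfff0 + i));
    assert(u16[i] == static_cast<std::uint16_t>(0xf0 + i));
  }
}
```

Calling `check_vmovl_model()` reproduces the int16x8 and uint16x8 rows of the expected results under the stated input assumption.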
Re: [[ARM/AArch64][testsuite] 13/36] Add vmla_n and vmls_n tests.
On 16 January 2015 at 17:24, Tejas Belagod wrote: >> +VECT_VAR_DECL(expected,poly,8,8) [] = { 0x33, 0x33, 0x33, 0x33, >> + 0x33, 0x33, 0x33, 0x33 }; >> +VECT_VAR_DECL(expected,poly,16,4) [] = { 0x, 0x, 0x, 0x >> }; > > > No poly vmlx_n, otherwise LGTM. > Here is a new version, with a bit more cleanup than requested, since only 16x4 and 32x2 variants are supported. > Tejas. > > From 2b9d1ba0f54086dc6511766cbf19883b2439ca49 Mon Sep 17 00:00:00 2001 From: Christophe Lyon Date: Thu, 4 Dec 2014 00:37:35 +0100 Subject: [[ARM/AArch64][testsuite] 13/36] Add vmla_n and vmls_n tests. diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX_n.inc b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX_n.inc new file mode 100644 index 000..375023a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX_n.inc @@ -0,0 +1,87 @@ +#define FNNAME1(NAME) exec_ ## NAME +#define FNNAME(NAME) FNNAME1(NAME) + +void FNNAME (INSN_NAME) (void) +{ +#define DECL_VMLX_N(VAR) \ + DECL_VARIABLE(VAR, int, 16, 4); \ + DECL_VARIABLE(VAR, int, 32, 2); \ + DECL_VARIABLE(VAR, uint, 16, 4); \ + DECL_VARIABLE(VAR, uint, 32, 2); \ + DECL_VARIABLE(VAR, float, 32, 2); \ + DECL_VARIABLE(VAR, int, 16, 8); \ + DECL_VARIABLE(VAR, int, 32, 4); \ + DECL_VARIABLE(VAR, uint, 16, 8); \ + DECL_VARIABLE(VAR, float, 32, 4); \ + DECL_VARIABLE(VAR, uint, 32, 4) + + /* vector_res = vmlx_n(vector, vector2, val), + then store the result. 
*/ +#define TEST_VMLX_N1(INSN, Q, T1, T2, W, N, V) \ + VECT_VAR(vector_res, T1, W, N) = \ +INSN##Q##_n_##T2##W(VECT_VAR(vector, T1, W, N), \ + VECT_VAR(vector2, T1, W, N),\ + V); \ + vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), \ + VECT_VAR(vector_res, T1, W, N)) + +#define TEST_VMLX_N(INSN, Q, T1, T2, W, N, V) \ + TEST_VMLX_N1(INSN, Q, T1, T2, W, N, V) + + DECL_VMLX_N(vector); + DECL_VMLX_N(vector2); + DECL_VMLX_N(vector_res); + + clean_results (); + + VLOAD(vector, buffer, , int, s, 16, 4); + VLOAD(vector, buffer, , int, s, 32, 2); + VLOAD(vector, buffer, , uint, u, 16, 4); + VLOAD(vector, buffer, , uint, u, 32, 2); + VLOAD(vector, buffer, , float, f, 32, 2); + VLOAD(vector, buffer, q, int, s, 16, 8); + VLOAD(vector, buffer, q, int, s, 32, 4); + VLOAD(vector, buffer, q, uint, u, 16, 8); + VLOAD(vector, buffer, q, uint, u, 32, 4); + VLOAD(vector, buffer, q, float, f, 32, 4); + + VDUP(vector2, , int, s, 16, 4, 0x55); + VDUP(vector2, , int, s, 32, 2, 0x55); + VDUP(vector2, , uint, u, 16, 4, 0x55); + VDUP(vector2, , uint, u, 32, 2, 0x55); + VDUP(vector2, , float, f, 32, 2, 55.2f); + VDUP(vector2, q, int, s, 16, 8, 0x55); + VDUP(vector2, q, int, s, 32, 4, 0x55); + VDUP(vector2, q, uint, u, 16, 8, 0x55); + VDUP(vector2, q, uint, u, 32, 4, 0x55); + VDUP(vector2, q, float, f, 32, 4, 55.9f); + + /* Choose multiplier arbitrarily. 
*/ + TEST_VMLX_N(INSN_NAME, , int, s, 16, 4, 0x11); + TEST_VMLX_N(INSN_NAME, , int, s, 32, 2, 0x22); + TEST_VMLX_N(INSN_NAME, , uint, u, 16, 4, 0x33); + TEST_VMLX_N(INSN_NAME, , uint, u, 32, 2, 0x44); + TEST_VMLX_N(INSN_NAME, , float, f, 32, 2, 22.3f); + TEST_VMLX_N(INSN_NAME, q, int, s, 16, 8, 0x55); + TEST_VMLX_N(INSN_NAME, q, int, s, 32, 4, 0x66); + TEST_VMLX_N(INSN_NAME, q, uint, u, 16, 8, 0x77); + TEST_VMLX_N(INSN_NAME, q, uint, u, 32, 4, 0x88); + TEST_VMLX_N(INSN_NAME, q, float, f, 32, 4, 66.7f); + + CHECK(TEST_MSG, int, 16, 4, PRIx16, expected, ""); + CHECK(TEST_MSG, int, 32, 2, PRIx32, expected, ""); + CHECK(TEST_MSG, uint, 16, 4, PRIx16, expected, ""); + CHECK(TEST_MSG, uint, 32, 2, PRIx32, expected, ""); + CHECK_FP(TEST_MSG, float, 32, 2, PRIx32, expected, ""); + CHECK(TEST_MSG, int, 16, 8, PRIx16, expected, ""); + CHECK(TEST_MSG, int, 32, 4, PRIx32, expected, ""); + CHECK(TEST_MSG, uint, 16, 8, PRIx16, expected, ""); + CHECK(TEST_MSG, uint, 32, 4, PRIx32, expected, ""); + CHECK_FP(TEST_MSG, float, 32, 4, PRIx32, expected, ""); +} + +int main (void) +{ + FNNAME (INSN_NAME) (); + return 0; +} diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmla_n.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmla_n.c new file mode 100644 index 000..8e88aad --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmla_n.c @@ -0,0 +1,23 @@ +#include +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define INSN_NAME vmla +#define TEST_MSG "VMLA_N" + +/* Expected results. */ +VECT_VAR_DECL(expected,int,16,4) [] = { 0x595, 0x596, 0x597, 0x598 }; +VECT_VAR_DECL(expected,int,32,2) [] = { 0xb3a, 0xb3b }; +VECT_VAR_DECL(expected,uint,16,4) [] = { 0x10df, 0x10e0, 0x10e1, 0x10e2 }; +VECT_VAR_DECL(expected,uint,32,2) [] = { 0x1684, 0x1685 }; +VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0x4497deb8, 0x4497feb8 }; +VECT_VAR_DECL(expected,int,16,8) [] = { 0x1c29, 0x1c2a, 0x1c2b, 0x1c2c, + 0x1c2d, 0x1c2e, 0x1c2f,
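For readers checking the expected values by hand, the integer forms of vmla_n can be modelled lane by lane as acc + v * scalar, truncated to the lane width (vmls_n subtracts the product instead). This is only a scalar sketch: the name `mla_n` is hypothetical, and the accumulator lanes starting at -16 (0xfff0) are an assumption inferred from the expected results rather than read from compute-ref-data.h.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of vmla_n: acc[i] + v[i] * scalar for every lane.
// The arithmetic is done in 64-bit unsigned integers and truncated back to
// the lane type, which gives the wrap-around behaviour of the integer forms.
template <typename T, std::size_t N>
std::array<T, N> mla_n(const std::array<T, N>& acc,
                       const std::array<T, N>& v,
                       std::int64_t scalar)
{
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    const std::uint64_t sum =
        static_cast<std::uint64_t>(acc[i]) +
        static_cast<std::uint64_t>(v[i]) * static_cast<std::uint64_t>(scalar);
    out[i] = static_cast<T>(sum);
  }
  return out;
}

// Assumed inputs: acc lanes start at -16 (0xfff0), v is 0x55 in every lane,
// and the multipliers are the ones used by the test (0x11 and 0x33).
inline void check_vmla_n_model()
{
  const std::array<std::int16_t, 4> acc_s{{-16, -15, -14, -13}};
  const std::array<std::int16_t, 4> v_s{{0x55, 0x55, 0x55, 0x55}};
  const std::array<std::int16_t, 4> want_s{{0x595, 0x596, 0x597, 0x598}};
  assert(mla_n(acc_s, v_s, 0x11) == want_s);

  const std::array<std::uint16_t, 4> acc_u{{0xfff0, 0xfff1, 0xfff2, 0xfff3}};
  const std::array<std::uint16_t, 4> v_u{{0x55, 0x55, 0x55, 0x55}};
  const std::array<std::uint16_t, 4> want_u{{0x10df, 0x10e0, 0x10e1, 0x10e2}};
  assert(mla_n(acc_u, v_u, 0x33) == want_u);
}
```

Under those assumptions the model reproduces the int16x4 and uint16x4 rows listed in vmla_n.c.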
Re: [[ARM/AArch64][testsuite] 17/36] Add vpadd, vpmax and vpmin tests.
On 16 January 2015 at 18:54, Christophe Lyon wrote: > On 16 January 2015 at 18:52, Tejas Belagod wrote: >> On 13/01/15 15:18, Christophe Lyon wrote: >>> >>> * gcc.target/aarch64/advsimd-intrinsics/vpXXX.inc: New file. >>> * gcc.target/aarch64/advsimd-intrinsics/vpadd.c: New file. >>> * gcc.target/aarch64/advsimd-intrinsics/vpmax.c: New file. >>> * gcc.target/aarch64/advsimd-intrinsics/vpmin.c: New file. >>> >>> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpXXX.inc >>> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpXXX.inc >>> new file mode 100644 >>> index 000..7ac2ed4 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpXXX.inc >>> @@ -0,0 +1,67 @@ >>> +#define FNNAME1(NAME) exec_ ## NAME >>> +#define FNNAME(NAME) FNNAME1(NAME) >>> + >>> +void FNNAME (INSN_NAME) (void) >>> +{ >>> + /* Basic test: y=OP(x), then store the result. */ >>> +#define TEST_VPADD1(INSN, T1, T2, W, N) >>> \ >>> + VECT_VAR(vector_res, T1, W, N) = \ >>> +INSN##_##T2##W(VECT_VAR(vector, T1, W, N), \ >>> + VECT_VAR(vector, T1, W, N)); \ >>> + vst1##_##T2##W(VECT_VAR(result, T1, W, N), \ >>> +VECT_VAR(vector_res, T1, W, N)) >>> + >>> +#define TEST_VPADD(INSN, T1, T2, W, N) \ >>> + TEST_VPADD1(INSN, T1, T2, W, N) \ >>> + >>> + /* No need for 64 bits variants. 
 */ >>> + DECL_VARIABLE(vector, int, 8, 8); >>> + DECL_VARIABLE(vector, int, 16, 4); >>> + DECL_VARIABLE(vector, int, 32, 2); >>> + DECL_VARIABLE(vector, uint, 8, 8); >>> + DECL_VARIABLE(vector, uint, 16, 4); >>> + DECL_VARIABLE(vector, uint, 32, 2); >>> + DECL_VARIABLE(vector, float, 32, 2); >>> + >>> + DECL_VARIABLE(vector_res, int, 8, 8); >>> + DECL_VARIABLE(vector_res, int, 16, 4); >>> + DECL_VARIABLE(vector_res, int, 32, 2); >>> + DECL_VARIABLE(vector_res, uint, 8, 8); >>> + DECL_VARIABLE(vector_res, uint, 16, 4); >>> + DECL_VARIABLE(vector_res, uint, 32, 2); >>> + DECL_VARIABLE(vector_res, float, 32, 2); >>> + >>> + clean_results (); >>> + >>> + /* Initialize input "vector" from "buffer". */ >>> + VLOAD(vector, buffer, , int, s, 8, 8); >>> + VLOAD(vector, buffer, , int, s, 16, 4); >>> + VLOAD(vector, buffer, , int, s, 32, 2); >>> + VLOAD(vector, buffer, , uint, u, 8, 8); >>> + VLOAD(vector, buffer, , uint, u, 16, 4); >>> + VLOAD(vector, buffer, , uint, u, 32, 2); >>> + VLOAD(vector, buffer, , float, f, 32, 2); >>> + >>> + /* Apply a unary operator named INSN_NAME. */ >> >> >> Unary op? >> > Hmm cut & paste issue. Thanks > Here is an updated version, also renaming VPADD into VPXXX, since it's in a template. 
>> >>> + TEST_VPADD(INSN_NAME, int, s, 8, 8); >>> + TEST_VPADD(INSN_NAME, int, s, 16, 4); >>> + TEST_VPADD(INSN_NAME, int, s, 32, 2); >>> + TEST_VPADD(INSN_NAME, uint, u, 8, 8); >>> + TEST_VPADD(INSN_NAME, uint, u, 16, 4); >>> + TEST_VPADD(INSN_NAME, uint, u, 32, 2); >>> + TEST_VPADD(INSN_NAME, float, f, 32, 2); >>> + >>> + CHECK(TEST_MSG, int, 8, 8, PRIx32, expected, ""); >>> + CHECK(TEST_MSG, int, 16, 4, PRIx64, expected, ""); >>> + CHECK(TEST_MSG, int, 32, 2, PRIx32, expected, ""); >>> + CHECK(TEST_MSG, uint, 8, 8, PRIx32, expected, ""); >>> + CHECK(TEST_MSG, uint, 16, 4, PRIx64, expected, ""); >>> + CHECK(TEST_MSG, uint, 32, 2, PRIx32, expected, ""); >>> + CHECK_FP(TEST_MSG, float, 32, 2, PRIx32, expected, ""); >>> +} >>> + >>> +int main (void) >>> +{ >>> + FNNAME (INSN_NAME) (); >>> + return 0; >>> +} >>> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpadd.c >>> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpadd.c >>> new file mode 100644 >>> index 000..5ddfd3d >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpadd.c >>> @@ -0,0 +1,19 @@ >>> +#include >>> +#include "arm-neon-ref.h" >>> +#include "compute-ref-data.h" >>> + >>> +#define INSN_NAME vpadd >>> +#define TEST_MSG "VPADD" >>> + >>> +/* Expected results. 
*/ >>> +VECT_VAR_DECL(expected,int,8,8) [] = { 0xe1, 0xe5, 0xe9, 0xed, >>> + 0xe1, 0xe5, 0xe9, 0xed }; >>> +VECT_VAR_DECL(expected,int,16,4) [] = { 0xffe1, 0xffe5, 0xffe1, 0xffe5 }; >>> +VECT_VAR_DECL(expected,int,32,2) [] = { 0xffe1, 0xffe1 }; >>> +VECT_VAR_DECL(expected,uint,8,8) [] = { 0xe1, 0xe5, 0xe9, 0xed, >>> + 0xe1, 0xe5, 0xe9, 0xed }; >>> +VECT_VAR_DECL(expected,uint,16,4) [] = { 0xffe1, 0xffe5, 0xffe1, 0xffe5 >>> }; >>> +VECT_VAR_DECL(expected,uint,32,2) [] = { 0xffe1, 0xffe1 }; >>> +VECT_VAR_DECL(expected,hfloat,32,2) [] = { 0xc1f8, 0xc1f8 }; >>> + >>> +#include "vpXXX.inc" >>> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpmax.c >>> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpmax.c >>> new file mode 100644 >>> index 000..f27a9a9 >>> --- /dev/null
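The vpXXX.inc template applies one pairwise operation to the same vector twice, which is why each expected row repeats its first half. A scalar sketch of that shape in plain C++ is given below. The names `pairwise`, `add_op`, `max_op` and `min_op` are hypothetical, the sketch is limited to 8- and 16-bit integer lanes so that int arithmetic cannot overflow, and the input lanes starting at -16 (0xf0) are an assumption inferred from the expected values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of the pairwise intrinsics (vpadd, vpmax, vpmin):
// the low half of the result combines adjacent lanes of a, the high half
// combines adjacent lanes of b.  Intended for 8- and 16-bit integer lanes.
template <typename T, std::size_t N, typename Op>
std::array<T, N> pairwise(const std::array<T, N>& a,
                          const std::array<T, N>& b, Op op)
{
  static_assert(N % 2 == 0, "pairwise operations need an even lane count");
  std::array<T, N> out{};
  for (std::size_t i = 0; i < N / 2; ++i) {
    out[i] = static_cast<T>(op(a[2 * i], a[2 * i + 1]));
    out[N / 2 + i] = static_cast<T>(op(b[2 * i], b[2 * i + 1]));
  }
  return out;
}

inline int add_op(int x, int y) { return x + y; }
inline int max_op(int x, int y) { return x > y ? x : y; }
inline int min_op(int x, int y) { return x < y ? x : y; }

// Assumed input: int8 lanes -16..-9 (0xf0..0xf7), passed as both operands,
// as the test does with vpadd(vector, vector).
inline void check_pairwise_model()
{
  const std::array<std::int8_t, 8> v{{-16, -15, -14, -13, -12, -11, -10, -9}};
  const std::array<std::uint8_t, 4> want{{0xe1, 0xe5, 0xe9, 0xed}};
  const auto sum = pairwise(v, v, add_op);
  for (std::size_t i = 0; i < 8; ++i)
    assert(static_cast<std::uint8_t>(sum[i]) == want[i % 4]);
}
```

Passing `max_op` or `min_op` instead of `add_op` gives the corresponding models for vpmax and vpmin.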
Re: [[ARM/AArch64][testsuite] 28/36] Add vmnv tests.
On 16 January 2015 at 19:27, Tejas Belagod wrote: >> +VECT_VAR_DECL(expected,poly,16,8) [] = { 0x, 0x, 0x, 0x, >> +0x, 0x, 0x, 0x }; >> +VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x, 0x, >> + 0x, 0x }; >> + > > > No float or poly16 for vmvn_*. > Updated as attached, removed 64x1 and 64x2 too. > Otherwise, LGTM. > > Tejas. > > From f4902098b0379fd445ceda7d2202b93f56c6129f Mon Sep 17 00:00:00 2001 From: Christophe Lyon Date: Wed, 10 Dec 2014 23:07:25 +0100 Subject: [[ARM/AArch64][testsuite] 28/36] Add vmnv tests. diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmvn.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmvn.c new file mode 100644 index 000..268a707 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmvn.c @@ -0,0 +1,137 @@ +#include +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +/* Expected results. */ +VECT_VAR_DECL(expected,int,8,8) [] = { 0xf, 0xe, 0xd, 0xc, + 0xb, 0xa, 0x9, 0x8 }; +VECT_VAR_DECL(expected,int,16,4) [] = { 0xf, 0xe, 0xd, 0xc }; +VECT_VAR_DECL(expected,int,32,2) [] = { 0xf, 0xe }; +VECT_VAR_DECL(expected,uint,8,8) [] = { 0xf, 0xe, 0xd, 0xc, + 0xb, 0xa, 0x9, 0x8 }; +VECT_VAR_DECL(expected,uint,16,4) [] = { 0xf, 0xe, 0xd, 0xc }; +VECT_VAR_DECL(expected,uint,32,2) [] = { 0xf, 0xe }; +VECT_VAR_DECL(expected,poly,8,8) [] = { 0xf, 0xe, 0xd, 0xc, + 0xb, 0xa, 0x9, 0x8 }; +VECT_VAR_DECL(expected,int,8,16) [] = { 0xf, 0xe, 0xd, 0xc, + 0xb, 0xa, 0x9, 0x8, + 0x7, 0x6, 0x5, 0x4, + 0x3, 0x2, 0x1, 0x0 }; +VECT_VAR_DECL(expected,int,16,8) [] = { 0xf, 0xe, 0xd, 0xc, + 0xb, 0xa, 0x9, 0x8 }; +VECT_VAR_DECL(expected,int,32,4) [] = { 0xf, 0xe, 0xd, 0xc }; +VECT_VAR_DECL(expected,uint,8,16) [] = { 0xf, 0xe, 0xd, 0xc, + 0xb, 0xa, 0x9, 0x8, + 0x7, 0x6, 0x5, 0x4, + 0x3, 0x2, 0x1, 0x0 }; +VECT_VAR_DECL(expected,uint,16,8) [] = { 0xf, 0xe, 0xd, 0xc, + 0xb, 0xa, 0x9, 0x8 }; +VECT_VAR_DECL(expected,uint,32,4) [] = { 0xf, 0xe, 0xd, 0xc }; +VECT_VAR_DECL(expected,poly,8,16) [] = { 0xf, 0xe, 0xd, 0xc, + 0xb, 
0xa, 0x9, 0x8, + 0x7, 0x6, 0x5, 0x4, + 0x3, 0x2, 0x1, 0x0 }; + +#define INSN_NAME vmvn +#define TEST_MSG "VMVN/VMVNQ" + +#define FNNAME1(NAME) void exec_ ## NAME (void) +#define FNNAME(NAME) FNNAME1(NAME) + +FNNAME (INSN_NAME) +{ + /* Basic test: y=OP(x), then store the result. */ +#define TEST_UNARY_OP1(INSN, Q, T1, T2, W, N)\ + VECT_VAR(vector_res, T1, W, N) = \ +INSN##Q##_##T2##W(VECT_VAR(vector, T1, W, N)); \ + vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), VECT_VAR(vector_res, T1, W, N)) + +#define TEST_UNARY_OP(INSN, Q, T1, T2, W, N)\ + TEST_UNARY_OP1(INSN, Q, T1, T2, W, N) \ + + /* No need for 64 bits variants. */ + DECL_VARIABLE(vector, int, 8, 8); + DECL_VARIABLE(vector, int, 16, 4); + DECL_VARIABLE(vector, int, 32, 2); + DECL_VARIABLE(vector, uint, 8, 8); + DECL_VARIABLE(vector, uint, 16, 4); + DECL_VARIABLE(vector, uint, 32, 2); + DECL_VARIABLE(vector, poly, 8, 8); + DECL_VARIABLE(vector, int, 8, 16); + DECL_VARIABLE(vector, int, 16, 8); + DECL_VARIABLE(vector, int, 32, 4); + DECL_VARIABLE(vector, uint, 8, 16); + DECL_VARIABLE(vector, uint, 16, 8); + DECL_VARIABLE(vector, uint, 32, 4); + DECL_VARIABLE(vector, poly, 8, 16); + + DECL_VARIABLE(vector_res, int, 8, 8); + DECL_VARIABLE(vector_res, int, 16, 4); + DECL_VARIABLE(vector_res, int, 32, 2); + DECL_VARIABLE(vector_res, uint, 8, 8); + DECL_VARIABLE(vector_res, uint, 16, 4); + DECL_VARIABLE(vector_res, uint, 32, 2); + DECL_VARIABLE(vector_res, poly, 8, 8); + DECL_VARIABLE(vector_res, int, 8, 16); + DECL_VARIABLE(vector_res, int, 16, 8); + DECL_VARIABLE(vector_res, int, 32, 4); + DECL_VARIABLE(vector_res, uint, 8, 16); + DECL_VARIABLE(vector_res, uint, 16, 8); + DECL_VARIABLE(vector_res, uint, 32, 4); + DECL_VARIABLE(vector_res, poly, 8, 16); + + clean_results (); + + /* Initialize input "vector" from "buffer". 
*/ + VLOAD(vector, buffer, , int, s, 8, 8); + VLOAD(vector, buffer, , int, s, 16, 4); + VLOAD(vector, buffer, , int, s, 32, 2); + VLOAD(vector, buffer, , uint, u, 8, 8); + VLOAD(vector, buffer, , uint, u, 16, 4); + VLOAD(vector, buffer, , uint, u, 32, 2); + VLOAD(vector, buffer, , poly, p, 8, 8); + VLOAD(vector, buffer, q, int, s, 8, 16); + VLOAD(vector, buffer, q, int, s, 16, 8); + VLOAD(vector, buffer, q, int, s, 32, 4); + VLOAD(vector, buffer, q, uint, u, 8, 16); + VLOAD(vector, buffer, q, uint, u, 16, 8); + VLOAD(vector, buffer, q, uint, u, 32, 4); + VLOAD(vector, buffer, q, poly, p, 8, 16); + + /* Apply a unary operator named INSN_NAME. */ + TEST_UNARY_OP(INSN_NAME, , int, s, 8, 8); + TEST_UNARY_OP(INSN_NAME, , int, s, 16, 4); + TEST_UNARY_OP(INSN_NAME, , int, s, 32, 2); + TEST_UNARY_OP(INSN_NAME, , uint, u, 8, 8); + TEST_UNARY_OP(INSN_NAME, , uint, u, 16, 4); + TEST_UNARY_OP
Re: [[ARM/AArch64][testsuite] 29/36] Add vpadal tests.
On 16 January 2015 at 19:29, Tejas Belagod wrote: >> +VECT_VAR_DECL(expected,poly,8,16) [] = { 0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33, >> +0x33, 0x33, 0x33, 0x33 }; >> +VECT_VAR_DECL(expected,poly,16,8) [] = { 0x, 0x, 0x, 0x, >> +0x, 0x, 0x, 0x }; >> +VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x, 0x, >> + 0x, 0x }; >> + > > > No float or poly ops for VPADAL insns. > int8 variants are not necessary either, updated as attached. > Otherwise, LGTM. > > Tejas. > > From 15f3129b408c6e70af3f78cf18ffc340d361 Mon Sep 17 00:00:00 2001 From: Christophe Lyon Date: Wed, 10 Dec 2014 23:21:27 +0100 Subject: [[ARM/AArch64][testsuite] 29/36] Add vpadal tests. diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpadal.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpadal.c new file mode 100644 index 000..0bffc0f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpadal.c @@ -0,0 +1,141 @@ +#include +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +/* Expected results. 
*/ +VECT_VAR_DECL(expected,int,16,4) [] = { 0xffd1, 0xffd6, 0xffdb, 0xffe0 }; +VECT_VAR_DECL(expected,int,32,2) [] = { 0xffd1, 0xffd6 }; +VECT_VAR_DECL(expected,int,64,1) [] = { 0xffd1 }; +VECT_VAR_DECL(expected,uint,16,4) [] = { 0x1d1, 0x1d6, 0x1db, 0x1e0 }; +VECT_VAR_DECL(expected,uint,32,2) [] = { 0x1ffd1, 0x1ffd6 }; +VECT_VAR_DECL(expected,uint,64,1) [] = { 0x1ffd1 }; +VECT_VAR_DECL(expected,int,16,8) [] = { 0xffd1, 0xffd6, 0xffdb, 0xffe0, + 0xffe5, 0xffea, 0xffef, 0xfff4 }; +VECT_VAR_DECL(expected,int,32,4) [] = { 0xffd1, 0xffd6, + 0xffdb, 0xffe0 }; +VECT_VAR_DECL(expected,int,64,2) [] = { 0xffd1, 0xffd6 }; +VECT_VAR_DECL(expected,uint,16,8) [] = { 0x1d1, 0x1d6, 0x1db, 0x1e0, + 0x1e5, 0x1ea, 0x1ef, 0x1f4 }; +VECT_VAR_DECL(expected,uint,32,4) [] = { 0x1ffd1, 0x1ffd6, 0x1ffdb, 0x1ffe0 }; +VECT_VAR_DECL(expected,uint,64,2) [] = { 0x1ffd1, 0x1ffd6 }; + +#define INSN_NAME vpadal +#define TEST_MSG "VPADAL/VPADALQ" + +#define FNNAME1(NAME) void exec_ ## NAME (void) +#define FNNAME(NAME) FNNAME1(NAME) + +FNNAME (INSN_NAME) +{ + /* Basic test: y=OP(x), then store the result. 
*/ +#define TEST_VPADAL1(INSN, Q, T1, T2, W, N, W2, N2) \ + VECT_VAR(vector_res, T1, W2, N2) = \ +INSN##Q##_##T2##W(VECT_VAR(vector, T1, W2, N2), VECT_VAR(vector2, T1, W, N)); \ + vst1##Q##_##T2##W2(VECT_VAR(result, T1, W2, N2), \ + VECT_VAR(vector_res, T1, W2, N2)) + +#define TEST_VPADAL(INSN, Q, T1, T2, W, N, W2, N2) \ + TEST_VPADAL1(INSN, Q, T1, T2, W, N, W2, N2) + + DECL_VARIABLE(vector, int, 16, 4); + DECL_VARIABLE(vector, int, 32, 2); + DECL_VARIABLE(vector, int, 64, 1); + DECL_VARIABLE(vector, uint, 16, 4); + DECL_VARIABLE(vector, uint, 32, 2); + DECL_VARIABLE(vector, uint, 64, 1); + DECL_VARIABLE(vector, int, 16, 8); + DECL_VARIABLE(vector, int, 32, 4); + DECL_VARIABLE(vector, int, 64, 2); + DECL_VARIABLE(vector, uint, 16, 8); + DECL_VARIABLE(vector, uint, 32, 4); + DECL_VARIABLE(vector, uint, 64, 2); + + DECL_VARIABLE(vector2, int, 8, 8); + DECL_VARIABLE(vector2, int, 16, 4); + DECL_VARIABLE(vector2, int, 32, 2); + DECL_VARIABLE(vector2, uint, 8, 8); + DECL_VARIABLE(vector2, uint, 16, 4); + DECL_VARIABLE(vector2, uint, 32, 2); + DECL_VARIABLE(vector2, int, 8, 16); + DECL_VARIABLE(vector2, int, 16, 8); + DECL_VARIABLE(vector2, int, 32, 4); + DECL_VARIABLE(vector2, uint, 8, 16); + DECL_VARIABLE(vector2, uint, 16, 8); + DECL_VARIABLE(vector2, uint, 32, 4); + + DECL_VARIABLE(vector_res, int, 16, 4); + DECL_VARIABLE(vector_res, int, 32, 2); + DECL_VARIABLE(vector_res, int, 64, 1); + DECL_VARIABLE(vector_res, uint, 16, 4); + DECL_VARIABLE(vector_res, uint, 32, 2); + DECL_VARIABLE(vector_res, uint, 64, 1); + DECL_VARIABLE(vector_res, int, 16, 8); + DECL_VARIABLE(vector_res, int, 32, 4); + DECL_VARIABLE(vector_res, int, 64, 2); + DECL_VARIABLE(vector_res, uint, 16, 8); + DECL_VARIABLE(vector_res, uint, 32, 4); + DECL_VARIABLE(vector_res, uint, 64, 2); + + clean_results (); + + /* Initialize input "vector" from "buffer". 
*/ + VLOAD(vector, buffer, , int, s, 16, 4); + VLOAD(vector, buffer, , int, s, 32, 2); + VLOAD(vector, buffer, , int, s, 64, 1); + VLOAD(vector, buffer, , uint, u, 16, 4); + VLOAD(vector, buffer, , uint, u, 32, 2); + VLOAD(vector, buffer, , uint, u, 64, 1); + VLOAD(vector, buffer, q, int, s, 16, 8); + VLOAD(vector, buffer, q, int, s, 32, 4); + VLOAD(vector, buffer, q, int, s, 64, 2); + VLOAD(vector, buffer, q, uint, u, 16, 8); + VLOAD(vector, buffer, q, uint, u, 32, 4); + VLOAD(vector, buffer, q, uint, u, 64, 2); + + /* Initialize input "vector2" from
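The vpadal expected values can be cross-checked with a scalar model: each accumulator lane receives the widened sum of two adjacent narrow lanes. The sketch below is plain C++ rather than NEON code. The name `padal` is hypothetical, and both inputs are assumed to start at -16 (0xf0 for 8-bit lanes, 0xfff0 for 16-bit lanes), which is inferred from the expected results because the message is cut off before vector2 is initialized.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of vpadal: acc[i] + src[2i] + src[2i+1], where
// the narrow source lanes are widened before the addition.  The sum is
// computed in 64-bit unsigned arithmetic and truncated to the wide lane type.
template <typename Wide, typename Narrow, std::size_t N, std::size_t M>
std::array<Wide, N> padal(const std::array<Wide, N>& acc,
                          const std::array<Narrow, M>& src)
{
  static_assert(M == 2 * N, "source must have twice as many lanes");
  std::array<Wide, N> out{};
  for (std::size_t i = 0; i < N; ++i) {
    const std::uint64_t sum = static_cast<std::uint64_t>(acc[i]) +
                              static_cast<std::uint64_t>(src[2 * i]) +
                              static_cast<std::uint64_t>(src[2 * i + 1]);
    out[i] = static_cast<Wide>(sum);
  }
  return out;
}

// Assumed inputs: accumulator and source lanes both start at -16.
inline void check_vpadal_model()
{
  const std::array<std::int16_t, 4> acc_s{{-16, -15, -14, -13}};
  const std::array<std::int8_t, 8> src_s{{-16, -15, -14, -13,
                                          -12, -11, -10, -9}};
  const std::array<std::uint16_t, 4> want_s{{0xffd1, 0xffd6, 0xffdb, 0xffe0}};
  const auto res_s = padal(acc_s, src_s);
  for (std::size_t i = 0; i < 4; ++i)
    assert(static_cast<std::uint16_t>(res_s[i]) == want_s[i]);

  const std::array<std::uint16_t, 4> acc_u{{0xfff0, 0xfff1, 0xfff2, 0xfff3}};
  const std::array<std::uint8_t, 8> src_u{{0xf0, 0xf1, 0xf2, 0xf3,
                                           0xf4, 0xf5, 0xf6, 0xf7}};
  const std::array<std::uint16_t, 4> want_u{{0x1d1, 0x1d6, 0x1db, 0x1e0}};
  assert(padal(acc_u, src_u) == want_u);
}
```

With those inputs the model matches the int16x4 and uint16x4 rows of the expected results in vpadal.c.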
Re: [[ARM/AArch64][testsuite] 30/36] Add vpaddl tests.
On 16 January 2015 at 20:41, Christophe Lyon wrote: > On 16 January 2015 at 19:49, Christophe Lyon > wrote: >> On 16 January 2015 at 19:33, Tejas Belagod wrote: >>> +VECT_VAR_DECL(expected,poly,8,16) [] = { 0x33, 0x33, 0x33, 0x33, +0x33, 0x33, 0x33, 0x33, +0x33, 0x33, 0x33, 0x33, +0x33, 0x33, 0x33, 0x33 }; +VECT_VAR_DECL(expected,poly,16,8) [] = { 0x, 0x, 0x, 0x, +0x, 0x, 0x, 0x }; +VECT_VAR_DECL(expected,hfloat,32,4) [] = { 0x, 0x, + 0x, 0x }; + >>> >>> >>> No poly or float ops. >>> +#define INSN_NAME vpaddl +#define TEST_MSG "VPADDL/VPADDLQ" + +#define FNNAME1(NAME) void exec_ ## NAME (void) +#define FNNAME(NAME) FNNAME1(NAME) + +FNNAME (INSN_NAME) +{ + /* Basic test: y=OP(x), then store the result. */ +#define TEST_VPADDL1(INSN, Q, T1, T2, W, N, W2, N2)\ + VECT_VAR(vector_res, T1, W2, N2) = \ +INSN##Q##_##T2##W(VECT_VAR(vector, T1, W, N)); \ + vst1##Q##_##T2##W2(VECT_VAR(result, T1, W2, N2), \ + VECT_VAR(vector_res, T1, W2, N2)) + +#define TEST_VPADDL(INSN, Q, T1, T2, W, N, W2, N2) \ + TEST_VPADDL1(INSN, Q, T1, T2, W, N, W2, N2) + + /* No need for 64 bits variants. */ >>> >>> >>> These look like 64-bit variants. >>> >> I mean no vector element of 64 bits. >> + DECL_VARIABLE(vector, int, 8, 8); + DECL_VARIABLE(vector, int, 16, 4); + DECL_VARIABLE(vector, int, 32, 2); + DECL_VARIABLE(vector, uint, 8, 8); + DECL_VARIABLE(vector, uint, 16, 4); + DECL_VARIABLE(vector, uint, 32, 2); + DECL_VARIABLE(vector, int, 8, 16); + DECL_VARIABLE(vector, int, 16, 8); + DECL_VARIABLE(vector, int, 32, 4); + DECL_VARIABLE(vector, uint, 8, 16); + DECL_VARIABLE(vector, uint, 16, 8); + DECL_VARIABLE(vector, uint, 32, 4); + >>> >>> + /* Apply a unary operator named INSN_NAME. */ >>> >>> Unary op? >> >> Cut & paste error, again. >> > Hmm changed my mind: vpaddl takes only one vector as input, although > it does add 2 vector elements. > Here is an updated version, removing poly, float and int8 variants. 
> >>> + TEST_VPADDL(INSN_NAME, , int, s, 8, 8, 16, 4); + TEST_VPADDL(INSN_NAME, , int, s, 16, 4, 32, 2); + TEST_VPADDL(INSN_NAME, , int, s, 32, 2, 64, 1); + TEST_VPADDL(INSN_NAME, , uint, u, 8, 8, 16, 4); + TEST_VPADDL(INSN_NAME, , uint, u, 16, 4, 32, 2); + TEST_VPADDL(INSN_NAME, , uint, u, 32, 2, 64, 1); + TEST_VPADDL(INSN_NAME, q, int, s, 8, 16, 16, 8); + TEST_VPADDL(INSN_NAME, q, int, s, 16, 8, 32, 4); + TEST_VPADDL(INSN_NAME, q, int, s, 32, 4, 64, 2); + TEST_VPADDL(INSN_NAME, q, uint, u, 8, 16, 16, 8); + TEST_VPADDL(INSN_NAME, q, uint, u, 16, 8, 32, 4); + TEST_VPADDL(INSN_NAME, q, uint, u, 32, 4, 64, 2); + + CHECK_RESULTS (TEST_MSG, ""); +} + +int main (void) +{ + exec_vpaddl (); + return 0; +} >>> >>> >>> Otherwise, LGTM. >>> >>> Tejas. >>> From 394f2023994b56413e3fc40412e9cc36ed0e05a7 Mon Sep 17 00:00:00 2001 From: Christophe Lyon Date: Wed, 10 Dec 2014 23:31:29 +0100 Subject: [[ARM/AArch64][testsuite] 30/36] Add vpaddl tests. diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpaddl.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpaddl.c new file mode 100644 index 000..8dc768d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vpaddl.c @@ -0,0 +1,116 @@ +#include +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +/* Expected results. 
*/ +VECT_VAR_DECL(expected,int,16,4) [] = { 0xffe1, 0xffe5, 0xffe9, 0xffed }; +VECT_VAR_DECL(expected,int,32,2) [] = { 0xffe1, 0xffe5 }; +VECT_VAR_DECL(expected,int,64,1) [] = { 0xffe1 }; +VECT_VAR_DECL(expected,uint,16,4) [] = { 0x1e1, 0x1e5, 0x1e9, 0x1ed }; +VECT_VAR_DECL(expected,uint,32,2) [] = { 0x1ffe1, 0x1ffe5 }; +VECT_VAR_DECL(expected,uint,64,1) [] = { 0x1ffe1 }; +VECT_VAR_DECL(expected,int,16,8) [] = { 0xffe1, 0xffe5, 0xffe9, 0xffed, + 0xfff1, 0xfff5, 0xfff9, 0xfffd }; +VECT_VAR_DECL(expected,int,32,4) [] = { 0xffe1, 0xffe5, + 0xffe9, 0xffed }; +VECT_VAR_DECL(expected,int,64,2) [] = { 0xffe1, + 0xffe5 }; +VECT_VAR_DECL(expected,uint,16,8) [] = { 0x1e1, 0x1e5, 0x1e9, 0x1ed, + 0x1f1, 0x1f5, 0x1f9, 0x1fd }; +VECT_VAR_DECL(expected,uint,32,4) [] = { 0x1ffe1, 0x1ffe5, 0x1ffe9, 0x1ffed }; +VECT_VAR_DECL(expected,uint,64,2) [] = { 0x1ffe1, 0x1ffe5 }; + +#define INSN_NAME vpaddl +#define TEST_MSG "VPADDL/VPADDLQ" + +#define FNNAME1(NAME) void exec_ ## NAME (void) +#define FN
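The "unary op" question in this thread comes down to vpaddl taking a single input vector while still adding two elements at a time. A scalar sketch of that behaviour, in plain C++ with a hypothetical helper `paddl`, is shown below. The input lanes 0xf0, 0xf1, ... are an assumption inferred from the expected values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of vpaddl: one input vector, half as many output
// lanes, each output lane twice as wide and holding the sum of two adjacent
// input lanes.  Widening happens before the addition, so nothing is lost.
template <typename Wide, typename Narrow, std::size_t M>
std::array<Wide, M / 2> paddl(const std::array<Narrow, M>& src)
{
  static_assert(M % 2 == 0, "pairwise addition needs an even lane count");
  std::array<Wide, M / 2> out{};
  for (std::size_t i = 0; i < M / 2; ++i) {
    const std::uint64_t sum = static_cast<std::uint64_t>(src[2 * i]) +
                              static_cast<std::uint64_t>(src[2 * i + 1]);
    out[i] = static_cast<Wide>(sum);
  }
  return out;
}

// Assumed input: 8-bit lanes 0xf0..0xf7 (that is, -16..-9 when signed).
inline void check_vpaddl_model()
{
  const std::array<std::int8_t, 8> s8{{-16, -15, -14, -13, -12, -11, -10, -9}};
  const std::array<std::uint16_t, 4> want_s{{0xffe1, 0xffe5, 0xffe9, 0xffed}};
  const auto res_s = paddl<std::int16_t>(s8);
  for (std::size_t i = 0; i < 4; ++i)
    assert(static_cast<std::uint16_t>(res_s[i]) == want_s[i]);

  const std::array<std::uint8_t, 8> u8{{0xf0, 0xf1, 0xf2, 0xf3,
                                        0xf4, 0xf5, 0xf6, 0xf7}};
  const std::array<std::uint16_t, 4> want_u{{0x1e1, 0x1e5, 0x1e9, 0x1ed}};
  assert(paddl<std::uint16_t>(u8) == want_u);
}
```

The unsigned row shows why the result lanes must be wider than the inputs: 0xf0 + 0xf1 is 0x1e1, which does not fit in 8 bits.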
Re: [AArch64] Add a new scheduling description for the ARM Cortex-A57 processor
On Tuesday 2015-01-20 08:15, Ramana Radhakrishnan wrote:
> I'm not sure if a "common" section improves readability. I'd rather
> this remained as it is today.

On Tuesday 2015-01-20 09:27, Marcus Shawcroft wrote:
> I'd prefer separate sections, IMHO that is more useful. /Marcus

Okay, then let's go that way.

Gerald
Re: [PATCH] Fix PR64313
On 01/19/2015 09:34 AM, Richard Biener wrote:
! /* Whether the user can use instead of explicitely using calls

"explicitly"

OK with that change.

Jason
Re: [PATCH][wwwdocs] Document removal of -mlra/-mnolra from Aarch64 and ARM backends.
On 20/01/15 13:48, Richard Earnshaw wrote:
On 20/01/15 11:59, Matthew Wahab wrote:
Hello,

This patch documents in changes.html the removal of the -mlra/-mno-lra options from the AArch64 and ARM backends.

Tested by checking the updated webpage in Firefox.

Matthew Wahab

htdocs_mlra_changes.patch

OK.

I've committed this to the wwwdocs cvs on Matthew's behalf.

Cheers,
Kyrill

R.

Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.67
diff -u -r1.67 changes.html
--- htdocs/gcc-5/changes.html 19 Jan 2015 15:48:46 - 1.67
+++ htdocs/gcc-5/changes.html 20 Jan 2015 11:30:25 -
@@ -456,7 +456,10 @@
 Support for the Cavium ThunderX processor is now available through the -mcpu=thunderx and -mtune=thunderx options.
-
+ The transitional options -mlra and -mno-lra
+ have been removed. The AArch64 backend now uses the local register
+ allocator (LRA) only.
+
 ARM
@@ -488,7 +491,10 @@
 The options relating to the old ABI -mapcs and -mapcs-frame have been deprecated.
-
+ The transitional options -mlra and -mno-lra
+ have been removed. The ARM backend now uses the local register allocator
+ (LRA) only.
+
 IA-32/x86-64
libgo patch committed: Make mprof test more flexible
PR 64683 reports that sometimes the memory profiler test fails because it expects that a large value will have been garbage collected. Since gccgo's garbage collection is not (yet) precise, this may not happen. This patch lets the test pass even if the object is not collected.

Ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline.

Ian

diff -r 07baa07598ea libgo/go/runtime/pprof/mprof_test.go
--- a/libgo/go/runtime/pprof/mprof_test.go Mon Jan 19 20:17:51 2015 -0800
+++ b/libgo/go/runtime/pprof/mprof_test.go Tue Jan 20 08:08:36 2015 -0800
@@ -85,10 +85,12 @@
 # 0x[0-9,a-f]+runtime_pprof_test\.TestMemoryProfiler\+0x[0-9,a-f]+ .*/mprof_test.go:64
 `, (1<<10)*memoryProfilerRun, (1<<20)*memoryProfilerRun),
 
- fmt.Sprintf(`0: 0 \[%v: %v\] @ 0x[0-9,a-f x]+
+ // This should start with "0: 0" but gccgo's imprecise
+ // GC means that sometimes the value is not collected.
+ fmt.Sprintf(`(0|%v): (0|%v) \[%v: %v\] @ 0x[0-9,a-f x]+
 # 0x[0-9,a-f]+pprof_test\.allocateTransient2M\+0x[0-9,a-f]+ .*/mprof_test.go:30
 # 0x[0-9,a-f]+runtime_pprof_test\.TestMemoryProfiler\+0x[0-9,a-f]+ .*/mprof_test.go:65
-`, memoryProfilerRun, (2<<20)*memoryProfilerRun),
+`, memoryProfilerRun, (2<<20)*memoryProfilerRun, memoryProfilerRun, (2<<20)*memoryProfilerRun),
 }
 
 for _, test := range tests {
[wwwdocs] Re: [patch] Update C++11 status in libstdc++ docs
Here's the wwwdocs patch for gcc-5/changes.html

Committed to CVS.

Index: gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.68
diff -u -r1.68 changes.html
--- gcc-5/changes.html 20 Jan 2015 15:49:07 - 1.68
+++ gcc-5/changes.html 20 Jan 2015 16:37:08 -
@@ -293,10 +293,13 @@
 Runtime Library (libstdc++)
+ A new implementation of std::string is enabled by default,
+using the small string optimization instead of
+copy-on-write reference counting;
 A new implementation of std::list is enabled by default, with an O(1) size() function;
 https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2011
- Improved support for C++11, including:
+ Full support for C++11, including the following new features:
 std::deque and std::vector<bool> meet the allocator-aware container requirements;
@@ -307,10 +310,11 @@
 std::is_trivially_constructible, std::is_trivially_assignable etc.;
- I/O manipulators std::put_time,
+ I/O manipulators std::put_time, std::get_time,
 std::hexfloat and std::defaultfloat;
 generic locale-aware std::isblank;
+ locale facets for Unicode conversion;
 atomic operations for std::shared_ptr;
 std::notify_all_at_thread_exit() and functions for making futures ready at thread exit;
@@ -326,9 +330,14 @@
 the relevant bits in str.flags().
 https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2014
- Improved experimental support for C++14, including:
+ Full experimental support for C++14, including the following
+ new features:
 std::is_final type trait;
+ heterogeneous comparison lookup in associative containers.
+ global functions cbegin, cend, rbegin,
+rend, crbegin, and crend for
+range access to containers, arrays and initializer lists.
 https://gcc.gnu.org/onlinedocs/libstdc++/manual/status.html#status.iso.2014
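As an illustration of one item in the list above, the I/O manipulators std::get_time and std::put_time can be combined to parse a date in one layout and print it in another. This is only a small sketch: the function name `reformat_date` and the two format strings are chosen for the example and do not come from the patch.

```cpp
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: parse an ISO-style date (YYYY-MM-DD) with
// std::get_time and print it as DD/MM/YYYY with std::put_time.
// Returns an empty string when the input cannot be parsed.
std::string reformat_date(const std::string& iso_date)
{
  std::tm tm{};
  std::istringstream in(iso_date);
  in >> std::get_time(&tm, "%Y-%m-%d");
  if (in.fail())
    return std::string();

  std::ostringstream out;
  out << std::put_time(&tm, "%d/%m/%Y");
  return out.str();
}
```

For example, `reformat_date("2015-01-20")` yields the string "20/01/2015", and input that does not start with a year yields an empty string.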
[Patch, libstdc++/64680] Conform the standard regex interface
Bootstrapped and tested.

Thanks!

--
Regards,
Tim Shen

commit 3f5cbfd3be15386c56d415cd15a04c3fc44ee8c0
Author: timshen
Date: Tue Jan 20 00:20:14 2015 -0800

    PR libstdc++/64680
    * include/bits/regex.h (basic_regex<>::basic_regex,
    basic_regex<>::operator=, basic_regex<>::imbue): Conform the standard
    interface.
    * testsuite/28_regex/basic_regex/assign/char/cstring.cc: New testcase.

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index 6de883a..07c78b7 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -442,7 +442,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
        */
       explicit
       basic_regex(const _Ch_type* __p, flag_type __f = ECMAScript)
-      : basic_regex(__p, __p + _Rx_traits::length(__p), __f)
+      : basic_regex(__p, __p + char_traits<_Ch_type>::length(__p), __f)
       { }
 
       /**
@@ -553,7 +553,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
        */
       basic_regex&
       operator=(const _Ch_type* __p)
-      { return this->assign(__p, flags()); }
+      { return this->assign(__p); }
+
+      /**
+       * @brief Replaces a regular expression with a new one constructed from
+       * an initializer list.
+       *
+       * @param __l The initializer list.
+       *
+       * @throws regex_error if @p __l is not a valid regular expression.
+       */
+      basic_regex&
+      operator=(initializer_list<_Ch_type> __l)
+      { return this->assign(__l.begin(), __l.end()); }
 
       /**
        * @brief Replaces a regular expression with a new one constructed from
@@ -564,7 +576,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       template
         basic_regex&
         operator=(const basic_string<_Ch_type, _Ch_traits, _Alloc>& __s)
-        { return this->assign(__s, flags()); }
+        { return this->assign(__s); }
 
       // [7.8.3] assign
       /**
@@ -712,7 +724,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       imbue(locale_type __loc)
       {
         std::swap(__loc, _M_loc);
-        _M_automaton = nullptr;
+        _M_automaton.reset();
         return __loc;
       }
 
diff --git a/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc b/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc
index 19528b6..cf876e0 100644
--- a/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc
+++ b/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc
@@ -36,9 +36,18 @@ void test01()
   re.assign(cs);
 }
 
+void test02()
+{
+  bool test __attribute__((unused)) = true;
+
+  std::regex re("[[:alnum:]]", std::regex_constants::basic);
+  re = "\\w+";
+}
+
 int main()
 {
   test01();
+  test02();
   return 0;
 }
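The user-visible effect of the operator= changes can be seen from ordinary client code: assigning a C string or an initializer list to an existing regex goes through assign() with its default ECMAScript flags, so the grammar chosen at construction is not carried over. The sketch below mirrors test02 from the patch. The two wrapper functions are hypothetical names introduced for the example, and the behaviour shown assumes a standard library that implements the standard interface, which is what this patch aims for in libstdc++.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Mirrors test02 from the patch: the regex starts out with the POSIX basic
// grammar, then a C string is assigned.  With the standard interface the
// assignment uses the default ECMAScript flags, not the previous ones.
inline std::regex reassign_from_cstring()
{
  std::regex re("[[:alnum:]]", std::regex_constants::basic);
  re = "\\w+";
  return re;
}

// Exercises the initializer-list assignment operator added by the patch:
// the characters 'a', 'b', '+' form the pattern "ab+".
inline std::regex reassign_from_init_list()
{
  std::regex re("x");
  re = {'a', 'b', '+'};
  return re;
}

inline void check_regex_assignment()
{
  const std::regex from_cstring = reassign_from_cstring();
  assert(from_cstring.flags() == std::regex_constants::ECMAScript);
  assert(std::regex_match("hello", from_cstring));

  const std::regex from_list = reassign_from_init_list();
  assert(std::regex_match("abbb", from_list));
  assert(!std::regex_match("a", from_list));
}
```

Before the fix, `operator=(const _Ch_type*)` passed `flags()` to assign(), so the first regex would have kept the basic grammar after the assignment.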
Re: [patch] Add and last pieces of C++11 std::lib
On Tue, Jan 20, 2015 at 3:59 AM, Jonathan Wakely wrote: > On 19/01/15 13:27 +, Jonathan Wakely wrote: >> >> On 16/01/15 23:38 +, Jonathan Wakely wrote: >>> >>> This defines the C++11 header and adds the wstring_convert >>> and wbuffer_convert utilities. >> >> >> I've discovered that wasn't the last piece of the C++11 library, there >> were new constructors taking std::string added to std::locale and all >> the std::xxx_byname facets. >> >> It would be fixed by the attached patch (tested on x86_64-linux with >> old and new std::string), but we're in stage4 now so I'm not >> committing it yet. > > > It's committed to trunk now. > >> commit 977b94ddcf8218efa0318f69b3a2cc5b5d9eb5be >> Author: Jonathan Wakely >> Date: Sun Jan 18 16:41:28 2015 + >> >>Add C++11 std::string constructors for locales and facets. >> * config/abi/pre/gnu.ver: Export new constructors. >> * include/bits/codecvt.h (codecvt_byname): Add string constructor. >> (codecvt_byname, codecvt_byname): Define >> explicit >> specializations and declare explicit instantiations. >> * include/bits/locale_classes.h (locale, collate_byname): Add >> string >> constructors. >> * include/bits/locale_facets.h (ctype_byname, numpunct_byname): >> Likewise. >> * include/bits/locale_facets_nonio.h (time_get_byname, >> time_put_byname, moneypunct_byname, messages_byname): Likewise. >> * src/c++11/codecvt.cc (codecvt_byname, >> codecvt_byname): Define explicit instantiations. >> * src/c++11/locale-inst.cc (time_put_byname, codecvt_byname): >> Instantiate string constructors. >> (ctype_byname): Define string constructor. >> * testsuite/22_locale/codecvt_byname/1.cc: New. >> * testsuite/22_locale/collate_byname/1.cc: New. >> * testsuite/22_locale/ctype_byname/2.cc: New. >> * testsuite/22_locale/messages_byname/1.cc: New. >> * testsuite/22_locale/moneypunct_byname/1.cc: New. >> * testsuite/22_locale/numpunct_byname/1.cc: New. 
On Linux/ia-32, I got this output:

/tmp/ccApSqaQ.o: In function `facet::facet()':
/export/gnu/import/git/gcc-test-x32/src-trunk/libstdc++-v3/testsuite/22_locale/ctype_byname/2.cc:29: undefined reference to `std::ctype_byname::ctype_byname(std::__cxx11::basic_string, std::allocator > const&, unsigned int)'
/tmp/ccApSqaQ.o: In function `facet::facet()':
/export/gnu/import/git/gcc-test-x32/src-trunk/libstdc++-v3/testsuite/22_locale/ctype_byname/2.cc:29: undefined reference to `std::ctype_byname::ctype_byname(std::__cxx11::basic_string, std::allocator > const&, unsigned int)'
collect2: error: ld returned 1 exit status
FAIL: 22_locale/ctype_byname/2.cc (test for excess errors)
FAIL: libstdc++-abi/abi_check

H.J. -- H.J.
Re: [Patch, libstdc++/64680] Conform the standard regex interface
Hi, On 01/20/2015 05:57 PM, Tim Shen wrote: +void test02() +{ + bool test __attribute__((unused)) = true; + + std::regex re("[[:alnum:]]", std::regex_constants::basic); + re = "\\w+"; +} When we end up doing this to save run time, let's at least add in a comment the PR #, like // PR libstdc++/64680 before test02(). Paolo.
Re: [patch] Add and last pieces of C++11 std::lib
On 20/01/15 09:02 -0800, H.J. Lu wrote:
> On Linux/ia-32, I got this output:
>
> /tmp/ccApSqaQ.o: In function `facet::facet()':
> /export/gnu/import/git/gcc-test-x32/src-trunk/libstdc++-v3/testsuite/22_locale/ctype_byname/2.cc:29: undefined reference to `std::ctype_byname::ctype_byname(std::__cxx11::basic_string, std::allocator > const&, unsigned int)'
> /tmp/ccApSqaQ.o: In function `facet::facet()':
> /export/gnu/import/git/gcc-test-x32/src-trunk/libstdc++-v3/testsuite/22_locale/ctype_byname/2.cc:29: undefined reference to `std::ctype_byname::ctype_byname(std::__cxx11::basic_string, std::allocator > const&, unsigned int)'
> collect2: error: ld returned 1 exit status
> FAIL: 22_locale/ctype_byname/2.cc (test for excess errors)
> FAIL: libstdc++-abi/abi_check

I'll take a look later today.
Re: [patch] libstdc++/56785 reduce space overhead of nested tuples
On Fri, Jan 16, 2015 at 4:23 PM, Jonathan Wakely wrote: > This replaces the current empty _Tuple_impl that terminates the > recursive inheritance hierarchy, instead adding the extra code to the > last base class that holds data so that the recursion terminates there > instead. > > The purpose of this is to avoid nested tuples having two instances of > the same _Tuple_impl base class, which cannot be placed at the same > address and so take up space despite being empty. > > Tested x86_64-linux, committed to trunk. This may have caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64695 -- H.J.
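The space problem described in the quoted text can be reproduced without tuples. The sketch below uses made-up types (`Empty`, `Inner`, `Outer`), not libstdc++'s `_Tuple_impl`, and only assumes the usual rule that two distinct subobjects of the same type may not share an address:

```cpp
// Hypothetical types for illustration; not the libstdc++ implementation.
struct Empty {};

// Empty class with an empty base: the base takes no extra room.
struct Inner : Empty {};

// Has its own Empty base *and* a member that also has an Empty base.
// The two Empty subobjects may not share an address, so the member
// cannot sit at offset 0 and the class needs extra room for it.
struct Outer : Empty { Inner in; };

// True when the duplicated empty base costs space.
constexpr bool nested_empty_bases_cost_space()
{
  return sizeof(Outer) > sizeof(Inner);
}
```

A nested tuple hits the same situation when the outer and the inner tuple both end in the same empty terminating base class, which is what the patch avoids by ending the recursion at the last base that holds data.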
Re: Housekeeping work in backends.html
> Perhaps 'd' should just go away completely. It was intended to > distinguish between ports using the old scheduler description and > ports using the DFA model. But support for the old scheduler > description was removed some 10 years ago, and AFAIR the targets that > don't use the DFA scheduler don't use the scheduler at all. Yes, this makes sense, I'll do it (if you don't beat me to it), thanks. -- Eric Botcazou
Re: [patch] Add and last pieces of C++11 std::lib
On 20/01/15 09:02 -0800, H.J. Lu wrote:
> On Linux/ia-32, I got this output:
>
> /tmp/ccApSqaQ.o: In function `facet::facet()':
> /export/gnu/import/git/gcc-test-x32/src-trunk/libstdc++-v3/testsuite/22_locale/ctype_byname/2.cc:29: undefined reference to `std::ctype_byname::ctype_byname(std::__cxx11::basic_string, std::allocator > const&, unsigned int)'
> /tmp/ccApSqaQ.o: In function `facet::facet()':
> /export/gnu/import/git/gcc-test-x32/src-trunk/libstdc++-v3/testsuite/22_locale/ctype_byname/2.cc:29: undefined reference to `std::ctype_byname::ctype_byname(std::__cxx11::basic_string, std::allocator > const&, unsigned int)'
> collect2: error: ld returned 1 exit status
> FAIL: 22_locale/ctype_byname/2.cc (test for excess errors)
> FAIL: libstdc++-abi/abi_check

I forgot that the mangled name for size_t depends on the target, so the linker script needs [jm] instead of m. Patch coming soon ...
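To illustrate why the pattern needs `[jm]`: in the Itanium C++ ABI mangling, `unsigned int` is encoded as `j` and `unsigned long` as `m`, and `size_t` is a typedef for one or the other depending on the target. The following is a small sketch of that mapping; the function name `size_t_mangled_code` is invented for this example and nothing here comes from the patch itself:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper for illustration.
// Itanium C++ ABI builtin type codes:
//   'j' = unsigned int, 'm' = unsigned long, 'y' = unsigned long long.
// Returns the code that size_t mangles to on the current target,
// or '?' if size_t is some other type.
constexpr char size_t_mangled_code()
{
  return std::is_same<std::size_t, unsigned int>::value ? 'j'
       : std::is_same<std::size_t, unsigned long>::value ? 'm'
       : std::is_same<std::size_t, unsigned long long>::value ? 'y'
       : '?';
}
```

On a typical x86_64 GNU/Linux target this yields 'm' and on ia-32 it yields 'j', so a symbol pattern that hard-codes `m` for a `size_t` parameter would miss the 32-bit symbol.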
Re: [PATCH] Fix up dwarf2out ICE (PR debug/64663)
On 01/20/2015 03:30 AM, Jakub Jelinek wrote: > 2015-01-20 Jakub Jelinek > > PR debug/64663 > * dwarf2out.c (decl_piece_node): Don't put bitsize into > mode if bitsize <= 0. > (decl_piece_bitsize, adjust_piece_list, add_var_loc_to_decl, > dw_sra_loc_expr): Use HOST_WIDE_INT instead of int for bit > sizes and positions. > > * gcc.dg/pr64663.c: New test. Ok. r~
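As general background on the second ChangeLog item (this is not the PR's test case, just a sketch under assumptions): a size expressed in bits is eight times the size in bytes, so it can exceed the range of a 32-bit `int` well before the byte size does. Here `wide_int_t` is a stand-in for GCC's `HOST_WIDE_INT`, assumed to be 64 bits wide, and both helper names are hypothetical:

```cpp
#include <climits>
#include <cstdint>

// Stand-in for HOST_WIDE_INT; assumed to be a 64-bit signed type here.
typedef std::int64_t wide_int_t;

// Size in bits of an object that is `bytes` bytes long.
constexpr wide_int_t bits_in(wide_int_t bytes)
{
  return bytes * CHAR_BIT;
}

// Whether a wide value could be stored in a plain int without change.
constexpr bool fits_in_int(wide_int_t v)
{
  return v >= INT_MIN && v <= INT_MAX;
}
```

With a 32-bit `int`, a 256 MiB object already has a bit size of 2^31, which no longer fits, whereas the wider type represents it exactly.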
Re: RFA: RL78: Scan inside PARALLELs when looking for dead code
> Here is a small patch to fix a code-gen problem for the RL78. The bug > was that the register death pass was not looking inside PARALLELs, and > thus missing some USE and SET cases. I considered adding code to scan > all of the elements in the PARALLEL, but the only ones that can be > generated for the RL78 are the SImode shift patterns and these always > consist of a SET as the first element and a CLOBBER as the second > element. So I decided to keep things simple and create the patch > below. > > OK to apply ? Ok. Thanks!
Re: [PATCH]Skip g++.dg/tls tests on target using status wrapper
On Jan 20, 2015, at 6:29 AM, Renlin Li wrote:
> This patch will add "unwrapped" target selector for g++.dg/tls tests.
> This patch will skip those tests as the intended exit code is not correctly
> captured by dejagnu.
>
> Okay to commit?

Ok. It would be nice if someone could engineer a way to do this with wrappers. Wrapping _exit sounds nice but dangerous.
Solve more of firefox LTO regression
Hi,
this patch relaxes the inliner to allow limited cross-module inlining across units compiled with -O3 and -Os. This was tested with Firefox and it leads to a binary of about the same size that is noticeably faster in some of the JavaScript benchmarks.

Bootstrapped/regtested x86_64-linux, committed.

Honza

2015-01-19  Jan Hubicka

    PR lto/45375
    * ipa-inline.c: Include lto-streamer.h.
    (report_inline_failed_reason): Output source file differences and
    flags on optimization/target node mismatch.
    (can_inline_edge_p): Consider caller to be the outer inline function;
    be less restrictive about matching optimize and optimize_size
    attributes.
    (inline_account_function_p): Break out from ...
    (inline_small_functions): ... here.
    * ipa-inline-transform.c (clone_inlined_nodes): Use
    inline_account_function_p.
    (inline_call): Use optimize attribution; use
    inline_account_function_p.
    (inline_transform): Use opt_for_fn.
    * ipa-inline.h (inline_account_function_p): Declare.

Index: ipa-inline.c
===================================================================
--- ipa-inline.c (revision 219876)
+++ ipa-inline.c (working copy)
@@ -145,6 +145,7 @@ along with GCC; see the file COPYING3.
#include "cilk.h" #include "builtins.h" #include "fibonacci_heap.h" +#include "lto-streamer.h" typedef fibonacci_heap edge_heap_t; typedef fibonacci_node edge_heap_node_t; @@ -260,6 +261,23 @@ report_inline_failed_reason (struct cgra xstrdup_for_dump (e->caller->name ()), e->caller->order, xstrdup_for_dump (e->callee->name ()), e->callee->order, cgraph_inline_failed_string (e->inline_failed)); + if ((e->inline_failed == CIF_TARGET_OPTION_MISMATCH + || e->inline_failed == CIF_OPTIMIZATION_MISMATCH) + && e->caller->lto_file_data + && e->callee->function_symbol ()->lto_file_data) + { + fprintf (dump_file, " LTO objects: %s, %s\n", + e->caller->lto_file_data->file_name, + e->callee->function_symbol ()->lto_file_data->file_name); + } + if (e->inline_failed == CIF_TARGET_OPTION_MISMATCH) + cl_target_option_print_diff +(dump_file, 2, target_opts_for_fn (e->caller->decl), + target_opts_for_fn (e->callee->ultimate_alias_target ()->decl)); + if (e->inline_failed == CIF_OPTIMIZATION_MISMATCH) + cl_optimization_print_diff + (dump_file, 2, opts_for_fn (e->caller->decl), + opts_for_fn (e->callee->ultimate_alias_target ()->decl)); } } @@ -297,10 +315,12 @@ can_inline_edge_p (struct cgraph_edge *e bool inlinable = true; enum availability avail; cgraph_node *callee = e->callee->ultimate_alias_target (&avail); - tree caller_tree = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (e->caller->decl); + cgraph_node *caller = e->caller->global.inlined_to + ? e->caller->global.inlined_to : e->caller; + tree caller_tree = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (caller->decl); tree callee_tree = callee ? DECL_FUNCTION_SPECIFIC_OPTIMIZATION (callee->decl) : NULL; - struct function *caller_fun = e->caller->get_fun (); + struct function *caller_fun = caller->get_fun (); struct function *callee_fun = callee ? 
callee->get_fun () : NULL; gcc_assert (e->inline_failed); @@ -333,9 +353,9 @@ can_inline_edge_p (struct cgraph_edge *e inlinable = false; } /* Don't inline if the functions have different EH personalities. */ - else if (DECL_FUNCTION_PERSONALITY (e->caller->decl) + else if (DECL_FUNCTION_PERSONALITY (caller->decl) && DECL_FUNCTION_PERSONALITY (callee->decl) - && (DECL_FUNCTION_PERSONALITY (e->caller->decl) + && (DECL_FUNCTION_PERSONALITY (caller->decl) != DECL_FUNCTION_PERSONALITY (callee->decl))) { e->inline_failed = CIF_EH_PERSONALITY; @@ -344,7 +364,7 @@ can_inline_edge_p (struct cgraph_edge *e /* TM pure functions should not be inlined into non-TM_pure functions. */ else if (is_tm_pure (callee->decl) - && !is_tm_pure (e->caller->decl)) + && !is_tm_pure (caller->decl)) { e->inline_failed = CIF_UNSPECIFIED; inlinable = false; @@ -360,14 +380,14 @@ can_inline_edge_p (struct cgraph_edge *e inlinable = false; } /* Check compatibility of target optimization options. */ - else if (!targetm.target_option.can_inline_p (e->caller->decl, + else if (!targetm.target_option.can_inline_p (caller->decl, callee->decl)) { e->inline_failed = CIF_TARGET_OPTION_MISMATCH; inlinable = false; } /* Don't inline a function with mismatched sanitization attributes. */ - else if (!sanitize_attrs_match_for_inline_p (e->caller->decl, callee->decl)) + else if (!sanitize_attrs_match_for_inline_p (caller->decl, callee->decl)) { e->inline_f
Fix profile merging WRT speculative edges
Hi, this patch fixes ICE in ipa_merge_profiles on speculative edges. Bootstrapped/regtested x86_64-linux, comitted. Also tested by Markus on Firefox build. PR ipa/63576 * ipa-utils.c (ipa_merge_profiles): Merge speculative edges. Index: ipa-utils.c === --- ipa-utils.c (revision 219871) +++ ipa-utils.c (working copy) @@ -539,7 +539,7 @@ ipa_merge_profiles (struct cgraph_node * } if (match) { - struct cgraph_edge *e; + struct cgraph_edge *e, *e2; basic_block srcbb, dstbb; /* TODO: merge also statement histograms. */ @@ -562,19 +562,95 @@ ipa_merge_profiles (struct cgraph_node * pop_cfun (); for (e = dst->callees; e; e = e->next_callee) { - gcc_assert (!e->speculative); + if (e->speculative) + continue; e->count = gimple_bb (e->call_stmt)->count; e->frequency = compute_call_stmt_bb_frequency (dst->decl, gimple_bb (e->call_stmt)); } - for (e = dst->indirect_calls; e; e = e->next_callee) + for (e = dst->indirect_calls, e2 = src->indirect_calls; e; + e2 = (e2 ? e2->next_callee : NULL), e = e->next_callee) { - gcc_assert (!e->speculative); - e->count = gimple_bb (e->call_stmt)->count; - e->frequency = compute_call_stmt_bb_frequency -(dst->decl, - gimple_bb (e->call_stmt)); + gcov_type count = gimple_bb (e->call_stmt)->count; + int freq = compute_call_stmt_bb_frequency + (dst->decl, +gimple_bb (e->call_stmt)); + /* When call is speculative, we need to re-distribute probabilities +the same way as they was. This is not really correct because +in the other copy the speculation may differ; but probably it +is not really worth the effort. */ + if (e->speculative) + { + cgraph_edge *direct, *indirect; + cgraph_edge *direct2 = NULL, *indirect2 = NULL; + ipa_ref *ref; + + e->speculative_call_info (direct, indirect, ref); + gcc_assert (e == indirect); + if (e2 && e2->speculative) + e2->speculative_call_info (direct2, indirect2, ref); + if (indirect->count || direct->count) + { + /* We should mismatch earlier if there is no matching +indirect edge. 
*/ + if (!e2) + { + if (dump_file) + fprintf (dump_file, +"Mismatch in merging indirect edges\n"); + } + else if (!e2->speculative) + indirect->count += e2->count; + else if (e2->speculative) + { + if (DECL_ASSEMBLER_NAME (direct2->callee->decl) + != DECL_ASSEMBLER_NAME (direct->callee->decl)) + { + if (direct2->count >= direct->count) + { + direct->redirect_callee (direct2->callee); + indirect->count += indirect2->count ++ direct->count; + direct->count = direct2->count; + } + else + indirect->count += indirect2->count + direct2->count; + } + else + { + direct->count += direct2->count; + indirect->count += indirect2->count; + } + } + int prob = RDIV (direct->count * REG_BR_PROB_BASE , + direct->count + indirect->count); + direct->frequency = RDIV (freq * prob, REG_BR_PROB_BASE); + indirect->frequency = RDIV (freq * (REG_BR_PROB_BASE - prob), + REG_BR_PROB_BASE); + } + else + /* At the moment we should have only profile feedback based + speculations when merging. */ + gcc_unreachable (); + } + else if (e2->speculative) + { + cgraph_edge *direct, *indirect; + ipa_ref *ref; + + e2->speculative_call_info (direct, indirect, ref); + e->count = count; + e->frequency = freq; + int prob = RDIV (direct->count * REG_BR_PROB_BASE, e->count); + e->make_speculative (direct->callee, direct->count, + RDIV (freq * prob, REG_BR_PROB_BASE)); + } + else +
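The frequency split in the hunk above can be worked through with small numbers. This sketch assumes GCC's definitions at the time, namely `REG_BR_PROB_BASE` equal to 10000 and `RDIV(X, Y)` as division rounded to nearest, `((X) + (Y) / 2) / (Y)`; the constant and function names below are stand-ins, not GCC identifiers:

```cpp
#include <cstdint>

// Assumed value of REG_BR_PROB_BASE.
constexpr std::int64_t kProbBase = 10000;

// Assumed behaviour of RDIV: integer division rounded to nearest.
constexpr std::int64_t rdiv(std::int64_t x, std::int64_t y)
{
  return (x + y / 2) / y;
}

// Probability (scaled by kProbBase) that the speculated direct call is taken,
// given the merged execution counts of the direct and indirect edges.
constexpr std::int64_t direct_prob(std::int64_t direct_count,
                                   std::int64_t indirect_count)
{
  return rdiv(direct_count * kProbBase, direct_count + indirect_count);
}

// Share of the call statement's frequency given to the direct edge.
constexpr std::int64_t direct_freq(std::int64_t freq, std::int64_t prob)
{
  return rdiv(freq * prob, kProbBase);
}

// Remaining share of the frequency, given to the indirect edge.
constexpr std::int64_t indirect_freq(std::int64_t freq, std::int64_t prob)
{
  return rdiv(freq * (kProbBase - prob), kProbBase);
}
```

For example, with merged counts of 300 (direct) and 100 (indirect) and a block frequency of 1000, the probability comes out as 7500 and the frequency is split 750 to 250.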
[committed] Fix pr49888.c test
Hi! I've noticed this test started failing with newer gdb. The actual problem is that the store to v got optimized away completely some time ago and so the breakpoint on that line is problematic. Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk as obvious. 2015-01-20 Jakub Jelinek * gcc.dg/guality/pr49888.c (v): Add __attribute__((used)). --- gcc/testsuite/gcc.dg/guality/pr49888.c.jj 2012-06-28 13:33:21.0 +0200 +++ gcc/testsuite/gcc.dg/guality/pr49888.c 2015-01-20 12:32:11.136906646 +0100 @@ -2,7 +2,7 @@ /* { dg-do run } */ /* { dg-options "-g" } */ -static int v; +static int v __attribute__((used)); static void __attribute__((noinline, noclone)) f (int *p) Jakub
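For context, a minimal sketch of the attribute used in the fix. `__attribute__((used))` is a GCC extension (also accepted by Clang) that asks the compiler to emit a static variable even if it appears unreferenced. The function names below are invented, and the example only shows the syntax and that a store through the helper is visible afterwards; it cannot demonstrate debugger behaviour:

```cpp
// GCC/Clang extension; other compilers may reject these attributes.
static int v __attribute__((used));

// Kept out of line so the store is a separate function, in the same
// spirit as f() in the test case (hypothetical simplified version).
static void __attribute__((noinline))
store(int *p)
{
  v = *p;
}

// Hypothetical driver: store a value and read it back.
int store_and_read(int value)
{
  store(&value);
  return v;
}
```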
[committed] Fix ubsan -fsanitize=vptr sanitization (PR sanitizer/64632)
Hi! This patch reverts an apparently bad change made upstream, that got already reverted upstream in December. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2015-01-20 Jakub Jelinek PR sanitizer/64632 * ubsan/ubsan_type_hash.cc: Cherry pick upstream r224972. * g++.dg/ubsan/pr64632.C: New test. --- gcc/testsuite/g++.dg/ubsan/pr64632.C.jj 2015-01-20 16:53:42.560478816 +0100 +++ gcc/testsuite/g++.dg/ubsan/pr64632.C2015-01-20 16:53:37.683562024 +0100 @@ -0,0 +1,23 @@ +// PR sanitizer/64632 +// { dg-do run } +// { dg-options "-fsanitize=vptr -fno-sanitize-recover=vptr" } + +struct S +{ + S () : a(0) {} + int a; + int f () { return a; } + virtual int v () { return 0; } +}; + +struct X : virtual S +{ + int v () { return 2; } +}; + +int +main () +{ + X x; + return x.f (); +} --- libsanitizer/ubsan/ubsan_type_hash.cc (revision 224971) +++ libsanitizer/ubsan/ubsan_type_hash.cc (revision 224972) @@ -115,8 +115,7 @@ __ubsan::__ubsan_vptr_type_cache[__ubsan /// \brief Determine whether \p Derived has a \p Base base class subobject at /// offset \p Offset. 
-static bool isDerivedFromAtOffset(sptr Object, - const abi::__class_type_info *Derived, +static bool isDerivedFromAtOffset(const abi::__class_type_info *Derived, const abi::__class_type_info *Base, sptr Offset) { if (Derived->__type_name == Base->__type_name) @@ -124,7 +123,7 @@ static bool isDerivedFromAtOffset(sptr O if (const abi::__si_class_type_info *SI = dynamic_cast(Derived)) -return isDerivedFromAtOffset(Object, SI->__base_type, Base, Offset); +return isDerivedFromAtOffset(SI->__base_type, Base, Offset); const abi::__vmi_class_type_info *VTI = dynamic_cast(Derived); @@ -139,13 +138,13 @@ static bool isDerivedFromAtOffset(sptr O sptr OffsetHere = VTI->base_info[base].__offset_flags >> abi::__base_class_type_info::__offset_shift; if (VTI->base_info[base].__offset_flags & - abi::__base_class_type_info::__virtual_mask) { - sptr VTable = *reinterpret_cast(Object); - OffsetHere = *reinterpret_cast(VTable + OffsetHere); -} -if (isDerivedFromAtOffset(Object + OffsetHere, - VTI->base_info[base].__base_type, Base, - Offset - OffsetHere)) + abi::__base_class_type_info::__virtual_mask) + // For now, just punt on virtual bases and say 'yes'. + // FIXME: OffsetHere is the offset in the vtable of the virtual base + //offset. Read the vbase offset out of the vtable and use it. + return true; +if (isDerivedFromAtOffset(VTI->base_info[base].__base_type, + Base, Offset - OffsetHere)) return true; } @@ -154,15 +153,14 @@ static bool isDerivedFromAtOffset(sptr O /// \brief Find the derived-most dynamic base class of \p Derived at offset /// \p Offset. 
-static const abi::__class_type_info * -findBaseAtOffset(sptr Object, const abi::__class_type_info *Derived, - sptr Offset) { +static const abi::__class_type_info *findBaseAtOffset( +const abi::__class_type_info *Derived, sptr Offset) { if (!Offset) return Derived; if (const abi::__si_class_type_info *SI = dynamic_cast(Derived)) -return findBaseAtOffset(Object, SI->__base_type, Offset); +return findBaseAtOffset(SI->__base_type, Offset); const abi::__vmi_class_type_info *VTI = dynamic_cast(Derived); @@ -174,13 +172,12 @@ findBaseAtOffset(sptr Object, const abi: sptr OffsetHere = VTI->base_info[base].__offset_flags >> abi::__base_class_type_info::__offset_shift; if (VTI->base_info[base].__offset_flags & - abi::__base_class_type_info::__virtual_mask) { - sptr VTable = *reinterpret_cast(Object); - OffsetHere = *reinterpret_cast(VTable + OffsetHere); -} -if (const abi::__class_type_info *Base = findBaseAtOffset( -Object + OffsetHere, VTI->base_info[base].__base_type, -Offset - OffsetHere)) + abi::__base_class_type_info::__virtual_mask) + // FIXME: Can't handle virtual bases yet. + continue; +if (const abi::__class_type_info *Base = + findBaseAtOffset(VTI->base_info[base].__base_type, + Offset - OffsetHere)) return Base; } @@ -232,8 +229,7 @@ bool __ubsan::checkDynamicType(void *Obj return false; abi::__class_type_info *Base = (abi::__class_type_info*)Type; - if (!isDerivedFromAtOffset(reinterpret_cast(Object), Derived, Base, - -Vtable->Offset)) + if (!isDerivedFromAtOffset(Derived, Base, -Vtable->Offset)) return false; // Success. Cache this result. @@ -247,9 +243,8 @@ __ubsan::DynamicTypeInfo __ubsan::getDyn if (!Vtable) return DynamicTy
Re: [PATCH 3/5] IPA ICF pass
On Thu, Nov 13, 2014 at 2:17 PM, H.J. Lu wrote: > On Wed, Oct 15, 2014 at 10:03 AM, Martin Liška wrote: >> >> Hello >> >> There's final version of the patch I'm going to commit tomorrow in the >> morning (CEST). >> Thank you Honza for the review. >> >> Martin > > This caused: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63856 > This also caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64693 -- H.J.
Re: Fix profile merging WRT speculative edges
On 2015.01.20 at 21:04 +0100, Jan Hubicka wrote: > this patch fixes ICE in ipa_merge_profiles on speculative edges. > > Bootstrapped/regtested x86_64-linux, comitted. Also tested by Markus on > Firefox build. This needs one additional fix. See below. > PR ipa/63576 > * ipa-utils.c (ipa_merge_profiles): Merge speculative edges. > Index: ipa-utils.c > === > --- ipa-utils.c (revision 219871) > +++ ipa-utils.c (working copy) > @@ -539,7 +539,7 @@ ipa_merge_profiles (struct cgraph_node * > } >if (match) > { > - struct cgraph_edge *e; > + struct cgraph_edge *e, *e2; >basic_block srcbb, dstbb; > >/* TODO: merge also statement histograms. */ > @@ -562,19 +562,95 @@ ipa_merge_profiles (struct cgraph_node * >pop_cfun (); >for (e = dst->callees; e; e = e->next_callee) > { > - gcc_assert (!e->speculative); > + if (e->speculative) > + continue; > e->count = gimple_bb (e->call_stmt)->count; > e->frequency = compute_call_stmt_bb_frequency >(dst->decl, > gimple_bb (e->call_stmt)); > } > - for (e = dst->indirect_calls; e; e = e->next_callee) > + for (e = dst->indirect_calls, e2 = src->indirect_calls; e; > +e2 = (e2 ? e2->next_callee : NULL), e = e->next_callee) > { > - gcc_assert (!e->speculative); > - e->count = gimple_bb (e->call_stmt)->count; > - e->frequency = compute_call_stmt_bb_frequency > - (dst->decl, > - gimple_bb (e->call_stmt)); > + gcov_type count = gimple_bb (e->call_stmt)->count; > + int freq = compute_call_stmt_bb_frequency > + (dst->decl, > + gimple_bb (e->call_stmt)); > + /* When call is speculative, we need to re-distribute probabilities > + the same way as they was. This is not really correct because > + in the other copy the speculation may differ; but probably it > + is not really worth the effort. 
*/ > + if (e->speculative) > + { > + cgraph_edge *direct, *indirect; > + cgraph_edge *direct2 = NULL, *indirect2 = NULL; > + ipa_ref *ref; > + > + e->speculative_call_info (direct, indirect, ref); > + gcc_assert (e == indirect); > + if (e2 && e2->speculative) > + e2->speculative_call_info (direct2, indirect2, ref); > + if (indirect->count || direct->count) > + { > + /* We should mismatch earlier if there is no matching > + indirect edge. */ > + if (!e2) > + { > + if (dump_file) > + fprintf (dump_file, > + "Mismatch in merging indirect edges\n"); > + } > + else if (!e2->speculative) > + indirect->count += e2->count; > + else if (e2->speculative) > + { > + if (DECL_ASSEMBLER_NAME (direct2->callee->decl) > + != DECL_ASSEMBLER_NAME (direct->callee->decl)) > + { > + if (direct2->count >= direct->count) > + { > + direct->redirect_callee (direct2->callee); > + indirect->count += indirect2->count > + + direct->count; > + direct->count = direct2->count; > + } > + else > + indirect->count += indirect2->count + > direct2->count; > + } > + else > + { > +direct->count += direct2->count; > +indirect->count += indirect2->count; > + } > + } > + int prob = RDIV (direct->count * REG_BR_PROB_BASE , > + direct->count + indirect->count); > + direct->frequency = RDIV (freq * prob, REG_BR_PROB_BASE); > + indirect->frequency = RDIV (freq * (REG_BR_PROB_BASE - prob), > + REG_BR_PROB_BASE); > + } > + else > + /* At the moment we should have only profile feedback based > +speculations when merging. */ > + gcc_unreachable (); > + } > + else if (e2->speculative) + else if (e2 && e2->speculative) Otherwise it will crash: lto1: internal compiler error: Segmentation fault 0xa12f6f crash_signal ../../gcc/gcc/toplev.c:381 0x88b190 ipa_merge_profiles(cgraph_node*, cgraph_node*) ../../gcc/gcc/ipa-utils.c:637 0x603722 lto_cgraph_replace_node ../../gcc
Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1
Seems like the thread might have died down, so just wanted to ping it. As Marcus says, this is holding up other patches so it'd be good to get something in soon. Would it be OK to commit the original patch or should we wait? Marcus Shawcroft writes: > On 14 January 2015 at 07:35, Jeff Law wrote: >> On 01/13/15 11:55, Eric Botcazou wrote: >>> >>> (1) we have a non-paradoxical subreg; (2) both (reg:ymode xregno) and (reg:xmode xregno) occupy full hard registers (no padding or unused upper bits); (3) (reg:ymode xregno) and (reg:xmode xregno) store the same number of bytes (X) in each constituent hard register; (4) the offset is a multiple of X, i.e. the data we're accessing is aligned to a register boundary; and (5) endianness is regular (no differences between words and bytes, or between registers and memory) >>> >>> >>> OK, that's a nice translation of the new code. :-) >>> >>> It seems to me that the patch wants to extend the support of generic >>> subregs >>> to modes whose sizes are not multiple of each other, which is a >>> requirement of >>> the existing code, but does that in a very specific case for the sake of >>> the >>> ARM port without saying where all the above restrictions come from. >> >> Basically we're lifting the restriction that the the sizes are multiples of >> each other. The requirements above are the set where we know it will work. >> They are target independent, but happen to match what the ARM needs. >> >> The certainly do short circuit the meat of the function, that's the whole >> point, there's this set of conditions under which we know this will work and >> when they hold, we bypass. >> >> Now one could argue that instead of bypassing we should put the code to >> handle this situation further down. I'd be leery of doing that just from a >> complexity standpoint. But one could also argue that short circuiting like >> the patch does adds complexity as well and may be a bit kludgy. Yeah, I'm worried about the complexity too. 
We allow subregs that have padding and subregs where the number of bytes in the mode doesn't divide equally between the number of registers. We also have subregs where a DImode value in R can take a different number of registers from a DFmode value in R, despite the two modes having the same number of bits. I've no idea how we'd generalise the code so that those cases and the new one just fall out as particular inputs to an overarching equation. Or how we make sure that the equation doesn't give nonsense results for cases that would be better off triggering an abort (e.g. DFmode subregs of CImode when DImode and DFmode occupy different numbers of registers).

I don't think we want to allow subregs in all cases where there is padding. We hit a similar case with 8-byte subregs of 24-byte values stored in 16-byte registers (DImode, EImode and TImode respectively). That doesn't do what we want because all three DImode pieces of the EImode aren't independently addressable, so the abort actually helped.

TBH I find even the current code too hard to understand. I can plug specific inputs in and follow what happens, but I don't have a feel for why that's the right way of handling all possible inputs.

In some ways I think we've made life hard for ourselves by trying to implement all these rules in a target-independent way. Subregs on memory are easy (even though we should generally be avoiding them :-): they start SUBREG_BYTE bytes into the MEM and occupy the number of bytes in the outer mode. At least AFAIK, we never have a situation where an N-bit float can occupy a different number of memory bytes from an N-bit integer. But treating REGs as an image of memory (which I think is effectively what we're doing) has caused problems.

As well as being complicated, doing things this way is pretty restrictive. One of the main uses of CANNOT_CHANGE_MODE_CLASS seems to be to work around cases where the generic rules get it wrong.
Sometimes it seems like it would be better to let the target define which subregs it can form and on which registers. It would be less complicated, more general, and (in cases where it allows C_C_M_C to be removed) hopefully more optimal.

>> Maybe the way forward here is for someone to try and integrate this support
>> in the main part of the code and see how it looks. Then we can pick one.

I still have a mental block on how to do that :-)

>> The downside is since this probably isn't a regression that work would need
>> to happen quickly to make it into gcc-5.
>>
>> Which leads to another option, get the release managers to sign off on the
>> kludge after gcc-5 branches and only install the kludge on the gcc-5 branch
>> and insisting the other solution go in for gcc-6 and beyond. Not sure if
>> they'd do that, but it's a discussion that could happen.
>
> This issue is currently gating a number of patches that get big endian
> working on aarch64 (all of which are on the
[COMMITTED] Fix pr libffi/64581
One of the few changes to libffi.exp that we picked up that looked quite logical. But I guess it turns out that we haven't properly configured the c++ compiler in dejagnu for libffi. Easiest thing to do is just revert this hunk. r~ PR libffi/64581 * testsuite/lib/libffi.exp (libffi_target_compile): Don't switch to C++ mode when compiling C++ source code. --- a/libffi/testsuite/lib/libffi.exp +++ b/libffi/testsuite/lib/libffi.exp @@ -219,10 +219,6 @@ proc libffi_target_compile { source dest type options } { lappend options "libs= -lpthread" } -if { [string match "*.cc" $source] } { - lappend options "c++" -} - verbose "options: $options" return [target_compile $source $dest $type $options] }
Go patch committed: Do not mark unused variables as used in closures
This patch from Chris Manghane fixes the Go frontend to not always mark variables in closures as used. This is http://golang.org/issue/6415. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r cbdec4465fa8 go/parse.cc --- a/go/parse.cc Tue Jan 20 08:11:07 2015 -0800 +++ b/go/parse.cc Tue Jan 20 13:24:35 2015 -0800 @@ -2450,7 +2450,7 @@ && (named_object->is_variable() || named_object->is_result_variable())) return this->enclosing_var_reference(in_function, named_object, - location); + may_be_sink, location); switch (named_object->classification()) { @@ -2591,11 +2591,14 @@ Expression* Parse::enclosing_var_reference(Named_object* in_function, Named_object* var, - Location location) + bool may_be_sink, Location location) { go_assert(var->is_variable() || var->is_result_variable()); - this->mark_var_used(var); + // Any left-hand-side can be a sink, so if this can not be + // a sink, then it must be a use of the variable. + if (!may_be_sink) +this->mark_var_used(var); Named_object* this_function = this->gogo_->current_function(); Named_object* closure = this_function->func_value()->closure_var(); @@ -2912,7 +2915,7 @@ ref = Expression::make_var_reference(var, location); else ref = this->enclosing_var_reference(ev[i].in_function(), var, - location); + true, location); Expression* refaddr = Expression::make_unary(OPERATOR_AND, ref, location); initializer->push_back(refaddr); @@ -3215,7 +3218,7 @@ if (in_function != NULL && in_function != this->gogo_->current_function() && (named_object->is_variable() || named_object->is_result_variable())) -return this->enclosing_var_reference(in_function, named_object, +return this->enclosing_var_reference(in_function, named_object, is_lhs, location); switch (named_object->classification()) @@ -5722,6 +5725,20 @@ Var_expression* ve = expr->var_expression(); if (ve != NULL) this->mark_var_used(ve->named_object()); + else if (expr->deref()->field_reference_expression() != NULL + && 
this->gogo_->current_function() != NULL) +{ + // We could be looking at a variable referenced from a closure. + // If so, we need to get the enclosed variable and mark it as used. + Function* this_function = this->gogo_->current_function()->func_value(); + Named_object* closure = this_function->closure_var(); + if (closure != NULL) + { + unsigned int var_index = + expr->deref()->field_reference_expression()->field_index(); + this->mark_var_used(this_function->enclosing_var(var_index - 1)); + } +} return expr; } diff -r cbdec4465fa8 go/parse.h --- a/go/parse.hTue Jan 20 08:11:07 2015 -0800 +++ b/go/parse.hTue Jan 20 13:24:35 2015 -0800 @@ -218,7 +218,7 @@ Typed_identifier* receiver(); Expression* operand(bool may_be_sink, bool *is_parenthesized); Expression* enclosing_var_reference(Named_object*, Named_object*, - Location); + bool may_be_sink, Location); Expression* composite_lit(Type*, int depth, Location); Expression* function_lit(); Expression* create_closure(Named_object* function, Enclosing_vars*,
Re: Cleanup and speedup inliner after conversion of heap to sreals
On Tue, Dec 16, 2014 at 2:08 PM, Jan Hubicka wrote: > Hi, > conversion to sreal makes it possible to compute badness in a more streamlined > manner. Together with the sreal::normalize change this patch finally makes > the fibheap badness calculation drop off the radar in profiling and I hope it > makes it more maintainable by eliminating the issues from roundoff errors and > overflows that were quite painful. > > Incrementally it is also possible to turn time computation into sreals. > > Bootstrapped/regtested x86_64-linux, will wait with the commit for the > previous changes to show up in the benchmark testers. > > Thanks to Martin and Trevor for working on sreals&fibheap changes. > > Honza > > * ipa-inline.c (cgraph_freq_base_rec, percent_rec): New functions. > (compute_uninlined_call_time): Return sreal. > (compute_inlined_call_time): Return sreal. > (big_speedup_p): Update to be computed with sreals. > (relative_time_benefit): Turn to sreal computation; return > value as numerator/denominator to save division. > (edge_badness): Rewrite to sreals; remove overflow checks > and cleanup. > (ipa_inline): Initialize cgraph_freq_base_rec and percent_rec. > (inline_small_functions): Update dumping; speedup fibheap maintenance. > (update_edge_key): Update dumping. This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64694 -- H.J.
Re: [[ARM/AArch64][testsuite] 06/36] Add vmla and vmls tests.
On 19 January 2015 at 14:35, Marcus Shawcroft wrote: > On 13 January 2015 at 15:18, Christophe Lyon > wrote: >> >> * gcc.target/aarch64/advsimd-intrinsics/vmlX.inc: New file. >> * gcc.target/aarch64/advsimd-intrinsics/vmla.c: New file. >> * gcc.target/aarch64/advsimd-intrinsics/vmls.c: New file. > > OK with the the vmlx poly ops dropped /M Thanks, here is what I have committed (I removed the 64 bits elements vectors, in addition to the poly ones). Christophe Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 219916) +++ gcc/testsuite/ChangeLog (working copy) @@ -1,5 +1,11 @@ 2015-01-20 Christophe Lyon + * gcc.target/aarch64/advsimd-intrinsics/vmlX.inc: New file. + * gcc.target/aarch64/advsimd-intrinsics/vmla.c: New file. + * gcc.target/aarch64/advsimd-intrinsics/vmls.c: New file. + +2015-01-20 Christophe Lyon + * gcc.target/aarch64/advsimd-intrinsics/vldX_dup.c: New file. 2015-01-20 Jakub Jelinek Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX.inc === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX.inc (revision 0) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX.inc (working copy) @@ -0,0 +1,123 @@ +#define FNNAME1(NAME) exec_ ## NAME +#define FNNAME(NAME) FNNAME1(NAME) + +void FNNAME (INSN_NAME) (void) +{ +#define DECL_VMLX(T, W, N) \ + DECL_VARIABLE(vector1, T, W, N); \ + DECL_VARIABLE(vector2, T, W, N); \ + DECL_VARIABLE(vector3, T, W, N); \ + DECL_VARIABLE(vector_res, T, W, N) + + /* vector_res = vmla(vector, vector3, vector4), + then store the result. 
*/ +#define TEST_VMLX1(INSN, Q, T1, T2, W, N)\ + VECT_VAR(vector_res, T1, W, N) = \ +INSN##Q##_##T2##W(VECT_VAR(vector1, T1, W, N), \ + VECT_VAR(vector2, T1, W, N), \ + VECT_VAR(vector3, T1, W, N)); \ + vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), \ + VECT_VAR(vector_res, T1, W, N)) + +#define TEST_VMLX(INSN, Q, T1, T2, W, N) \ + TEST_VMLX1(INSN, Q, T1, T2, W, N) + + DECL_VMLX(int, 8, 8); + DECL_VMLX(int, 16, 4); + DECL_VMLX(int, 32, 2); + DECL_VMLX(uint, 8, 8); + DECL_VMLX(uint, 16, 4); + DECL_VMLX(uint, 32, 2); + DECL_VMLX(float, 32, 2); + DECL_VMLX(int, 8, 16); + DECL_VMLX(int, 16, 8); + DECL_VMLX(int, 32, 4); + DECL_VMLX(uint, 8, 16); + DECL_VMLX(uint, 16, 8); + DECL_VMLX(uint, 32, 4); + DECL_VMLX(float, 32, 4); + + clean_results (); + + VLOAD(vector1, buffer, , int, s, 8, 8); + VLOAD(vector1, buffer, , int, s, 16, 4); + VLOAD(vector1, buffer, , int, s, 32, 2); + VLOAD(vector1, buffer, , uint, u, 8, 8); + VLOAD(vector1, buffer, , uint, u, 16, 4); + VLOAD(vector1, buffer, , uint, u, 32, 2); + VLOAD(vector1, buffer, , float, f, 32, 2); + VLOAD(vector1, buffer, q, int, s, 8, 16); + VLOAD(vector1, buffer, q, int, s, 16, 8); + VLOAD(vector1, buffer, q, int, s, 32, 4); + VLOAD(vector1, buffer, q, uint, u, 8, 16); + VLOAD(vector1, buffer, q, uint, u, 16, 8); + VLOAD(vector1, buffer, q, uint, u, 32, 4); + VLOAD(vector1, buffer, q, float, f, 32, 4); + + VDUP(vector2, , int, s, 8, 8, 0x11); + VDUP(vector2, , int, s, 16, 4, 0x22); + VDUP(vector2, , int, s, 32, 2, 0x33); + VDUP(vector2, , uint, u, 8, 8, 0x44); + VDUP(vector2, , uint, u, 16, 4, 0x55); + VDUP(vector2, , uint, u, 32, 2, 0x66); + VDUP(vector2, , float, f, 32, 2, 33.1f); + VDUP(vector2, q, int, s, 8, 16, 0x77); + VDUP(vector2, q, int, s, 16, 8, 0x88); + VDUP(vector2, q, int, s, 32, 4, 0x99); + VDUP(vector2, q, uint, u, 8, 16, 0xAA); + VDUP(vector2, q, uint, u, 16, 8, 0xBB); + VDUP(vector2, q, uint, u, 32, 4, 0xCC); + VDUP(vector2, q, float, f, 32, 4, 99.2f); + + VDUP(vector3, , int, s, 8, 8, 0xFF); + 
VDUP(vector3, , int, s, 16, 4, 0xEE); + VDUP(vector3, , int, s, 32, 2, 0xDD); + VDUP(vector3, , uint, u, 8, 8, 0xCC); + VDUP(vector3, , uint, u, 16, 4, 0xBB); + VDUP(vector3, , uint, u, 32, 2, 0xAA); + VDUP(vector3, , float, f, 32, 2, 10.23f); + VDUP(vector3, q, int, s, 8, 16, 0x99); + VDUP(vector3, q, int, s, 16, 8, 0x88); + VDUP(vector3, q, int, s, 32, 4, 0x77); + VDUP(vector3, q, uint, u, 8, 16, 0x66); + VDUP(vector3, q, uint, u, 16, 8, 0x55); + VDUP(vector3, q, uint, u, 32, 4, 0x44); + VDUP(vector3, q, float, f, 32, 4, 77.8f); + + TEST_VMLX(INSN_NAME, , int, s, 8, 8); + TEST_VMLX(INSN_NAME, , int, s, 16, 4); + TEST_VMLX(INSN_NAME, , int, s, 32, 2); + TEST_VMLX(INSN_NAME, , uint, u, 8, 8); + TEST_VMLX(INSN_NAME, , uint, u, 16, 4); + TEST_VMLX(INSN_NAME, , uint, u, 32, 2); + TEST_VMLX(INSN_NAME, , float, f, 32, 2); + TEST_VMLX(INSN_NAME, q, int, s, 8, 16); + TEST_VMLX(INSN_NAME, q, int, s, 16, 8); + TEST_VMLX(INSN_NAME, q, int, s, 32, 4); + TEST_VMLX(INSN_NAME, q, uint, u, 8, 16); + TEST_VMLX(INSN_NAME, q, uint, u, 16, 8); + TEST_VMLX(INSN_NAME, q, uint, u, 32, 4); + TEST_VMLX(INSN_NAME,
Re: Fix profile merging WRT speculative edges
> On 2015.01.20 at 21:04 +0100, Jan Hubicka wrote: > > this patch fixes ICE in ipa_merge_profiles on speculative edges. > > > > Bootstrapped/regtested x86_64-linux, committed. Also tested by Markus on > > Firefox build. > > This needs one additional fix. See below. > > Otherwise it will crash: > > lto1: internal compiler error: Segmentation fault > 0xa12f6f crash_signal > ../../gcc/gcc/toplev.c:381 > 0x88b190 ipa_merge_profiles(cgraph_node*, cgraph_node*) > ../../gcc/gcc/ipa-utils.c:637 > 0x603722 lto_cgraph_replace_node > ../../gcc/gcc/lto/lto-symtab.c:124 > 0x604cf3 lto_symtab_merge_symbols_1 > ../../gcc/gcc/lto/lto-symtab.c:619 > 0x604cf3 lto_symtab_merge_symbols() > ../../gcc/gcc/lto/lto-symtab.c:647 > 0x5fa52e read_cgraph_and_symbols > ../../gcc/gcc/lto/lto.c:3109 > 0x5fa52e lto_main() > ../../gcc/gcc/lto/lto.c:3436 > Please submit a full bug report, > with preprocessed source if appropriate. I see, thanks. It means that we do have comdats that diverge in their call statements. I guess we ought to just cancel merging in that case, otherwise the profile can be completely off :( I will need to write a call-target comparison then. So far the code assumes that if the number of BBs matches then the bodies match, and proceeds with merging. Martin, perhaps we can re-use some of the ipa-icf infrastructure here to quickly check that at least CFG shapes and call targets match? I will commit this hack for now; hopefully it is only an infrequent corner case. Honza
[PATCHv2][libatomic] Avoid misaligned atomic operations
My original patch only fixed libat_fetch_op; this one applies the same fix to libat_op_fetch, as well. When using word-wide CAS to emulate atomic fetch-and-op, addresses should be word-aligned to avoid exceptions on some targets. The problem manifested in a new port I'm working on as a failure in test gcc.dg/atomic/stdatomic-op-1.c, and I've confirmed that this patch fixes it. x86_64-unknown-linux still bootstraps, but that is admittedly of little significance, since it doesn't use these routines. 2015-01-09 Andrew Waterman * fop_n.c (libat_fetch_op): Align address to word boundary. (libat_op_fetch): Likewise. --- libatomic/fop_n.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/libatomic/fop_n.c b/libatomic/fop_n.c index 307184d..854d648 100644 --- a/libatomic/fop_n.c +++ b/libatomic/fop_n.c @@ -112,9 +112,9 @@ SIZE(C2(libat_fetch_,NAME)) (UTYPE *mptr, UTYPE opval, int smodel) pre_barrier (smodel); - wptr = (UWORD *)mptr; - shift = 0; - mask = -1; + wptr = (UWORD *)((uintptr_t)mptr & -WORDSIZE); + shift = (((uintptr_t)mptr % WORDSIZE) * CHAR_BIT) ^ SIZE(INVERT_MASK); + mask = SIZE(MASK) << shift; wopval = (UWORD)opval << shift; woldval = __atomic_load_n (wptr, __ATOMIC_RELAXED); @@ -136,9 +136,9 @@ SIZE(C3(libat_,NAME,_fetch)) (UTYPE *mptr, UTYPE opval, int smodel) pre_barrier (smodel); - wptr = (UWORD *)mptr; - shift = 0; - mask = -1; + wptr = (UWORD *)((uintptr_t)mptr & -WORDSIZE); + shift = (((uintptr_t)mptr % WORDSIZE) * CHAR_BIT) ^ SIZE(INVERT_MASK); + mask = SIZE(MASK) << shift; wopval = (UWORD)opval << shift; woldval = __atomic_load_n (wptr, __ATOMIC_RELAXED); -- 2.2.1
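For readers following along, the alignment trick in the patch can be sketched outside libatomic. This is only an illustration, not the libatomic code: fetch_add_u8 is a hypothetical helper, the word size is assumed to be 32 bits, and it relies on the GCC/Clang __atomic builtins. It rounds the address down to the containing word, computes the shift and mask for the byte, and loops on a word-wide CAS.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical helper (not part of libatomic): atomically add `val` to the
// byte at `p` using only a word-wide compare-and-swap, and return the byte's
// previous value.  Assumes the byte lives inside a naturally aligned 32-bit
// word and that the GCC/Clang __atomic builtins are available.
inline std::uint8_t fetch_add_u8(unsigned char *p, std::uint8_t val)
{
  using word = std::uint32_t;
  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);

  // Round the address down to the start of the containing word.
  word *wptr =
      reinterpret_cast<word *>(addr & ~std::uintptr_t(sizeof(word) - 1));

  // Position of the byte inside the word; mirrored on big-endian targets.
  unsigned byte_off = static_cast<unsigned>(addr % sizeof(word));
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  byte_off = sizeof(word) - 1 - byte_off;
#endif
  const unsigned shift = byte_off * CHAR_BIT;
  const word mask = word(0xff) << shift;

  word oldw = __atomic_load_n(wptr, __ATOMIC_RELAXED);
  word neww;
  do
    {
      // Update only the selected byte; keep the neighbouring bytes intact.
      const std::uint8_t cur =
          static_cast<std::uint8_t>((oldw & mask) >> shift);
      const std::uint8_t upd = static_cast<std::uint8_t>(cur + val);
      neww = (oldw & ~mask) | (word(upd) << shift);
    }
  while (!__atomic_compare_exchange_n(wptr, &oldw, neww, true,
                                      __ATOMIC_SEQ_CST, __ATOMIC_RELAXED));

  // On success oldw still holds the word as it was before the update.
  return static_cast<std::uint8_t>((oldw & mask) >> shift);
}
```

Because the CAS always targets the aligned word, a misaligned sub-word address never reaches the hardware atomic instruction, which is the property the patch is after.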
Re: [[ARM/AArch64][testsuite] 07/36] Add vmla_lane and vmls_lane tests.
On 19 January 2015 at 14:39, Marcus Shawcroft wrote: > On 13 January 2015 at 15:18, Christophe Lyon > wrote: >> >> * gcc.target/aarch64/advsimd-intrinsics/vmlX_lane.inc: New file. >> * gcc.target/aarch64/advsimd-intrinsics/vmla_lane.c: New file. >> * gcc.target/aarch64/advsimd-intrinsics/vmls_lane.c: New file. > > OK with Tejas' comment addressed. /Marcus Here is what I have committed (removed poly, int8 and int64 variants). Christophe. Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX_lane.inc === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX_lane.inc (revision 0) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmlX_lane.inc (working copy) @@ -0,0 +1,100 @@ +#define FNNAME1(NAME) exec_ ## NAME +#define FNNAME(NAME) FNNAME1(NAME) + +void FNNAME (INSN_NAME) (void) +{ +#define DECL_VMLX_LANE(VAR) \ + DECL_VARIABLE(VAR, int, 16, 4); \ + DECL_VARIABLE(VAR, int, 32, 2); \ + DECL_VARIABLE(VAR, uint, 16, 4); \ + DECL_VARIABLE(VAR, uint, 32, 2); \ + DECL_VARIABLE(VAR, float, 32, 2); \ + DECL_VARIABLE(VAR, int, 16, 8); \ + DECL_VARIABLE(VAR, int, 32, 4); \ + DECL_VARIABLE(VAR, uint, 16, 8); \ + DECL_VARIABLE(VAR, uint, 32, 4); \ + DECL_VARIABLE(VAR, float, 32, 4) + + /* vector_res = vmlx_lane(vector, vector2, vector3, lane), + then store the result. 
*/ +#define TEST_VMLX_LANE1(INSN, Q, T1, T2, W, N, N2, L) \ + VECT_VAR(vector_res, T1, W, N) = \ +INSN##Q##_lane_##T2##W(VECT_VAR(vector, T1, W, N), \ + VECT_VAR(vector2, T1, W, N), \ + VECT_VAR(vector3, T1, W, N2), \ + L); \ + vst1##Q##_##T2##W(VECT_VAR(result, T1, W, N), \ + VECT_VAR(vector_res, T1, W, N)) + +#define TEST_VMLX_LANE(INSN, Q, T1, T2, W, N, N2, V) \ + TEST_VMLX_LANE1(INSN, Q, T1, T2, W, N, N2, V) + + DECL_VMLX_LANE(vector); + DECL_VMLX_LANE(vector2); + DECL_VMLX_LANE(vector_res); + + DECL_VARIABLE(vector3, int, 16, 4); + DECL_VARIABLE(vector3, int, 32, 2); + DECL_VARIABLE(vector3, uint, 16, 4); + DECL_VARIABLE(vector3, uint, 32, 2); + DECL_VARIABLE(vector3, float, 32, 2); + + clean_results (); + + VLOAD(vector, buffer, , int, s, 16, 4); + VLOAD(vector, buffer, , int, s, 32, 2); + VLOAD(vector, buffer, , uint, u, 16, 4); + VLOAD(vector, buffer, , uint, u, 32, 2); + VLOAD(vector, buffer, q, int, s, 16, 8); + VLOAD(vector, buffer, q, int, s, 32, 4); + VLOAD(vector, buffer, q, uint, u, 16, 8); + VLOAD(vector, buffer, q, uint, u, 32, 4); + VLOAD(vector, buffer, , float, f, 32, 2); + VLOAD(vector, buffer, q, float, f, 32, 4); + + VDUP(vector2, , int, s, 16, 4, 0x55); + VDUP(vector2, , int, s, 32, 2, 0x55); + VDUP(vector2, , uint, u, 16, 4, 0x55); + VDUP(vector2, , uint, u, 32, 2, 0x55); + VDUP(vector2, , float, f, 32, 2, 55.3f); + VDUP(vector2, q, int, s, 16, 8, 0x55); + VDUP(vector2, q, int, s, 32, 4, 0x55); + VDUP(vector2, q, uint, u, 16, 8, 0x55); + VDUP(vector2, q, uint, u, 32, 4, 0x55); + VDUP(vector2, q, float, f, 32, 4, 55.8f); + + VDUP(vector3, , int, s, 16, 4, 0xBB); + VDUP(vector3, , int, s, 32, 2, 0xBB); + VDUP(vector3, , uint, u, 16, 4, 0xBB); + VDUP(vector3, , uint, u, 32, 2, 0xBB); + VDUP(vector3, , float, f, 32, 2, 11.34f); + + /* Choose lane arbitrarily. 
*/ + TEST_VMLX_LANE(INSN_NAME, , int, s, 16, 4, 4, 2); + TEST_VMLX_LANE(INSN_NAME, , int, s, 32, 2, 2, 1); + TEST_VMLX_LANE(INSN_NAME, , uint, u, 16, 4, 4, 2); + TEST_VMLX_LANE(INSN_NAME, , uint, u, 32, 2, 2, 1); + TEST_VMLX_LANE(INSN_NAME, , float, f, 32, 2, 2, 1); + TEST_VMLX_LANE(INSN_NAME, q, int, s, 16, 8, 4, 3); + TEST_VMLX_LANE(INSN_NAME, q, int, s, 32, 4, 2, 1); + TEST_VMLX_LANE(INSN_NAME, q, uint, u, 16, 8, 4, 2); + TEST_VMLX_LANE(INSN_NAME, q, uint, u, 32, 4, 2, 1); + TEST_VMLX_LANE(INSN_NAME, q, float, f, 32, 4, 2, 1); + + CHECK(TEST_MSG, int, 16, 4, PRIx16, expected, ""); + CHECK(TEST_MSG, int, 32, 2, PRIx32, expected, ""); + CHECK(TEST_MSG, uint, 16, 4, PRIx16, expected, ""); + CHECK(TEST_MSG, uint, 32, 2, PRIx32, expected, ""); + CHECK_FP(TEST_MSG, float, 32, 2, PRIx32, expected, ""); + CHECK(TEST_MSG, int, 16, 8, PRIx16, expected, ""); + CHECK(TEST_MSG, int, 32, 4, PRIx32, expected, ""); + CHECK(TEST_MSG, uint, 16, 8, PRIx16, expected, ""); + CHECK(TEST_MSG, uint, 32, 4, PRIx32, expected, ""); + CHECK_FP(TEST_MSG, float, 32, 4, PRIx32, expected, ""); +} + +int main (void) +{ + FNNAME (INSN_NAME) (); + return 0; +} Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmla_lane.c === --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmla_lane.c (revision 0) +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmla_lane.c (working copy) @@ -0,0 +1,23 @@ +#include +#include "arm-neon-ref.h" +#include "compute-ref-data.h" + +#define INSN_NAME vmla +#define TEST_MSG "VMLA_LANE" + +/* Expected results. */ +VECT_VAR_DECL(expected,int,16,4) [] = { 0x3e07,
[patch, fortran] Fix PR 57023, packing for some array temporaries
Hello world, this patch fixes a longstanding regression where an upper array bound and the upper bound of an array section compared equal (using gfc_dep_compare_expr), but they weren't because the value of the upper bound had been changed in the meantime. This led gfc_full_array_ref_p to erroneously return true, which led to the array not being packed and thus wrong code. This patch takes the approach that any array bound which contains a dummy variable which is not INTENT(IN) may be changed by the user, and that we cannot be assured that it will not be changed. Anybody who is sensible should be using INTENT(IN) for array bounds, anyway :-) So, here is the patch. Regression-tested. OK for all affected branches? Thomas 2015-01-20 Thomas Koenig PR fortran/57023 * dependency.c (callback_dummy_intent_not_in): New function. (dummy_intent_not_in): New function. (gfc_full_array_ref_p): Use dummy_intent_not_in. 2015-01-20 Thomas Koenig PR fortran/57023 * gfortran.dg/internal_pack_15.f90: New test. Index: dependency.c === --- dependency.c (Revision 219193) +++ dependency.c (Arbeitskopie) @@ -1853,11 +1853,40 @@ gfc_check_element_vs_element (gfc_ref *lref, gfc_r return GFC_DEP_EQUAL; } +/* Callback function for checking if an expression depends on a + dummy variable which is any other than INTENT(IN). */ +static int +callback_dummy_intent_not_in (gfc_expr **ep, + int *walk_subtrees ATTRIBUTE_UNUSED, + void *data ATTRIBUTE_UNUSED) +{ + gfc_expr *e = *ep; + + if (e->expr_type == EXPR_VARIABLE && e->symtree + && e->symtree->n.sym->attr.dummy) +return e->symtree->n.sym->attr.intent != INTENT_IN; + else +return 0; +} + +/* Auxiliary function to check if subexpressions have dummy variables which + are not intent(in). +*/ + +static bool +dummy_intent_not_in (gfc_expr **ep) +{ + return gfc_expr_walker (ep, callback_dummy_intent_not_in, NULL); +} + /* Determine if an array ref, usually an array section specifies the entire array. 
In addition, if the second, pointer argument is provided, the function will return true if the reference is - contiguous; eg. (:, 1) gives true but (1,:) gives false. */ + contiguous; eg. (:, 1) gives true but (1,:) gives false. + If one of the bounds depends on a dummy variable which is + not INTENT(IN), also return false, because the user may + have changed the variable. */ bool gfc_full_array_ref_p (gfc_ref *ref, bool *contiguous) @@ -1921,7 +1950,8 @@ gfc_full_array_ref_p (gfc_ref *ref, bool *contiguo && (!ref->u.ar.as || !ref->u.ar.as->lower[i] || gfc_dep_compare_expr (ref->u.ar.start[i], - ref->u.ar.as->lower[i]))) + ref->u.ar.as->lower[i]) + || dummy_intent_not_in (&ref->u.ar.start[i]))) lbound_OK = false; /* Check the upper bound. */ if (ref->u.ar.end[i] @@ -1928,7 +1958,8 @@ gfc_full_array_ref_p (gfc_ref *ref, bool *contiguo && (!ref->u.ar.as || !ref->u.ar.as->upper[i] || gfc_dep_compare_expr (ref->u.ar.end[i], - ref->u.ar.as->upper[i]))) + ref->u.ar.as->upper[i]) + || dummy_intent_not_in (&ref->u.ar.end[i]))) ubound_OK = false; /* Check the stride. */ if (ref->u.ar.stride[i] ! { dg-do run } ! { dg-options "-Warray-temporaries" } ! PR 57023 ! This used to cause wrong packing because a(1:n,1:n) was ! assumed to be a full array. module mymod implicit none contains subroutine foo1(a,n) integer, dimension(n,n), intent(inout) :: a integer :: n n = n - 1 call baz(a(1:n,1:n),n) ! { dg-warning "array temporary" } end subroutine foo1 subroutine foo2(a,n) integer, dimension(n,n), intent(inout) :: a integer :: n call decrement(n) call baz(a(1:n,1:n),n) ! { dg-warning "array temporary" } end subroutine foo2 subroutine foo3(a,n) integer, dimension(n,n), intent(inout) :: a integer :: n, m m = n - 1 call baz(a(1:m,1:m),m) ! 
{ dg-warning "array temporary" } end subroutine foo3 subroutine foo4(a,n) integer, dimension(n,n), intent(inout) :: a integer, intent(in) :: n a(1:n,1:n) = 1 end subroutine foo4 subroutine baz(a,n) integer, dimension(n,n), intent(inout) :: a integer, intent(in) :: n a = 1 end subroutine baz subroutine decrement(n) integer, intent(inout) :: n n = n - 1 end subroutine decrement end module mymod program main use mymod implicit none integer, dimension(5,5) :: a, b integer :: n b = 0 b(1:4,1:4) = 1 n = 5 a = 0 call foo1(a,n) if (any(a /= b)) call abort n = 5 a = 0 call foo2(a,n) if (any(a /= b)) call abort n = 5 a = 0 call foo3(a,n) if (any(a /= b)) call abort n = 5 a = 0 call foo4(a,n) if (any(a /= 1)) call abort end program main
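The check that the patch adds can be modelled on a toy expression tree. The sketch below is not gfortran code: Expr, Intent, make_var and make_op are hypothetical stand-ins for gfc_expr and the symbol attributes, and a plain recursive walk replaces gfc_expr_walker. It only shows the rule being applied: a bound is treated as possibly changed as soon as any variable in it is a dummy argument that is not INTENT(IN).

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, much simplified stand-ins for gfortran's expression nodes.
enum class Intent { Unknown, In, Out, InOut };

struct Expr
{
  bool is_variable = false;                     // EXPR_VARIABLE in gfortran
  bool is_dummy = false;                        // attr.dummy
  Intent intent = Intent::Unknown;              // attr.intent
  std::vector<std::unique_ptr<Expr>> operands;  // subexpressions
};

// Build a variable node.
inline std::unique_ptr<Expr> make_var(bool is_dummy, Intent intent)
{
  std::unique_ptr<Expr> e(new Expr());
  e->is_variable = true;
  e->is_dummy = is_dummy;
  e->intent = intent;
  return e;
}

// Build a binary operation node such as "n - 1".
inline std::unique_ptr<Expr> make_op(std::unique_ptr<Expr> lhs,
                                     std::unique_ptr<Expr> rhs)
{
  std::unique_ptr<Expr> e(new Expr());
  e->operands.push_back(std::move(lhs));
  e->operands.push_back(std::move(rhs));
  return e;
}

// True if any variable in the tree is a dummy argument that is not
// declared INTENT(IN), i.e. one the callee may legally modify.
inline bool dummy_intent_not_in(const Expr &e)
{
  if (e.is_variable && e.is_dummy && e.intent != Intent::In)
    return true;
  for (const auto &op : e.operands)
    if (op && dummy_intent_not_in(*op))
      return true;
  return false;
}
```

In terms of the test case, the bound n in foo1 and foo2 is a dummy without INTENT(IN), so the section is not treated as the full array, while the INTENT(IN) n in foo4 is left alone.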
Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1
> Seems like the thread might have died down, so just wanted to ping it. > As Marcus says, this is holding up other patches so it'd be good to get > something in soon. Would it be OK to commit the original patch or should > we wait? Yes, go ahead, but add a FIXME or ??? comment. -- Eric Botcazou
Re: Fix 59828 - Broken assembly on ppc* with two -mcpu= options
On Tue, Jan 20, 2015 at 09:26:12AM -0500, David Edelsohn wrote: > On Tue, Jan 20, 2015 at 12:41 AM, Alan Modra wrote: > > On Mon, Jan 19, 2015 at 10:43:29PM -0500, David Edelsohn wrote: > >> On Fri, Jan 17, 2014 at 10:58 PM, Alan Modra wrote: > >> > This patch cures PR59828 by translating all the -mcpu options at once, > >> > in order, to their equivalent assembler -m options by using a new spec > >> > function. In the process this removes some duplication. > >> > >> ASM_CPU_SPEC is too fragile a mechanism. I would much prefer to > >> expand on the ".machine" directive that I added to > >> rs6000_file_start(). The initial implementation explicitly avoids > >> .machine when -mcpu= or --with-cpu= is present as a conservative > >> start. > >> > >> It seems much better to select a .machine directive based on the > >> actual target ISA flag bits enabled than translating CPU command line > >> options to ASM options. Patches to replace ASM_CPU_SPEC with .machine > >> and expand functionality for AIX are welcome. > > > > This might make sense when looking only at gcc, but when considering > > the whole toolchain I think you'll run into difficulty. gas and other > > powerpc assemblers have always been invoked with -m options to select > > the cpu, so if you do away with ASM_CPU_SPEC and rely on .machine then > > you will be exercising the assembler in a new way. I am sure that > > this will not work for all powerpc assemblers currently in use. > > It is stressing .machine more than that feature has been in the past, > but it is functionality that is supposed to work. .machine already has > been stressed more with IFUNC pushing and popping ISAs. > > Are you concerned about the fundamental functionality of the pseudo-op > or a particular GAS release missing support for a particular ISA? AIX > supports .machine, but I think that it expects slightly different > processor names. I am not certain about LLVM-AS, but it normally is > not fed external assembly language files. 
I'm concerned about initialisation that might happen based on a -m option that doesn't happen with .machine. A quick look over the gas source showed we currently have at least one case like this: ppc_dwarf2_line_min_insn_length is only set from command line -m options. This means that if you want debug with VLE insns, you cannot currently invoke gas without -mvle. ".machine vle" by itself won't work. I think there may be other similar problems with setting the bfd arch/mach pair. -- Alan Modra Australia Development Lab, IBM
[PATCH] pr 64076 - tolerate different definitions of symbols in lto
From: Trevor Saunders Hi, Same patch as before, but now with a test case. I checked this fails without the patch and passes with it, ok? Trev gcc/ * ipa-visibility.c (update_visibility_by_resolution_info): Only assert when not in lto mode. --- gcc/ipa-visibility.c | 18 +- gcc/testsuite/g++.dg/lto/pr64076.H | 20 gcc/testsuite/g++.dg/lto/pr64076_0.C | 10 ++ gcc/testsuite/g++.dg/lto/pr64076_1.C | 5 + 4 files changed, 48 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/g++.dg/lto/pr64076.H create mode 100644 gcc/testsuite/g++.dg/lto/pr64076_0.C create mode 100644 gcc/testsuite/g++.dg/lto/pr64076_1.C diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c index 71894af..0791a1c 100644 --- a/gcc/ipa-visibility.c +++ b/gcc/ipa-visibility.c @@ -425,11 +425,19 @@ update_visibility_by_resolution_info (symtab_node * node) if (node->same_comdat_group) for (symtab_node *next = node->same_comdat_group; next != node; next = next->same_comdat_group) - gcc_assert (!next->externally_visible - || define == (next->resolution == LDPR_PREVAILING_DEF_IRONLY - || next->resolution == LDPR_PREVAILING_DEF - || next->resolution == LDPR_UNDEF - || next->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP)); + { + if (!next->externally_visible) + continue; + + bool same_def + = define == (next->resolution == LDPR_PREVAILING_DEF_IRONLY + || next->resolution == LDPR_PREVAILING_DEF + || next->resolution == LDPR_UNDEF + || next->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP); + gcc_assert (in_lto_p || same_def); + if (!same_def) + return; + } if (node->same_comdat_group) for (symtab_node *next = node->same_comdat_group; diff --git a/gcc/testsuite/g++.dg/lto/pr64076.H b/gcc/testsuite/g++.dg/lto/pr64076.H new file mode 100644 index 000..6afe37a --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr64076.H @@ -0,0 +1,20 @@ +struct Base { + virtual void f() = 0; +}; + +struct X : public Base { }; +struct Y : public Base { }; +struct Z : public Base { }; +struct T : public Base { }; + +struct S : public 
X, public Y, public Z +#ifdef XXX +, public T +#endif +{ + void f() +#ifdef XXX + { } +#endif + ; +}; diff --git a/gcc/testsuite/g++.dg/lto/pr64076_0.C b/gcc/testsuite/g++.dg/lto/pr64076_0.C new file mode 100644 index 000..fb9b060 --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr64076_0.C @@ -0,0 +1,10 @@ +// { dg-lto-do link } + +#define XXX +#include "pr64076.H" + +int main() +{ + S s; + return 0; +} diff --git a/gcc/testsuite/g++.dg/lto/pr64076_1.C b/gcc/testsuite/g++.dg/lto/pr64076_1.C new file mode 100644 index 000..4bd0081 --- /dev/null +++ b/gcc/testsuite/g++.dg/lto/pr64076_1.C @@ -0,0 +1,5 @@ +// { dg-options -fno-lto } + +#include "pr64076.H" + +void S::f() { } -- 2.1.4
Re: [PING] [PATCH] Fix parameters of __tsan_vptr_update
On Mon, Jan 19, 2015 at 11:09 PM, Bernd Edlinger wrote: > > Hi, > > On Mon, 19 Jan 2015 18:49:21, Konstantin Serebryany wrote: >> >> [text-only] >> >> On Mon, Jan 19, 2015 at 7:42 AM, Mike Stump wrote: >>> On Jan 19, 2015, at 12:43 AM, Dmitry Vyukov wrote: I can't really make my mind on this. I would mildly prefer sleep's (if they work reliably!). >>> >>> Let me state it more forcefully. >> You don't have to convince us here. >> I'd love to get rid of sleep calls in the tsan test suite -- they are >> a minor but a constant annoyance. >> But I also want to keep the tests *very simple*, i.e. >> 1. Single file w/o any non-system includes, no linking of extra >> libraries/objects >> 2. Not too much extra code. (ideally, 1 line for init, 1 line for >> "signal", 1 line for "wait") >> 3. Strictly posix or c++11 (unless we are testing something specific) >> >> Your idea with barrier_wait/dlsym sounds interesting, but I can't see >> the code in this mail thread. >> What do I miss? >> > > We discussed two alternatives to sleep: > > 1. step function, optionally with sched_yield to make it somewhat less busy > waiting: > __attribute__((no_sanitize_thread)) > void step (int i) > { >while (__atomic_load_n (&serial, __ATOMIC_ACQUIRE) != i - 1) > sched_yield(); >__atomic_store_n (&serial, i, __ATOMIC_RELEASE); > } This will not work: tsan will still instrument atomics in a function with __attribute__((no_sanitize_thread)) (This applies to both clang and gcc variants) > 2. tsan-invisible barriers: > > cat tsan_barrier.h > /* TSAN-invisible barriers. Link with -ldl. */ > #include <pthread.h> > #include <dlfcn.h> > > static __typeof(pthread_barrier_wait) *barrier_wait; > > static > void barrier_init (pthread_barrier_t *barrier, unsigned count) > { > void *h = dlopen ("libpthread.so.0", RTLD_LAZY); > barrier_wait = (__typeof (pthread_barrier_wait) *) > dlsym (h, "pthread_barrier_wait"); > pthread_barrier_init (barrier, NULL, count); > } So, we will have a single extra header file, but no extra .c files. 
This sounds tolerable. I am not sure how portable that is, but today's tsan works only on modern Linux anyway. If you want to contribute the code, please send a patch to upstream LLVM: we may not be able to take the code from gcc source tree to LLVM, the other direction is easy. Or please give us some time to fix the tests in upstream ourselves and then port them to the gcc test suite. Dmitry, wdyt? > > > We preferred the second alternative, because it does not do busy waiting. > We include this header file in every positive test case and link with -ldl. -ldl is not required, since -fsanitize=thread already adds that. --kcc > > Bernd. > >
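For reference, the barrier idea from the quoted header can be written as a small self-contained sketch. It follows the quoted code (looking up pthread_barrier_wait through dlsym on "libpthread.so.0"), but barrier_wait_ptr, the barrier_wait wrapper and the fallback to the directly linked symbol are assumptions added here for illustration; they are not part of the header under discussion. It assumes Linux with glibc.

```cpp
#include <dlfcn.h>
#include <pthread.h>

// Pointer to the barrier wait function obtained via dlsym.  The name is
// hypothetical; the quoted header calls its pointer barrier_wait.
using barrier_wait_fn = int (*)(pthread_barrier_t *);
static barrier_wait_fn barrier_wait_ptr = nullptr;

// Look up pthread_barrier_wait by name, as in the quoted header, and
// initialise the barrier for `count` threads.
inline void barrier_init(pthread_barrier_t *barrier, unsigned count)
{
  void *h = dlopen("libpthread.so.0", RTLD_LAZY);
  void *sym = h ? dlsym(h, "pthread_barrier_wait") : nullptr;
  // Fallback (an assumption, not in the quoted header): if the lookup
  // fails, use the directly linked function so the sketch still runs.
  barrier_wait_ptr = sym ? reinterpret_cast<barrier_wait_fn>(sym)
                         : &pthread_barrier_wait;
  pthread_barrier_init(barrier, nullptr, count);
}

// Wait on the barrier through the looked-up pointer.
inline int barrier_wait(pthread_barrier_t *barrier)
{
  return barrier_wait_ptr(barrier);
}
```

A test would call barrier_init once in main and barrier_wait at each synchronisation point, which keeps the per-test overhead close to the "one line for init, one line for wait" goal stated earlier in the thread.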
Re: [Patch, libstdc++/64680] Conform the standard regex interface
On Tue, Jan 20, 2015 at 9:04 AM, Paolo Carlini wrote: > When we end up doing this to save run time, let's at least add in a comment > the PR #, like > > // PR libstdc++/64680 > > before test02(). Certainly. -- Regards, Tim Shen commit a150869847b7b02f57873fc18853b144a61c8880 Author: timshen Date: Tue Jan 20 00:20:14 2015 -0800 PR libstdc++/64680 * include/bits/regex.h (basic_regex<>::basic_regex, basic_regex<>::operator=, basic_regex<>::imbue): Conform the standard interface. * testsuite/28_regex/basic_regex/assign/char/cstring.cc: New testcase. diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h index 6de883a..07c78b7 100644 --- a/libstdc++-v3/include/bits/regex.h +++ b/libstdc++-v3/include/bits/regex.h @@ -442,7 +442,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 */ explicit basic_regex(const _Ch_type* __p, flag_type __f = ECMAScript) - : basic_regex(__p, __p + _Rx_traits::length(__p), __f) + : basic_regex(__p, __p + char_traits<_Ch_type>::length(__p), __f) { } /** @@ -553,7 +553,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 */ basic_regex& operator=(const _Ch_type* __p) - { return this->assign(__p, flags()); } + { return this->assign(__p); } + + /** + * @brief Replaces a regular expression with a new one constructed from + * an initializer list. + * + * @param __l The initializer list. + * + * @throws regex_error if @p __l is not a valid regular expression. 
+ */ + basic_regex& + operator=(initializer_list<_Ch_type> __l) + { return this->assign(__l.begin(), __l.end()); } /** * @brief Replaces a regular expression with a new one constructed from @@ -564,7 +576,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 template basic_regex& operator=(const basic_string<_Ch_type, _Ch_traits, _Alloc>& __s) - { return this->assign(__s, flags()); } + { return this->assign(__s); } // [7.8.3] assign /** @@ -712,7 +724,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11 imbue(locale_type __loc) { std::swap(__loc, _M_loc); - _M_automaton = nullptr; + _M_automaton.reset(); return __loc; } diff --git a/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc b/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc index 19528b6..6794fff 100644 --- a/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc +++ b/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc @@ -36,9 +36,19 @@ void test01() re.assign(cs); } +// basic_regex::operator=() resets flags. libstdc++/64680 +void test02() +{ + bool test __attribute__((unused)) = true; + + std::regex re("[[:alnum:]]", std::regex_constants::basic); + re = "\\w+"; +} + int main() { test01(); + test02(); return 0; }
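The user-visible effect of the operator= changes can be checked with a few lines of standard C++. This is a sketch of the expected behaviour, not part of the patch; the two function names are made up for the example. It assumes a standard library in which assignment from a C string is equivalent to assign(p) with the default ECMAScript flags, and in which operator= accepts an initializer list, as the patch provides.

```cpp
#include <regex>

// Assigning a C string with operator= should drop the previous flags
// (basic) and fall back to the default ECMAScript grammar.
inline bool assignment_resets_flags()
{
  std::regex re("[[:alnum:]]", std::regex_constants::basic);
  re = "\\w+";
  return re.flags() == std::regex_constants::ECMAScript
         && std::regex_match("abc123", re);
}

// Assignment from an initializer list of characters, using the
// operator= overload added by the patch.
inline bool initializer_list_assignment_works()
{
  std::regex re;
  re = {'a', '+'};
  return std::regex_match("aaa", re) && !std::regex_match("b", re);
}
```

Before the fix, operator= forwarded flags() to assign, so the first function would have kept the basic grammar.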
Re: [patch, Fortran] PR61933 Inquire on Internal Units
On 01/19/2015 11:28 PM, Tobias Burnus wrote: Hi Jerry, hi all, sorry for the slow patch review. I also still want to review your other inquire patch. Jerry DeLisle wrote: The fundamental problem: if the variable containing the unit number in an INQUIRE statement is of type KIND greater than 4 and the value is outside the range of a KIND=4 we cannot test for it within the run-time library. Unit numbers are passed to the run-time in the IOPARM structures as a KIND=4. KIND=8 values are cast into the KIND=4. The test case gfortran.dg/negative_unit_int8.f illustrates a case where a bogus unit number can get passed to the library. Regression tested on x86-64 and Joost's case in the PR now works as expected. OK for trunk? Mostly OK, however, some remarks are below. --- snip--- I don't know where this number is used, but it really should be a #define; if it is shared with libgfortran, it belongs to libgfortran.h. You wrote that -1 is also reserved and used; is the -1 somewhere defined? [Disclaimer: I have only browsed the other patch and do not recall whether it adds, handles or #defines -1 - or whether -1 is already defined somewhere.] I have added the following to libgfortran.h and used them (see patch) /* Special unit numbers used to convey certain conditions. Numbers -3 thru -9 available. NEWUNIT values start at -10. */ #define GFC_INTERNAL_UNIT -1 #define GFC_INVALID_UNIT -2 --- snip --- The conditions could be combined with a fold_build2_loc(...,TRUTH_AND_EXPR,...). I have combined the conditions using TRUTH_OR_EXPR which is what we want. I also rolled the one helper function I had into the caller since I now only build one block in the combined condition. The new -fdump-tree-original result looks good: inquire_parm.4.common.unit = (integer(kind=4)) i; D.3393 = i; if (D.3393 < 0 || D.3393 > 2147483647) { inquire_parm.4.common.unit = -2; } _gfortran_st_inquire (&inquire_parm.4); The updated patch is attached. Regression tested completely again. OK for Trunk? 
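The range check shown in the dump above amounts to the following, written here as a stand-alone sketch rather than as the tree-building code in trans-io.c. The function name narrow_unit_for_inquire and the constant kInvalidUnit are illustrative; only the value -2 (GFC_INVALID_UNIT) and the two comparisons come from the message.

```cpp
#include <cstdint>
#include <limits>

// Mirrors GFC_INVALID_UNIT from the quoted libgfortran.h hunk.
constexpr std::int32_t kInvalidUnit = -2;

// Hypothetical helper: narrow a KIND=8 unit number to the KIND=4 field
// passed to the run-time library, substituting the invalid-unit marker
// when the value is negative or does not fit in 32 bits.
inline std::int32_t narrow_unit_for_inquire(std::int64_t unit)
{
  if (unit < 0 || unit > std::numeric_limits<std::int32_t>::max())
    return kInvalidUnit;
  return static_cast<std::int32_t>(unit);
}
```

With the marker in place the library sees a value it can recognise as invalid, instead of whatever the truncated low 32 bits happen to be.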
Thanks for the review.

Regards,

Jerry

Index: gcc/fortran/libgfortran.h
===
--- gcc/fortran/libgfortran.h (revision 219925)
+++ gcc/fortran/libgfortran.h (working copy)
@@ -68,6 +68,10 @@
                 | GFC_RTCHECK_RECURSION | GFC_RTCHECK_DO \
                 | GFC_RTCHECK_POINTER | GFC_RTCHECK_MEM)

+/* Special unit numbers used to convey certain conditions. Numbers -3
+   thru -9 available. NEWUNIT values start at -10. */
+#define GFC_INTERNAL_UNIT -1
+#define GFC_INVALID_UNIT -2

 /* Possible values for the CONVERT I/O specifier. */
 /* Keep in sync with GFC_FLAG_CONVERT_* in gcc/flags.h. */
Index: gcc/fortran/trans-io.c
===
--- gcc/fortran/trans-io.c (revision 219925)
+++ gcc/fortran/trans-io.c (working copy)
@@ -512,7 +512,37 @@
    st_parameter_XXX structure. This is a pass by value. */

 static unsigned int
-set_parameter_value (stmtblock_t *block, bool has_iostat, tree var,
+set_parameter_value (stmtblock_t *block, tree var, enum iofield type,
+                     gfc_expr *e)
+{
+  gfc_se se;
+  tree tmp;
+  gfc_st_parameter_field *p = &st_parameter_field[type];
+  tree dest_type = TREE_TYPE (p->field);
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_val (&se, e);
+
+  se.expr = convert (dest_type, se.expr);
+  gfc_add_block_to_block (block, &se.pre);
+
+  if (p->param_type == IOPARM_ptype_common)
+    var = fold_build3_loc (input_location, COMPONENT_REF,
+                           st_parameter[IOPARM_ptype_common].type,
+                           var, TYPE_FIELDS (TREE_TYPE (var)), NULL_TREE);
+
+  tmp = fold_build3_loc (input_location, COMPONENT_REF, dest_type, var,
+                         p->field, NULL_TREE);
+  gfc_add_modify (block, tmp, se.expr);
+  return p->mask;
+}
+
+
+/* Similar to set_parameter_value except generate runtime
+   error checks. */
+
+static unsigned int
+set_parameter_value_chk (stmtblock_t *block, bool has_iostat, tree var,
                      enum iofield type, gfc_expr *e)
 {
   gfc_se se;
@@ -550,7 +580,6 @@
       gfc_trans_io_runtime_check (has_iostat, cond, var, LIBERROR_BAD_UNIT,
                                   "Unit number in I/O statement too large",
                                   &se.pre);
-
     }

   se.expr = convert (dest_type, se.expr);
@@ -568,6 +597,69 @@
 }


+/* Build code to check the unit range if KIND=8 is used. Similar to
+   set_parameter_value_chk but we do not generate error calls for
+   inquire statements. */
+
+static unsigned int
+set_parameter_value_inquire (stmtblock_t *block, tree var,
+                             enum iofield type, gfc_expr *e)
+{
+  gfc_se se;
+  gfc_st_parameter_field *p = &st_parameter_field[type];
+  tree dest_type = TREE_TYPE (p->field);
+
+  gfc_init_se (&se, NULL);
+  gfc_conv_expr_val (&se, e);
+
+  /* If we're inquiring on a UNIT number, we need to check to make
+     sure it exists for larger than kind = 4. */
+  if (type == IOPARM_common_unit && e->ts.kind > 4)
+    {
+      stmtblock_t newblock;
+      tree cond1
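As a rough illustration of the range check that the front end emits for a KIND=8 unit in an INQUIRE statement, the sketch below mirrors the -fdump-tree-original output quoted in the message in plain C++. The function name narrow_inquire_unit is hypothetical and is not part of gfortran; only the GFC_INVALID_UNIT value (-2) and the comparison itself are taken from the patch.

```cpp
#include <cstdint>
#include <limits>

// Value taken from the patch's addition to libgfortran.h.
constexpr std::int32_t GFC_INVALID_UNIT = -2;

// Hypothetical helper (not part of gfortran): narrows a KIND=8 unit number
// to the KIND=4 field of the I/O parameter block, substituting
// GFC_INVALID_UNIT when the value is negative or does not fit in 32 bits.
std::int32_t narrow_inquire_unit(std::int64_t unit)
{
  if (unit < 0 || unit > std::numeric_limits<std::int32_t>::max())
    return GFC_INVALID_UNIT;
  return static_cast<std::int32_t>(unit);
}
```

With this sketch, a unit such as 10 passes through unchanged, while -1 or 2147483648 both come back as GFC_INVALID_UNIT, so the run-time library never sees a silently truncated value.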
Re: [Patch, libstdc++/64680] Conform the standard regex interface
On Tue, Jan 20, 2015 at 8:05 PM, Tim Shen wrote:
> Certainly.

Removed the dg-do compile flag, so that the testcase is actually run.

--
Regards,
Tim Shen

commit fc08df0cbf03ad571414c5551b6eb014c27efe4a
Author: timshen
Date:   Tue Jan 20 00:20:14 2015 -0800

    PR libstdc++/64680
    * include/bits/regex.h (basic_regex<>::basic_regex,
    basic_regex<>::operator=, basic_regex<>::imbue): Conform the
    standard interface.
    * testsuite/28_regex/basic_regex/assign/char/cstring.cc: New testcase.

diff --git a/libstdc++-v3/include/bits/regex.h b/libstdc++-v3/include/bits/regex.h
index 6de883a..2b09da6 100644
--- a/libstdc++-v3/include/bits/regex.h
+++ b/libstdc++-v3/include/bits/regex.h
@@ -442,7 +442,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
        */
       explicit
       basic_regex(const _Ch_type* __p, flag_type __f = ECMAScript)
-      : basic_regex(__p, __p + _Rx_traits::length(__p), __f)
+      : basic_regex(__p, __p + char_traits<_Ch_type>::length(__p), __f)
       { }

       /**
@@ -553,7 +553,19 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
        */
       basic_regex&
       operator=(const _Ch_type* __p)
-      { return this->assign(__p, flags()); }
+      { return this->assign(__p); }
+
+      /**
+       * @brief Replaces a regular expression with a new one constructed from
+       * an initializer list.
+       *
+       * @param __l The initializer list.
+       *
+       * @throws regex_error if @p __l is not a valid regular expression.
+       */
+      basic_regex&
+      operator=(initializer_list<_Ch_type> __l)
+      { return this->assign(__l.begin(), __l.end()); }

       /**
        * @brief Replaces a regular expression with a new one constructed from
@@ -564,7 +576,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       template<typename _Ch_traits, typename _Alloc>
         basic_regex&
         operator=(const basic_string<_Ch_type, _Ch_traits, _Alloc>& __s)
-        { return this->assign(__s, flags()); }
+        { return this->assign(__s); }

       // [7.8.3] assign
       /**
@@ -644,7 +656,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
                flag_type __flags = ECMAScript)
         {
           return this->assign(basic_regex(__s.data(), __s.data() + __s.size(),
-                                          _M_loc, _M_flags));
+                                          _M_loc, __flags));
         }

       /**
@@ -712,7 +724,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       imbue(locale_type __loc)
       {
         std::swap(__loc, _M_loc);
-        _M_automaton = nullptr;
+        _M_automaton.reset();
         return __loc;
       }

diff --git a/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc b/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc
index 19528b6..445006b 100644
--- a/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc
+++ b/libstdc++-v3/testsuite/28_regex/basic_regex/assign/char/cstring.cc
@@ -1,5 +1,4 @@
-// { dg-do compile }
-// { dg-options "-std=gnu++11" }
+// { dg-options "-std=c++11" }

 // 2009-06-05 Stephen M. Webb
 //
@@ -36,9 +35,19 @@ void test01()
   re.assign(cs);
 }

+// basic_regex::operator=() resets flags. libstdc++/64680
+void test02()
+{
+  bool test __attribute__((unused)) = true;
+
+  std::regex re("[[:alnum:]]", std::regex_constants::basic);
+  re = "\\w+";
+}
+
 int
 main()
 {
   test01();
+  test02();
   return 0;
 }
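To make the intent of test02 concrete, here is a small standalone sketch of the behaviour the patch aims for: assigning a C string with operator= should act like assign(p) with the default ECMAScript flags rather than reusing the flags the regex was constructed with. The function name assignment_resets_flags is made up for this illustration, and the result assumes a standard library whose basic_regex::operator= follows the standard, as libstdc++ is meant to after this change.

```cpp
#include <regex>

// Hypothetical helper: returns true when assigning a C string to a regex
// that was built with the POSIX basic grammar leaves it using ECMAScript.
bool assignment_resets_flags()
{
  std::regex re("[[:alnum:]]", std::regex_constants::basic);
  re = "\\w+";  // operator=(const char*) should behave like assign("\\w+")

  // Under ECMAScript, "\w+" matches a run of word characters.
  return re.flags() == std::regex_constants::ECMAScript
         && std::regex_match("abc123", re);
}
```

Before the fix, operator= passed flags() through to assign, so the new pattern would have been compiled with the old basic grammar; a check like the one above is one way to observe the difference at run time.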
[PATCH] [PR target/59946] Avoid problems with addressing modes with -mpcrel
Well, I've got a m68k tree handy, so might as well continue pushing through these low priority bugs.

The problem here is that the m68k doesn't support pc-rel addressing modes in both slots of a comparison instruction. And while the operand predicates and constraints try to do the right thing, it's a rather convoluted mess because in some circumstances the output template will use %1,%0 and in others %0,%1.

It seems *far* easier to just punt this. -mpcrel isn't that critical for the m68k. And the m68k certainly isn't a priority these days.

I reviewed the other uses of pc-relative addressing and I think we're OK. This is really an issue that just affects the comparison patterns.

I built the stage1-stage3 compilers with this change. The stage3 library build isn't complete, but I don't expect any issues. I confirmed that with -mpcrel -m68000 we'll load the address of the object into a temporary, then indirect through that for the comparison. So the right things are happening here.

Installed on the trunk. No plans to backport.

commit db4a2c00a2dfaeb2ee5ed02e7dcdbd20479ebfa1
Author: law
Date:   Wed Jan 21 06:17:50 2015 +

    2015-01-20 Jeff Law

    PR target/59946
    * config/m68k/m68k.md (Comparison expanders and patterns): Do not
    allow pc-relative addresses in operand predicates or constraints.

    PR target/59946
    * gcc.target/m68k/pr59946.c: New test.

    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@219927 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index ded76c4..9e4ad22 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2015-01-20 Jeff Law
+
+        PR target/59946
+        * config/m68k/m68k.md (Comparison expanders and patterns): Do not
+        allow pc-relative addresses in operand predicates or constraints.
+
 2015-01-21 Bin Cheng

         * config/arm/arm.c (arm_cortex_a53_tune, arm_cortex_a57_tune): Prefer
diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md
index 2a314c3..d34ad1d 100644
--- a/gcc/config/m68k/m68k.md
+++ b/gcc/config/m68k/m68k.md
@@ -489,10 +489,19 @@


 ;; A composite of the cmp, cmpa, cmpi & cmpm m68000 op codes.
+;;
+;; In theory we ought to be able to use some 'S' constraints and
+;; operand predicates that allow PC-rel addressing modes in the
+;; comparison patterns and expanders below. But we would have to be
+;; cognizant of the fact that PC-rel addresses are not allowed for
+;; both operands and determining whether or not we emit the operands in
+;; order or reversed is not trivial to do just based on the constraints
+;; and operand predicates. So to be safe, just don't allow the PC-rel
+;; versions in the various comparison expanders, patterns, for comparisons.
 (define_insn ""
   [(set (cc0)
-        (compare (match_operand:SI 0 "nonimmediate_operand" "rKT,rKs,mSr,mSa,>")
-                 (match_operand:SI 1 "general_src_operand" "mSr,mSa,KTr,Ksr,>")))]
+        (compare (match_operand:SI 0 "nonimmediate_operand" "rKT,rKs,mr,ma,>")
+                 (match_operand:SI 1 "general_operand" "mr,ma,KTr,Ksr,>")))]
   "!TARGET_COLDFIRE"
 {
   if (GET_CODE (operands[0]) == MEM && GET_CODE (operands[1]) == MEM)
@@ -529,7 +538,7 @@

 (define_expand "cbranchhi4"
   [(set (cc0)
-        (compare (match_operand:HI 1 "nonimmediate_src_operand" "")
+        (compare (match_operand:HI 1 "nonimmediate_operand" "")
                  (match_operand:HI 2 "m68k_subword_comparison_operand" "")))
    (set (pc)
         (if_then_else (match_operator 0 "ordered_comparison_operator"
@@ -551,8 +560,8 @@

 (define_insn ""
   [(set (cc0)
-        (compare (match_operand:HI 0 "nonimmediate_src_operand" "rnmS,d,n,mS,>")
-                 (match_operand:HI 1 "general_src_operand" "d,rnmS,mS,n,>")))]
+        (compare (match_operand:HI 0 "nonimmediate_operand" "rnm,d,n,m,>")
+                 (match_operand:HI 1 "general_operand" "d,rnm,m,n,>")))]
   "!TARGET_COLDFIRE"
 {
   if (GET_CODE (operands[0]) == MEM && GET_CODE (operands[1]) == MEM)
@@ -568,7 +577,7 @@

 (define_expand "cbranchqi4"
   [(set (cc0)
-        (compare (match_operand:QI 1 "nonimmediate_src_operand" "")
+        (compare (match_operand:QI 1 "nonimmediate_operand" "")
                  (match_operand:QI 2 "m68k_subword_comparison_operand" "")))
    (set (pc)
         (if_then_else (match_operator 0 "ordered_comparison_operator"
@@ -580,7 +589,7 @@

 (define_expand "cstoreqi4"
   [(set (cc0)
-        (compare (match_operand:QI 2 "nonimmediate_src_operand" "")
+        (compare (match_operand:QI 2 "nonimmediate_operand" "")
                  (match_operand:QI 3 "m68k_subword_comparison_operand" "")))
    (set (match_operand:QI 0 "register_operand")
         (match_operator:QI 1 "ordered_comparison_operator"
@@ -590,8 +599,8 @@

 (define_insn ""
   [(set (cc0)
-        (compare (match_operand:QI 0 "nonimmediate_src_operand" "dn,dmS,>")
-                 (match_operand:QI 1 "general_src_