vec_ld versus vec_vsx_ld on power8

2015-03-13 Thread Ewart Timothée
Hello all, I have an issue/question about using VMX/VSX on a Power8 processor on a little-endian system. Using intrinsic functions, if I perform an operation with vec_vsx_ld(…) - vec_vsx_st(), the compiler will add a permutation and then perform the operation (memory correctly aligned) lxvd2x … xxpermd

Re: Undefined behavior due to 6.5.16.1p3

2015-03-13 Thread Vincent Lefevre
On 2015-03-12 13:55:50 -0600, Martin Sebor wrote: > On 03/12/2015 03:10 AM, Vincent Lefevre wrote: > >Well, this depends on the interpretation of effective types in the > >case of a union. For instance, when writing > > > > union { char a[16]; int b; } u; > > u.b = 1; > > > >you don't set the m

Why not implementation of interrupt attribute on IA32/x86-64

2015-03-13 Thread Didier Garcin
Hi, many hobbyist OS developers would be pleased if GCC implemented the interrupt or interrupt_handler attribute for the Intel architecture. Would it be so difficult to implement for this architecture? Could you plan it? Thanks a lot for your answer. Best regards, Didier

Re: Why not implementation of interrupt attribute on IA32/x86-64

2015-03-13 Thread Andi Kleen
Didier Garcin writes: > many OS hobbyist developers would be pleased GCC implements the > interrupt or interrupt_handler attribute for Intel architecture. > > Would it be so difficult to implement for this architecture? There are lots of different ways to implement interrupts on x86 (e.g. what

PR65416, alloca on xtensa

2015-03-13 Thread Max Filippov
Hi Sterling, I've got an issue building gdb for xtensa linux with gcc, reported it here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65416 Looking at it I've got two questions, can you help me with them? 1. in windowed ABI stack pointer update is always split into two opcodes: add and movsp.

vec_ld versus vec_vsx_ld on power8

2015-03-13 Thread Bill Schmidt
Hi Tim, I'll discuss the loads here for simplicity; the situation for stores is analogous. There are a couple of differences between lvx and lxvd2x. The most important one is that lxvd2x supports unaligned loads, while lvx does not. You'll note that lvx will zero out the lower 4 bits of the eff
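The alignment difference described above can be modelled in plain C without any vector hardware. This is only a sketch: the function names are hypothetical, and it models the address each instruction reads from, not the instructions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the address lvx reads from: the low 4 bits of
   the effective address are cleared, so the load always starts on a
   16-byte boundary. */
uint64_t lvx_effective_address(uint64_t ea)
{
    return ea & ~(uint64_t)0xF;
}

/* Hypothetical model for lxvd2x: the address is used as given, which
   is why unaligned loads are possible. */
uint64_t lxvd2x_effective_address(uint64_t ea)
{
    return ea;
}

/* Returns 1 when both models read the same 16 bytes, which is the
   case exactly when ea is 16-byte aligned. */
int loads_agree(uint64_t ea)
{
    return lvx_effective_address(ea) == lxvd2x_effective_address(ea);
}

/* Small self-check showing the intended use. */
void check_load_models(void)
{
    assert(loads_agree(0x2000));   /* aligned: same bytes either way */
    assert(!loads_agree(0x2004));  /* unaligned: lvx silently reads from 0x2000 */
}
```

With aligned data the two loads fetch the same bytes; with an unaligned pointer the lvx model quietly reads from the preceding 16-byte boundary instead of faulting.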

Re: vec_ld versus vec_vsx_ld on power8

2015-03-13 Thread Ewart Timothée
Thank you very much for this answer. I know my memory is aligned, so I will use vec_ld/st only. Best, Tim

Re: PR65416, alloca on xtensa

2015-03-13 Thread augustine.sterl...@gmail.com
On Fri, Mar 13, 2015 at 7:54 AM, Max Filippov wrote: > 1. in windowed ABI stack pointer update is always split into two opcodes: > add and movsp. How gcc optimization passes are supposed to know that > 'movsp' is related to 'add' and that stack allocation is complete only after > movsp? The

Re: vec_ld versus vec_vsx_ld on power8

2015-03-13 Thread Bill Schmidt
Hi Tim, Actually, I left out another very good reason why you may want to use vec_vsx_ld/st. Sorry for forgetting this. As you saw, vec_ld translates into the lvx instruction. This instruction loads a sequence of 16 bytes into a vector register. For big endian, the first byte in memory is load

RE: PR65416, alloca on xtensa

2015-03-13 Thread Marc Gauthier
augustine.sterl...@gmail.com wrote: > On Fri, Mar 13, 2015 at 7:54 AM, Max Filippov wrote: [...] > > 2. alloca seems to make an additional 16-bytes padding to each stack > > allocation: alloca(1) results in moving sp down by 32 bytes, > > alloca(17) > > moves it by 48 bytes, etc. This paddin
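The two figures quoted above (alloca(1) moves sp by 32 bytes, alloca(17) by 48) are consistent with rounding the request up to the 16-byte stack alignment and then adding a further 16 bytes. A minimal sketch of that arithmetic follows; the function names are hypothetical and the model is fitted to the two reported data points only, not taken from GCC's implementation.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_ALIGN ((size_t)16)

/* Round n up to the next multiple of the 16-byte stack alignment. */
size_t round_up_to_stack_align(size_t n)
{
    return (n + STACK_ALIGN - 1) & ~(STACK_ALIGN - 1);
}

/* Hypothetical model of the reported behaviour: aligned size plus an
   extra 16 bytes of padding. */
size_t observed_sp_adjustment(size_t n)
{
    return round_up_to_stack_align(n) + 16;
}

/* What alignment alone would require, without the extra padding. */
size_t minimal_sp_adjustment(size_t n)
{
    return round_up_to_stack_align(n);
}

/* Self-check against the two data points given in the thread. */
void check_alloca_model(void)
{
    assert(observed_sp_adjustment(1) == 32);
    assert(observed_sp_adjustment(17) == 48);
}
```

Under this model the difference between the observed and minimal adjustment is a constant 16 bytes per allocation.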

Re: PR65416, alloca on xtensa

2015-03-13 Thread augustine.sterl...@gmail.com
On Fri, Mar 13, 2015 at 10:04 AM, Marc Gauthier wrote: > Other than the required 16-byte stack alignment, there's nothing in > the ABI that requires these extra 16 bytes. Perhaps there was a bad > implementation of the alloca exception handler at some point a long > time ago that prompted the ext

Re: vec_ld versus vec_vsx_ld on power8

2015-03-13 Thread Ewart Timothée
Hello, I am super confused now. Scenario 1, what I have in my code: machine boots in LE. 1) memory: LE 2) I load (ld_vec) 3) register: LE 4) VSU computes in LE 5) I store (st_vec) 6) memory: LE. Scenario 2 (I did not test it, but it is what I get if I tell gcc to compile in BE): machine boots in BE

Re: vec_ld versus vec_vsx_ld on power8

2015-03-13 Thread Bill Schmidt
Hi Tim, Sorry to have confused you. This stuff is a bit boggling the first 200 times you look at it... For both 32-bit and 64-bit floating-point, you should use vec_vsx_ld on both BE and LE machines, and the compiler will take care of doing the right thing for you in both cases. You do not have

[PATCH] jit docs: Add "Packaging notes" section

2015-03-13 Thread David Malcolm
On Wed, 2015-03-04 at 11:09 -0500, David Malcolm wrote: > On Tue, 2015-03-03 at 11:49 +0100, Matthias Klose wrote: > > Both gccjit and gnat now use sphinx to build the documentation. While not a > > direct part of the build process, it would be nice to document the > > requirements > > on sphinx,

Re: PR65416, alloca on xtensa

2015-03-13 Thread Max Filippov
On Fri, Mar 13, 2015 at 8:08 PM, augustine.sterl...@gmail.com wrote: > On Fri, Mar 13, 2015 at 10:04 AM, Marc Gauthier wrote: >> Other than the required 16-byte stack alignment, there's nothing in >> the ABI that requires these extra 16 bytes. Perhaps there was a bad >> implementation of the all

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-13 Thread Aditya K
--- > Date: Tue, 10 Mar 2015 11:20:07 +0100 > Subject: Re: Proposal for adding splay_tree_find (to find elements without > updating the nodes). > From: richard.guent...@gmail.com > To: stevenb@gmail.com > CC: hiradi...@msn.com; gcc@gcc.gnu.org > > On Mon, Ma

Re: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-13 Thread Jonathan Wakely
Are you sure your compare_variables functor is correct? Subtracting the two values seems very strange for a strict weak ordering. (Also "compare_variables" is a pretty poor name!)
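To illustrate the concern (a sketch with hypothetical names, not the code from the patch under review): a "less than" predicate implemented as a subtraction reports true for any two unequal keys, in either order, so it is not asymmetric and cannot be a strict weak ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical broken predicate: a subtraction used as a boolean is
   true whenever the two keys differ, regardless of their order. */
bool less_by_subtraction(unsigned a, unsigned b)
{
    return (a - b) != 0;
}

/* A proper "less than" predicate on the same keys. */
bool less_by_comparison(unsigned a, unsigned b)
{
    return a < b;
}

/* A strict weak ordering must be asymmetric: less(a, b) and less(b, a)
   may never both hold. Returns true if that holds for this pair. */
bool is_asymmetric_for(bool (*less)(unsigned, unsigned), unsigned a, unsigned b)
{
    return !(less(a, b) && less(b, a));
}

/* Self-check: the subtraction form violates asymmetry, the comparison
   form does not. */
void check_predicates(void)
{
    assert(!is_asymmetric_for(less_by_subtraction, 3u, 7u));
    assert(is_asymmetric_for(less_by_comparison, 3u, 7u));
}
```

A container or sort routine that relies on the predicate being a strict weak ordering can misbehave with the subtraction form, because every pair of distinct keys looks ordered both ways.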

RE: Proposal for adding splay_tree_find (to find elements without updating the nodes).

2015-03-13 Thread Aditya K
You're right. I'll change this to:
    /* A stable comparison functor to sort trees.  */
    struct tree_compare_decl_uid {
      bool operator ()(const tree &xa, const tree &xb) const
      {
        return DECL_UID (xa) < DECL_UID (xb);
      }
    };
New patch attached. Thanks, -Aditya

Re: PR65416, alloca on xtensa

2015-03-13 Thread Segher Boessenkool
On Fri, Mar 13, 2015 at 05:54:48PM +0300, Max Filippov wrote: > 2. alloca seems to make an additional 16-bytes padding to each stack > allocation: alloca(1) results in moving sp down by 32 bytes, alloca(17) > moves it by 48 bytes, etc. This sounds like PR 50938, 47353, 34548, maybe more? Happ

Re: PR65416, alloca on xtensa

2015-03-13 Thread Max Filippov
On Fri, Mar 13, 2015 at 11:18 PM, Segher Boessenkool wrote: > On Fri, Mar 13, 2015 at 05:54:48PM +0300, Max Filippov wrote: >> 2. alloca seems to make an additional 16-bytes padding to each stack >> allocation: alloca(1) results in moving sp down by 32 bytes, alloca(17) >> moves it by 48 bytes

Re: Re: Why not implementation of interrupt attribute on IA32/x86-64

2015-03-13 Thread David Fernandez
Hi, This is slightly off-topic, but there seem to be lots of different interrupt attributes in gcc, one for each processor, which in many instances seem almost the same under different names. Also, gcc could decide on the attribute behaviour depending on the target it compiles for

Re: PR65416, alloca on xtensa

2015-03-13 Thread Segher Boessenkool
On Fri, Mar 13, 2015 at 11:36:47PM +0300, Max Filippov wrote: > >> 2. alloca seems to make an additional 16-bytes padding to each stack > >> allocation: alloca(1) results in moving sp down by 32 bytes, alloca(17) > >> moves it by 48 bytes, etc. > > > > This sounds like PR 50938, 47353, 34548, m

Re: PR65416, alloca on xtensa

2015-03-13 Thread Segher Boessenkool
On Fri, Mar 13, 2015 at 03:56:38PM -0500, Segher Boessenkool wrote: > On Fri, Mar 13, 2015 at 11:36:47PM +0300, Max Filippov wrote: > > >> 2. alloca seems to make an additional 16-bytes padding to each stack > > >> allocation: alloca(1) results in moving sp down by 32 bytes, alloca(17) > > >> m