illegal insn created in ira

2011-05-08 Thread roy rosen
Hi, In my port I have an error: Before ira I have the following insn: (insn 3859 4277 4366 57 (set (reg:BI 2038) (subreg:BI (reg/v:SI 181 [ realsz ]) 3)) 76 {movbi} (expr_list:REG_EQUAL (const_int 1 [0x1]) (nil))) During ira this insn is transformed (I guess because reg 181

Re: inline assembly vs. intrinsic functions

2011-03-28 Thread roy rosen
2011/3/24 Ian Lance Taylor : > roy rosen writes: > >>> You build a RECORD_TYPE holding the fields you want to return.  You >>> define the appropriate builtin functions to return that record type. >> >> How is that done? using define_insn? How do I tell i

Re: inline assembly vs. intrinsic functions

2011-03-24 Thread roy rosen
2011/3/22 Ian Lance Taylor : > roy rosen writes: > >> 2010/10/26 Ian Lance Taylor : >>> roy rosen writes: >>> >>>> I am trying to demonstrate my port capabilities. >>>> I am writing an application which needs to use instructions like max

Re: inline assembly vs. intrinsic functions

2011-03-17 Thread roy rosen
2010/10/26 Ian Lance Taylor : > roy rosen writes: > >> I am trying to demonstrate my port capabilities. >> I am writing an application which needs to use instructions like max >> a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs. >> Is that possible to write

Re: register allocation

2011-01-05 Thread roy rosen
2011/1/3 Jeff Law : > On 12/27/10 08:43, roy rosen wrote: >>> >>> I'd recommend to try ira-improv branch.  I think that part of the problem >>> is >>> in usage of cover classes.  The branch removes the cover classes and >>> permits >>

Re: register allocation

2010-12-27 Thread roy rosen
2010/12/23 Vladimir Makarov : > On 12/23/2010 03:13 AM, roy rosen wrote: >> >> Hi All, >> >> I am looking at the code generated by my port and it seems that I have >> a problem that too many copies between registers are generated. >> I looked a bit at the r

register allocation

2010-12-23 Thread roy rosen
Hi All, I am looking at the code generated by my port and it seems that I have a problem that too many copies between registers are generated. I looked a bit at the register allocation and wanted to verify that I understand its behavior. Is that true that it first chooses a register class for eac

Re: software pipelining

2010-12-08 Thread roy rosen
> On 10.11.2010 12:32, roy rosen wrote: >> >> Hi, >> >> I was wondering if gcc has software pipelining. >> I saw options -fsel-sched-pipelining -fselective-scheduling >> -fselective-scheduling2 but I don't see any pipelining happening >> (tried

combine two load insns

2010-12-04 Thread roy rosen
Hi, If I have two load SI insns. Is there any way to combine them into one load DI insn? Not using peephole which can catch only this limited case of being sequential insns. I have seen something done in ARM (*arith_adjacentmem) but it is very awkward and would not be realistic if the DI is being

Re: inline assembly vs. intrinsic functions

2010-11-15 Thread roy rosen
no matter what I do? 2010/11/15 Joern Rennecke : > Quoting roy rosen : > >> Is there any another way to give attributes to inline assembly insns? > > See define_asm_attributes. >

Re: inline assembly vs. intrinsic functions

2010-11-15 Thread roy rosen
Is there any another way to give attributes to inline assembly insns? 2010/10/26 Ian Lance Taylor : > roy rosen writes: > >> If I want the compiler to understand the inline assembly is it >> possible to write define_insn which would match the pattern that GCC >> creates

Re: pipeline description

2010-11-11 Thread roy rosen
is used for the first insn before it is written by the second insn. How do I let GCC know about these things (When exactly each operand is used and when it is written)? Is it in these hooks? In which port can I see a good example for that? Thanks, Roy. 2010/11/4 Ian Lance Taylor : > roy ro

software pipelining

2010-11-10 Thread roy rosen
Hi, I was wondering if gcc has software pipelining. I saw options -fsel-sched-pipelining -fselective-scheduling -fselective-scheduling2 but I don't see any pipelining happening (tried with ia64). Is there a gcc VLIW port in which I can see it working? For an example function like int nor(char* _

Re: define_split

2010-11-08 Thread roy rosen
2010/11/8 Michael Meissner : > On Thu, Oct 28, 2010 at 09:11:44AM +0200, roy rosen wrote: >> Hi all, >> >> I am trying to use define_split, but it seems to me that I don't >> understand how it is used. >> It says in the gccint.pdf (which I use as my tutorial (

Re: UNITS_PER_SIMD_WORD

2010-11-08 Thread roy rosen
This is what I did. It works well. Thanks to everybody. 2010/11/8 Michael Meissner : > On Mon, Nov 01, 2010 at 04:52:28PM +0200, roy rosen wrote: >> Hi All, >> >> Is it possible to define UNITS_PER_SIMD_WORD as a global variable and >> to set this variable usin

pipeline description

2010-11-03 Thread roy rosen
Hi, I am writing now the pipeline description in order to get a parallel code. My machine has many restrictions regarding which instruction can be parallelized with another. I am under the assumption that for each insn only one define_insn_reservation is matched. Is that correct? If so then the nu

UNITS_PER_SIMD_WORD

2010-11-01 Thread roy rosen
Hi All, Is it possible to define UNITS_PER_SIMD_WORD as a global variable and to set this variable using a pragma (even once for a compilation), and that way be able to compile one file with UNITS_PER_SIMD_WORD = 8 and another file with UNITS_PER_SIMD_WORD = 16? Thanks, Roy.

Re: define_split

2010-10-28 Thread roy rosen
2010/10/29 Ian Lance Taylor : > roy rosen writes: > >> I am trying to use define_split, but it seems to me that I don't >> understand how it is used. >> It says in the gccint.pdf (which I use as my tutorial (is there >> anything better or more up to date?)) >

define_split

2010-10-28 Thread roy rosen
Hi all, I am trying to use define_split, but it seems to me that I don't understand how it is used. It says in the gccint.pdf (which I use as my tutorial (is there anything better or more up to date?)) that the combiner only uses the define_split if it doesn't find any define_insn to match. This i

Re: inline assembly vs. intrinsic functions

2010-10-26 Thread roy rosen
If I want the compiler to understand the inline assembly is it possible to write define_insn which would match the pattern that GCC creates for the inline assembly and then GCC would be able to 'know' some attributes about this insn and would be able to parallelize it? 2010/10/26 roy r

Re: inline assembly vs. intrinsic functions

2010-10-26 Thread roy rosen
I didn't give the full details of the instruction but for example a max instruction which gets an array and returns both the max value and its index in the array will need to return more than one argument. 2010/10/26 Ian Lance Taylor : > roy rosen writes: > >> I am trying to de

inline assembly vs. intrinsic functions

2010-10-25 Thread roy rosen
Hi, I am trying to demonstrate my port capabilities. I am writing an application which needs to use instructions like max a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs. Is that possible to write an intrinsic function for that? I think not because that means that I need to pass d,e,f by

combiner

2010-10-25 Thread roy rosen
In my port I get to such a situation: (insn 60 59 61 4 a.c:65 (set (subreg:SI (reg:HI 129 [ __prephitmp_4 ]) 0) (zero_extract:SI (subreg:SI (reg/v:DI 138 [ v4hi1 ]) 4) (const_int 16 [0x10]) (const_int 16 [0x10]))) 53 {extzv} (nil)) (insn 61 60 62 4 a.c:65 (set (reg

complex numbers in gcc

2010-08-17 Thread roy rosen
Hi all, In my port the architecture has some specific instructions that can handle complex arithmetic. I tried to use them but I see that pass_lower_complex decomposes the complex numbers. I tried to remove this pass from the pass list but I saw that the subsequent passes require that this pass

Re: constraints and predicates

2010-08-05 Thread roy rosen
I haven't mentioned that I am using gcc 4.6 latest version. To generalize the question. If I use an operand like lc_operand (below) and leave the constraint open, is it guaranteed that the register that would be chosen would be of class lc? 2010/8/3 roy rosen : > Hi All, > > If

constraints and predicates

2010-08-03 Thread roy rosen
Hi All, If I don't use a constraint, is it possible that during ira I get a register which is not acceptable by the predicate? In my port I have the following to support HW loops: (define_predicate "lc_operand" (match_operand 0 "register_operand") { unsigned int regno; if (GET

vectorization

2010-07-18 Thread roy rosen
Hi, In my architecture I have SIMD instructions with several SIMD levels. I have load and store which operate on 8 half words. I have add and sub for 4 half words. I have mul which operates on 2 half words. How can I utilize all of them? Is it enough just to describe each one of these instruction

Re: invalid insn generated

2010-06-30 Thread roy rosen
Taylor : > roy rosen writes: > > > In my port I get to gen_reload to the lines > > > > /* If IN is a simple operand, use gen_move_insn. */ > > else if (OBJECT_P (in) || GET_CODE (in) == SUBREG) > > { > > static int xxx; > >

invalid insn generated

2010-06-23 Thread roy rosen
Hi, In my port I get to gen_reload to the lines /* If IN is a simple operand, use gen_move_insn. */ else if (OBJECT_P (in) || GET_CODE (in) == SUBREG) { static int xxx; xxx = OBJECT_P (in); tem = emit_insn (gen_move_insn (out, in)); /* IN may contain a LABEL_R

complex arithmetics

2010-06-10 Thread roy rosen
Hi All, I was wondering if there is any architecture which implemented complex arithmetic in GCC i.e. used modes like CHI or HC. I would really like to look at an example for that. Thanks, Roy.

vectorization issue

2010-05-26 Thread roy rosen
Hi, I have tried vectorization and encountered a problem which I can see is common to some ports (I tried ia64 and bfin). For this function: #define ts unsigned short void f(ts* __restrict__ a, ts* __restrict__ b, ts* __restrict__ x) { int i; for (i=0;i<1024;i++) x[i] = a[i] + b[

scheduling on VLIW architecture

2010-05-06 Thread roy rosen
Hi all. I work on a VLIW architecture. The sched2 pass adds a TImode to insns which should start a new issue group. But, after this pass, other passes change the insns, so the sched2 work that was done is not correct anymore (the groups of insns might be invalid). In particular I see that the com

Re: peephole optimizations

2010-05-04 Thread roy rosen
Hi, 2010/5/3, Ian Lance Taylor : > roy rosen writes: > > > 1. Is that true that if I try to match in the pattern two insns and in > > my code between these insns there is another insn which does not have > > any dependency connection to the other two, Is that true that th

peephole optimizations

2010-05-03 Thread roy rosen
Hi All, I have tried to write some peephole patterns and I now have some questions regarding the way it is working. 1. Is that true that if I try to match in the pattern two insns and in my code between these insns there is another insn which does not have any dependency connection to the other t

Re: vectorization, scheduling and aliasing

2010-04-27 Thread roy rosen
Hi, I have looked a bit more and tried also ia-64 and bfin and actually I can't find a single example where vectorized code using __restrict__ variables would break the dependency between stores and loads. for this simple program: unsigned short xxx(unsigned short* __restrict__ a, unsigned short

Re: vectorization, scheduling and aliasing

2010-04-26 Thread roy rosen
Hi Richard, Here is the relevant block from the dump: : __vect_var__26_6 = *__vect_p_14_19; *__vect_p_18_25 = __vect_var__26_6; # PT = nonlocal { __PARM_RESTRICT_2 } (restr) __vect_p_22_11 = __vect_p_14_19 + 8; # PT = nonlocal { __PARM_RESTRICT_1 } (restr) __vect_p_27_12 = __vect_p_18

Re: vectorization, scheduling and aliasing

2010-04-26 Thread roy rosen
Hi Richard, 2010/4/23, Richard Guenther : > On Thu, Apr 22, 2010 at 6:04 PM, roy rosen wrote: > > Hi Richard, > > > > 2010/4/14, Richard Guenther : > >> On Wed, Apr 14, 2010 at 8:48 AM, roy rosen wrote: > >> > Hi All, > >> > > >>

Re: vectorization, scheduling and aliasing

2010-04-22 Thread roy rosen
Hi Richard, 2010/4/14, Richard Guenther : > On Wed, Apr 14, 2010 at 8:48 AM, roy rosen wrote: > > Hi All, > > > > I have implemented some vectorization features in my gcc port. > > > > In the generated code for this function I can see a scheduling problem: >

vectorization, scheduling and aliasing

2010-04-13 Thread roy rosen
Hi All, I have implemented some vectorization features in my gcc port. In the generated code for this function I can see a scheduling problem: int xxx(int* __restrict__ a, int* __restrict__ b) { int i; for (i = 0; i < 8; i++) { a[i] = b[i]; } return 0; }

Re: lower subreg optimization

2010-04-07 Thread roy rosen
2010/4/6, Jim Wilson : > On 04/06/2010 02:24 AM, roy rosen wrote: > > (insn 33 32 34 7 a.c:25 (set (subreg:V2HI (reg:V4HI 114) 0) > > (plus:V2HI (subreg:V2HI (reg:V4HI 112) 0) > > (subreg:V2HI (reg:V4HI 113) 0))) 118 {addv2hi3} (nil)) > > > >

Re: compiler operations research

2010-04-07 Thread roy rosen
Thanks Dave, I'll have a look at these. Roy. 2010/4/7, Dave Korn : > On 07/04/2010 12:29, roy rosen wrote: > > Hi, > > > > Are there any known methodologies/tools/flows that enable operations > > research on the compiler generated assembly? > > Some

compiler operations research

2010-04-07 Thread roy rosen
Hi, Are there any known methodologies/tools/flows that enable operations research on the compiler generated assembly? The reasoning behind the question is that compiler heuristics complexity are restricted by compilation time, while test environment can run for a long time taking into account bot

lower subreg optimization

2010-04-06 Thread roy rosen
Hi, I have encountered several problems with lower subreg optimization in my port. In some cases I noticed that insns are decomposed in subreg1 pass and do not get recomposed later which causes at the end using two insns instead of one. For example I have the following dump before subreg1 (note

implementing load 8 byte instruction

2010-03-18 Thread roy rosen
Hi, I am trying to implement a simple load 8 bytes instruction. I tried to use movdi so that it would allocate two sequential registers for the load. It starts well but in pass subreg1 the insns are decomposed and all DI operands are replaced with SI. I understand that this is a desirable optimz