Re: Insn canonicalization not only with constant

2007-02-15 Thread Sami Khawam

Hi Andrew,

You mean using a DI rotate left by 4 and then saving the output as SI
(keeping the high part and ignoring the low one)?


Also, how is canonicalization detected anyway? Are there rules that gcc 
follows? How can they be changed?


Sami

Andrew Pinski wrote:

>>   output = (operand1 >> 28) | (operand2 << 4)
>
> Isn't that a rotate? If so, you can use either rotate or rotatert instead.
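
For reference, a minimal C sketch of the arithmetic being discussed. The function names are hypothetical and nothing here is taken from a GCC port. It only illustrates that the expression is a plain 32-bit rotate when both operands are the same value, and that with two distinct operands it matches the high word of a 64-bit value shifted left by 4 (a DI rotate by 4 would differ only in the low 4 bits, which are discarded).

```c
#include <assert.h>
#include <stdint.h>

/* The SCALE_28_4 operation as described in the thread. */
static uint32_t scale_28_4(uint32_t op1, uint32_t op2)
{
    return (op1 >> 28) | (op2 << 4);
}

/* Plain 32-bit rotate left by 4: the single-operand special case. */
static uint32_t rotl32_by_4(uint32_t x)
{
    return (x << 4) | (x >> 28);
}

/* Double-word view: op2 is the high word, op1 the low word; shift the
   64-bit value left by 4 and keep only the high 32 bits. */
static uint32_t scale_via_di(uint32_t op1, uint32_t op2)
{
    uint64_t di = ((uint64_t)op2 << 32) | op1;
    return (uint32_t)((di << 4) >> 32);
}

/* Hypothetical self-check of the two claims above. */
static void check_scale_28_4(void)
{
    assert(scale_28_4(0x12345678u, 0x12345678u) == rotl32_by_4(0x12345678u));
    assert(scale_28_4(0xA0000000u, 0x00000001u)
           == scale_via_di(0xA0000000u, 0x00000001u));
}
```

So a single-operand rotate pattern only covers the case operand1 == operand2; the general case needs the two-operand (double-word) form.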




maybe vectorizer-bug regarding unhandled data-ref

2007-02-15 Thread Thomas Veith

Hi,

While playing with gcc-4.3 rev. 121994, I encountered a problem with
autovectorisation.


In the following simple code, the inner loop of c1() is vectorized as
expected, but the inner loop of c2() is not, because of


  test2.c:15: note: = analyze_loop_nest =
  test2.c:15: note: === vect_analyze_loop_form ===
  test2.c:15: note: === get_loop_niters ===
  test2.c:15: note: ==> get_loop_niters:(unsigned int) n_6(D)
  test2.c:15: note: Symbolic number of iterations is (unsigned int) n_6(D)
  test2.c:15: note: === vect_analyze_data_refs ===

  test2.c:15: note: get vectype with 4 units of type float
  test2.c:15: note: vectype: vector float
  test2.c:15: note: not vectorized: unhandled data-ref
  test2.c:15: note: bad data references.

(even with -ftree-vectorizer-verbose=99 there is no more info than that)

The only difference between the two functions is that c1() uses static
arrays and c2() uses pointers to arrays. Is this a problem with
aliasing/alignment of pointer parameters or a vectorizer bug? And is there
a work-around?


Best regards,
Thomas

--

float a[256],b[16],o[271];

void c1()
{
  for(int i=0;i<256;i++) {
    for(int j=0;j<16;j++) {
      o[i+j]+=a[i]*b[j];
    }
  }
}

void c2(int m, int n, float *a, float *b, float *o)
{
  for(int i=0;i<m;i++) {
    for(int j=0;j<n;j++) {
      o[i+j]+=a[i]*b[j];
    }
  }
}

Re: maybe vectorizer-bug regarding unhandled data-ref

2007-02-15 Thread Dorit Nuzman
> Hi,
>
> while playing with gcc-4.3 rev. 121994, i encountered a problem with
> autovectorisation.
>
> In the following simple code, the inner loop of c1() becomes vectorized as
> expected, but the inner loop of c2() not because of
>
>test2.c:15: note: = analyze_loop_nest =
>test2.c:15: note: === vect_analyze_loop_form ===
>test2.c:15: note: === get_loop_niters ===
>test2.c:15: note: ==> get_loop_niters:(unsigned int) n_6(D)
>test2.c:15: note: Symbolic number of iterations is (unsigned int) n_6(D)
>test2.c:15: note: === vect_analyze_data_refs ===
>
>test2.c:15: note: get vectype with 4 units of type float
>test2.c:15: note: vectype: vector float
>test2.c:15: note: not vectorized: unhandled data-ref
>test2.c:15: note: bad data references.
>
> (even with -ftree-vectorizer-verbose=99 there is no more info than that)
>
> The only difference between the two functions is that in c1() static
> arrays are used and in c2() pointer to arrays.. Is this a problem with
> aliasing/alignment of pointer parameters or a vectorizer bug? And is there
> a work-around?
> a work-around?
>

The first problem is that a[i] is invariant in the inner-loop, and the
vectorizer wants to work only with data-references that have a nice
evolution in the loop (i.e. advance between iterations of the loop). In
other words - it assumes that invariant accesses have already been moved
out of the loop before vectorization:

"
ptr is loop invariant.

create_data_ref: failed to create a dr for *pretmp.27_46
"

The workaround for that is to manually move the invariant a[i] out of the
inner-loop, put it into a temporary, and use that temporary in the
inner-loop.

The second problem is aliasing - the vectorizer can't tell that the write
through pointer o doesn't overlap with the read through pointer b.

The workaround for that is to add the "__restrict" qualifier to the
declaration of the pointers.
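
A minimal sketch of the two workarounds applied together, assuming the c2() body from the original report. The name c2_workaround and the temporary ai are hypothetical, and whether a particular GCC revision then vectorizes the inner loop is not something this sketch can promise.

```c
#include <assert.h>

/* c2() as posted in the thread, kept as the reference version. */
void c2(int m, int n, float *a, float *b, float *o)
{
    for (int i = 0; i < m; i++)
        for (int j = 0; j < n; j++)
            o[i + j] += a[i] * b[j];
}

/* Both suggestions applied: __restrict on the pointer parameters, and
   the inner-loop-invariant load a[i] hoisted by hand into a temporary. */
void c2_workaround(int m, int n, float *__restrict a,
                   float *__restrict b, float *__restrict o)
{
    for (int i = 0; i < m; i++) {
        float ai = a[i];              /* invariant in the j loop */
        for (int j = 0; j < n; j++)
            o[i + j] += ai * b[j];
    }
}

/* Hypothetical self-check: both versions agree on non-overlapping data. */
static void check_c2_workaround(void)
{
    float a[3] = {1, 2, 3}, b[2] = {4, 5};
    float o1[4] = {0, 0, 0, 0}, o2[4] = {0, 0, 0, 0};
    c2(3, 2, a, b, o1);
    c2_workaround(3, 2, a, b, o2);
    for (int k = 0; k < 4; k++)
        assert(o1[k] == o2[k]);
}
```

Note that __restrict is a promise made by the caller: the rewritten function is only valid when o really does not overlap a or b.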

To fix the first problem in the compiler, we can teach the vectorizer to
work with invariant datarefs. This is easy to do, but I think the right
solution is to enhance loop-invariant-motion pass to use an aliasing oracle
that would tell it that the invariant load can be safely moved out of the
loop (given that the pointers are __restrict qualified). I think such a
solution is in the works?
Do people think it's worthwhile to work around this invariant-motion issue
in the vectorizer?

The second problem should be fixed in the near future - a patch that adds
support for run-time aliasing checks is in the works (should be ready
within a week or so I think).
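
To illustrate what a run-time aliasing check means in general, here is a hand-written sketch of loop versioning for the inner loop. This is an assumption about the general technique, not the code the pending patch generates; ranges_overlap and inner_loop_versioned are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the byte ranges [p, p+p_bytes) and [q, q+q_bytes) overlap. */
static int ranges_overlap(const void *p, size_t p_bytes,
                          const void *q, size_t q_bytes)
{
    uintptr_t ps = (uintptr_t)p, qs = (uintptr_t)q;
    return ps < qs + q_bytes && qs < ps + p_bytes;
}

/* Inner loop of c2() for one fixed i: o[j] += ai * b[j], j in [0, n). */
static void inner_loop_versioned(int n, float ai, const float *b, float *o)
{
    if (n <= 0)
        return;
    size_t bytes = (size_t)n * sizeof(float);
    if (!ranges_overlap(o, bytes, b, bytes)) {
        /* Checked at run time: no overlap, so the iterations are
           independent and this copy is a candidate for vectorization. */
        float *__restrict od = o;
        const float *__restrict bd = b;
        for (int j = 0; j < n; j++)
            od[j] += ai * bd[j];
    } else {
        /* Possible overlap: keep the original scalar order. */
        for (int j = 0; j < n; j++)
            o[j] += ai * b[j];
    }
}
```

The cost is one comparison per loop entry and a second copy of the loop body; the benefit is that the caller no longer has to promise anything with __restrict.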

dorit

> Best regards,
> Thomas
>
> --
>
> float a[256],b[16],o[271];
>
> void c1()
> {
>   for(int i=0;i<256;i++) {
>     for(int j=0;j<16;j++) {
>       o[i+j]+=a[i]*b[j];
>     }
>   }
> }
>
> void c2(int m, int n, float *a, float *b, float *o)
> {
>   for(int i=0;i<m;i++) {
>     for(int j=0;j<n;j++) {
>       o[i+j]+=a[i]*b[j];
>     }
>   }
> }



Re: maybe vectorizer-bug regarding unhandled data-ref

2007-02-15 Thread Daniel Berlin

On 2/15/07, Dorit Nuzman <[EMAIL PROTECTED]> wrote:

> The first problem is that a[i] is invariant in the inner-loop, and the
> vectorizer wants to work only with data-references that have a nice
> evolution in the loop (i.e. advance between iterations of the loop). In
> other words - it assumes that invariant accesses had been moved out of the
> loop before vectorization:
>
> "
> ptr is loop invariant.
>
> create_data_ref: failed to create a dr for *pretmp.27_46
> "
>
> The work around for that is to manually move the invariant a[i] out of the
> inner-loop, put it into a temporary, and use that temporary in the
> inner-loop.
>
> The second problem is aliasing - the vectorizer can't tell that the write
> through pointer o doesn't overlap with the read through pointer b.
>
> The work around for that is to add the "__restrict" qualifier to the
> declaration of the pointers.
>
> To fix the first problem in the compiler, we can teach the vectorizer to
> work with invariant datarefs. This is easy to do, but I think the right
> solution is to enhance loop-invariant-motion pass to use an aliasing oracle
> that would tell it that the invariant load can be safely moved out of the
> loop (given that the pointers are __restrict qualified). I think such a
> solution is in the works?

It is.

> Do people think it's worth while to work around this invariant-motion issue
> in the vectorizer?


Probably not, it's just going to make your code more complex for no real gain.


Re: Insn canonicalization not only with constant

2007-02-15 Thread Rask Ingemann Lambertsen
On Wed, Feb 14, 2007 at 08:30:52PM +, Sami Khawam wrote:
> Hi Rask,
> 
> Basically the CPU has the 'SCALE_28_4' instruction which does the following:
>  output = (operand1 >> 28) | (operand2 << 4)
> 
> From my understanding the OR operation (ior), doesn't get canonicalized 
> since its second operand (in this case (lshiftrt:SI (match_operand:SI 2 
> "register_operand" "r") (const_int 4)) ) is not a constant.

   OK, I see what you mean. The reason you can get both (ior (ashift ...)
(lshiftrt ...)) and (ior (lshiftrt ...) (ashift ...)) is that simplify-rtx.c
has no rule to canonicalize such expressions and that LSHIFTRT and
ASHIFT have the same precedence.

   Hmm, in simplify_binary_operation_1(), it says:

  /* Convert (ior (ashift A CX) (lshiftrt A CY)) where CX+CY equals the
     mode size to (rotate A CX).  */
   
   Right after that is code to make sure ASHIFT is the first operand for the
simplification attempts that follow. You could try adding code to do this in
general, but I don't know where such code should be added.

   Btw, I found this in rtlanal.c:

/* Return a value indicating whether OP, an operand of a commutative
   operation, is preferred as the first or second operand.  The higher
   the value, the stronger the preference for being the first operand.
   We use negative values to indicate a preference for the first operand
   and positive values for the second operand.  */

int
commutative_operand_precedence (rtx op)
{
  enum rtx_code code = GET_CODE (op);

  /* Constants always come the second operand.  Prefer "nice" constants.  */
  if (code == CONST_INT)
    return -7;
[...]

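   The comment disagrees with the code: it says negative values indicate a
preference for the first operand, yet CONST_INT, which should come second,
gets -7.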
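
   A toy model of the ordering rule, to make the tie between the two shift
codes concrete. Only the -7 for CONST_INT comes from the excerpt above; the
other values, the toy_ names and the "swap when the first operand has the
lower precedence" rule are assumptions for illustration and not GCC's actual
tables.

```c
#include <assert.h>

/* Hypothetical stand-ins for a few rtx codes. */
enum toy_code { TOY_CONST_INT, TOY_REG, TOY_ASHIFT, TOY_LSHIFTRT };

/* Higher value = stronger preference for being the first operand. */
static int toy_precedence(enum toy_code code)
{
    switch (code) {
    case TOY_CONST_INT: return -7;  /* from the quoted excerpt */
    case TOY_REG:       return -1;  /* assumed value */
    case TOY_ASHIFT:                /* assumed: both shifts share */
    case TOY_LSHIFTRT:  return 2;   /* one precedence value */
    }
    return 0;
}

/* Assumed rule: swap the operands of a commutative operation when the
   first one has strictly lower precedence than the second. */
static int toy_should_swap(enum toy_code first, enum toy_code second)
{
    return toy_precedence(first) < toy_precedence(second);
}

/* Hypothetical self-check. */
static void check_toy_precedence(void)
{
    /* A constant in first position is moved to second position. */
    assert(toy_should_swap(TOY_CONST_INT, TOY_REG) == 1);
    /* Equal precedence: neither order is swapped, so both forms survive. */
    assert(toy_should_swap(TOY_ASHIFT, TOY_LSHIFTRT) == 0);
    assert(toy_should_swap(TOY_LSHIFTRT, TOY_ASHIFT) == 0);
}
```

   With a strict comparison, operands of equal precedence are never
reordered, which is consistent with seeing both (ior (ashift ...)
(lshiftrt ...)) and (ior (lshiftrt ...) (ashift ...)).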

-- 
Rask Ingemann Lambertsen


Re: Insn canonicalization not only with constant

2007-02-15 Thread Sami Khawam

>    OK, I see what you mean. The reason you can get both (ior (ashift ...)
> (lshiftrt ...)) and (ior (lshiftrt ...) (ashift ...)) is that simplify-rtx.c
> has no rule to canonicalize such expressions and that LSHIFTRT and
> ASHIFT have the same precedence.
>
>    Hmm, in simplify_binary_operation_1(), it says:
>
>   /* Convert (ior (ashift A CX) (lshiftrt A CY)) where CX+CY equals the
>      mode size to (rotate A CX).  */


OK, so that means that in that specific shift example I could get away
with a rotate operation (even though it has to be of mode DI -> SI).



>    Right after that is code to make sure ASHIFT is the first operand for the
> simplification attempts that follow. You could try adding code to do this in
> general, but I don't know where such code should be added.


I will look more into this. It might be that there is no simple way to
activate canonicalization for the general case (i.e. any insn that is
defined in the machine description), and maybe it has to be done for
every specific type of operation.



>    Btw, I found this in rtlanal.c:
>
> int
> commutative_operand_precedence (rtx op)
> [...]

It seems like commutative_operand_precedence() is only used twice to 
swap operand1 and operand2 - so the fact that it returns low values (or 
high, since the comment in the code seems wrong) for general operands 
shouldn't affect the ability to canonicalize them.


Sami



Re: [Autovect]dependencies of virtual defs/uses

2007-02-15 Thread Dorit Nuzman
"Jiahua He" <[EMAIL PROTECTED]> wrote on 12/02/2007 22:54:08:

> Oh, I see. For reduction and induction, you don't need to deal with
> the condition with vdef. I am considering how to implement an idiom
> with vdef, like SCAN (prefix sum). And by the way, do you support
> idioms with vuses?
>

You mean detecting this pattern?:

for i
  a[i] += a[i-1];

I don't know if analyzing vdefs/vuses would help you much to detect this
pattern - maybe you're better off computing the dependence-distance (i.e.
use compute_data_dependences_for_loop, and look at DDR_DIST_VECTS).
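
As a rough illustration of what the dependence distance says for this loop, here is a toy calculation. It is not the GCC API: toy_dependence_distance is a hypothetical helper that only handles accesses of the form a[i + constant].

```c
#include <assert.h>
#include <stddef.h>

/* The pattern from above: a[i] += a[i-1]. */
static void scan_in_place(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}

/* Hypothetical helper: for a write to a[i + write_off] and a read of
   a[i + read_off] in the same loop, the element written in iteration i
   is read again (write_off - read_off) iterations later. */
static int toy_dependence_distance(int write_off, int read_off)
{
    return write_off - read_off;
}

/* Hypothetical self-check. */
static void check_scan_distance(void)
{
    int a[4] = {1, 2, 3, 4};
    scan_in_place(a, 4);
    assert(a[3] == 10);
    /* Write a[i], read a[i-1]: each iteration depends on the previous one. */
    assert(toy_dependence_distance(0, -1) == 1);
}
```

A distance of 1 is exactly what makes the loop a scan rather than something that can be vectorized directly.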

dorit

> Jiahua
>
>
> 2007/2/12, Dorit Nuzman <[EMAIL PROTECTED]>:
> > > Thanks! In fact, I should ask how to deal with idiom (such as
> > > reduction, induction) recognition for virtual defs/uses.
> > >
> >
> > Just curious - what is this for? (are you interested in this in the context
> > of vectorization? is there a specific example you have in mind?)
> >
> > dorit
> >
> > > Jiahua
> > >
> > >
> > > 2007/2/12, Daniel Berlin <[EMAIL PROTECTED]>:
> > > > On 2/12/07, Jiahua He <[EMAIL PROTECTED]> wrote:
> > > > > Hi,
> > > > >
> > > > > I am reading the code of autovect branch and curious about how to deal
> > > > > with the dependencies of virtual defs/uses. In the function
> > > > > vect_analyze_scalar_cycles( ), I found the statement "Skip virtual
> > > > > phi's. The data dependences that are associated with virtual defs/uses
> > > > > ( i.e., memory accesses) are analyzed elsewhere." But where is the
> > > > > code? I tried to search  for "vect_induction_def" and
> > > > > "vect_reduction_def" and found that they are not used to assign
> > > > > elsewhere. Is the analysis not implemented yet? Thanks in advance!
> > > >
> > > > They show up as data references because of tree-data-reference.c marking them.
> > > > At least, that's how other linear loop transforms handle it.
> > > > Not sure about how vectorizer deals with it specifically
> > > >
> > > > >
> > > > > Jiahua
> > > > >
> > > >
> >
> >



Re: [Autovect]dependencies of virtual defs/uses

2007-02-15 Thread Jiahua He

2007/2/15, Dorit Nuzman <[EMAIL PROTECTED]>:

> "Jiahua He" <[EMAIL PROTECTED]> wrote on 12/02/2007 22:54:08:
>
> > Oh, I see. For reduction and induction, you don't need to deal with
> > the condition with vdef. I am considering how to implement an idiom
> > with vdef, like SCAN (prefix sum). And by the way, do you support
> > idioms with vuses?
>
> You mean detecting this pattern?:
>
> for i
>   a[i] += a[i-1];


a[i] = a[i-1] + b[i]


> I don't know if analyzing vdefs/vuses would help you much to detect this
> pattern - maybe you're better off computing the dependence-distance (i.e.
> use compute_data_dependences_for_loop, and look at DDR_DIST_VECTS).


I was thinking along the same lines.
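
A small sketch of why the a[i] = a[i-1] + b[i] idiom is worth recognizing: the same result can be computed by step doubling, where each pass has no dependence between its own iterations. This is an assumed illustration, not what the autovect branch generates; it uses int on purpose, since with floats the reassociation would change rounding.

```c
#include <assert.h>
#include <stddef.h>

/* The recurrence as written: a[i] = a[i-1] + b[i], with a[0] given. */
static void scan_scalar(int *a, const int *b, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
}

/* Same result by step doubling. Within one pass, i runs downward, so
   every a[i] reads the value a[i - d] had before the pass started. */
static void scan_step_doubling(int *a, const int *b, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = b[i];
    for (size_t d = 1; d < n; d *= 2)
        for (size_t i = n - 1; i >= d; i--)
            a[i] += a[i - d];
}

/* Hypothetical self-check: both formulations agree. */
static void check_scan_forms(void)
{
    int b[4] = {0, 1, 2, 3};
    int a1[4] = {10, 0, 0, 0}, a2[4] = {10, 0, 0, 0};
    scan_scalar(a1, b, 4);
    scan_step_doubling(a2, b, 4);
    for (size_t k = 0; k < 4; k++)
        assert(a1[k] == a2[k]);
}
```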

Jiahua







new port to older gcc: Toshiba Media Processor (MeP)

2007-02-15 Thread DJ Delorie

On behalf of Red Hat I would like to publish patches to add support
for the Toshiba Media Processor (MeP) to GCC 3.4.

We don't expect this port to be accepted into the gcc source tree
as-is, as the 3.4 branch is closed to new ports, and this port needs
some core gcc changes. We don't yet have a port to the gcc 4.x family.

I have posted details, patches, and new files here:

http://people.redhat.com/dj/mep/

The target is mep-elf.

Thanks,
DJ


what is difference between gcc-ada and GNAT????

2007-02-15 Thread sameer sinha

Hi,
Can anyone tell me the difference between gcc-ada and the other Ada 95
compilers such as GNAT GPL and GNAT Pro? Also, what is the procedure for
building only the Ada language from the gcc-4.1 source code?


Makefile.def and fixincludes/Makefile.in inconsistency?

2007-02-15 Thread Brooks Moses

Why is it that Makefile.def includes:


// "missing" indicates that that module doesn't supply
// that recursive target in its Makefile.

[...]

host_modules= { module= fixincludes;
                missing= info;
                missing= dvi;
                missing= pdf;
                missing= TAGS;
                missing= install-info;
                missing= installcheck; };


when fixincludes/Makefile.in includes:


dvi :
pdf :
info :
html :
install-html :
installcheck :


Am I correct in guessing that the "missing" lines in Makefile.def are 
not currently needed?  Or are they merely present in the GCC fixincludes 
but missing in the fixincludes directories in some other trees that 
share the top-level build files?


- Brooks