RE: Optimize flag breaks code on many versions of gcc (not all)

2006-06-19 Thread Dave Korn
On 19 June 2006 00:04, Paolo Carlini wrote:

> Zdenek Dvorak wrote:
> 
>> ... I suspect there is something wrong with your
>> code (possibly invoking some undefined behavior, using uninitialized
>> variable, sensitivity to rounding errors, or something like that).
>> 
>> 
> A data point apparently in favor of this suspect is that the "problem"
> goes away if double is replaced everywhere with long double...
> 
> Paolo.

  Is this another case of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
then?

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Coroutines

2006-06-19 Thread Ross Ridge
Ross Ridge wrote:
>Hmm?  I don't see how the "Lua-style" coroutines you're looking for are any
>more lightweight than what Maurizio Vitale is looking for.  They're actually
>more heavyweight because you need to implement some method of returning
>values to the "coroutine" being yielded to.

Dustin Laurence wrote:
>I guess that depends on whether the userspace thread package in question
>provides for a return value as pthreads does.

Maurizio Vitale clearly wasn't looking for pthreads.

> In any case, coroutines don't need a scheduler, even a cooperative one.

He also made it clear he wanted to schedule his threads himself, just like
you want to do.  In fact, what he seems to be trying to implement are
true symmetric coroutines.

Ross Ridge
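
For readers unfamiliar with the term, symmetric coroutines transfer control
directly to one another, with no scheduler in between.  A minimal sketch,
assuming a POSIX system that still provides <ucontext.h> (glibc on Linux
does); the names ping, pong and run_ping_pong are made up for illustration
and are not taken from anyone's code in this thread:

```c
#include <string.h>
#include <ucontext.h>

static ucontext_t main_ctx, ping_ctx, pong_ctx;
static char ping_stack[64 * 1024];
static char pong_stack[64 * 1024];
static char trace[16];
static size_t trace_len;

/* Append one character to the trace, always leaving room for the NUL. */
static void record(char c)
{
    if (trace_len + 1 < sizeof trace)
        trace[trace_len++] = c;
}

/* Each coroutine names the peer it transfers to; nothing schedules them. */
static void ping(void)
{
    int i;
    for (i = 0; i < 3; i++) {
        record('A');
        swapcontext(&ping_ctx, &pong_ctx);   /* transfer directly to pong */
    }
    /* Returning resumes uc_link, i.e. main_ctx. */
}

static void pong(void)
{
    for (;;) {
        record('B');
        swapcontext(&pong_ctx, &ping_ctx);   /* transfer directly to ping */
    }
}

/* Hypothetical driver: runs the two coroutines and returns the trace. */
const char *run_ping_pong(void)
{
    memset(trace, 0, sizeof trace);
    trace_len = 0;

    getcontext(&ping_ctx);
    ping_ctx.uc_stack.ss_sp = ping_stack;
    ping_ctx.uc_stack.ss_size = sizeof ping_stack;
    ping_ctx.uc_link = &main_ctx;
    makecontext(&ping_ctx, ping, 0);

    getcontext(&pong_ctx);
    pong_ctx.uc_stack.ss_sp = pong_stack;
    pong_ctx.uc_stack.ss_size = sizeof pong_stack;
    pong_ctx.uc_link = &main_ctx;
    makecontext(&pong_ctx, pong, 0);

    swapcontext(&main_ctx, &ping_ctx);       /* start ping; back when it ends */
    return trace;
}
```

Calling run_ping_pong() should yield the trace "ABABAB": each side resumes
exactly where it last gave up control.  Passing values between the two would
need an extra shared variable, which is the additional machinery mentioned
above.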



gcc port based on MIPS

2006-06-19 Thread kernel coder

Hi,
 I'm trying to port gcc to a processor which is very similar to
MIPS. Today I tried to compile gcc-4.1.0 for this processor by
changing the configuration files.
First I changed the config.sub file in the base directory and just added
the name of the processor, ABC.
Then I changed the configure.ac file in the gcc/ subdirectory and added
the following lines.

 ABC*)
   conftest_s='
   .section .tdata,"awT",@progbits
x:
   .word 2
   .text
   addiu $4, $28, %tlsgd(x)
   addiu $4, $28, %tlsldm(x)
   lui $4, %dtprel_hi(x)
   addiu $4, $4, %dtprel_lo(x)
   lw $4, %gottprel(x)($28)
   lui $4, %tprel_hi(x)
   addiu $4, $4, %tprel_lo(x)'
   tls_first_major=2
   tls_first_minor=16
   tls_as_opt='-32 --fatal-warnings'
   ;;
As you can see, it was just a copy and paste of the mips*-*-*) option.

Then I made the following changes to the config.gcc file in the gcc/ subdirectory:

ABC*)
   cpu_type=ABC
   ;;
- - - - - - - - - - -
- - - -- - - - - - --
ABC*)
   tm_file="dbxelf.h elfos.h svr4.h linux.h ${tm_file} ABC/linux.h"
   ;;


Then I made a directory gcc-4.1.0/gcc/config/ABC/. I copied all files
of gcc-4.1.0/gcc/config/mips to the ABC directory and renamed the
following files.

mips.h          --  ABC.h
mips.md         --  ABC.md
mips.c          --  ABC.c
mips-modes.def  --  ABC-modes.def
mips-protos.h   --  ABC-protos.h
mips.opt        --  ABC.opt

But when I issued the make all-gcc command, the following error occurred:

../../gcc-4.1.0/gcc/config/ABC/ABC.md: unknown mode `V2SF'

Would you please explain why this error is being generated? A bit of
explanation of the 'V2SF' mode would also be helpful.

Then I removed the 'V2SF' mode from the patterns in the ABC.md file, but now
the following error was generated.

../../gcc-4.1.0/gcc/config/ABC/ABC.md:228: unknown value
`' for `mode' attribute
../../gcc-4.1.0/gcc/config/ABC/ABC.md:228: unknown value
`' for `mode' attribute
../../gcc-4.1.0/gcc/config/ABC/ABC.md:228: unknown value `'
for `mode' attribute
../../gcc-4.1.0/gcc/config/ABC/ABC.md:228: unknown value `'
for `mode' attribute


Would you please tell me why this error is being generated?

thanks,
shahzad


Re: Usage of -ftrapv

2006-06-19 Thread Ben Elliston
> I'd like to catch automatically over/underflows on floating point
> and integer arithmetic. I thought -ftrapv would do the trick but I
> don't really understand how it works.

By the way, -ftrapv only works on integral types.

Ben


Re: Optimize flag breaks code on many versions of gcc (not all)

2006-06-19 Thread Seongbae Park

On 6/19/06, Dave Korn <[EMAIL PROTECTED]> wrote:

> On 19 June 2006 00:04, Paolo Carlini wrote:
>
> > Zdenek Dvorak wrote:
> >
> >> ... I suspect there is something wrong with your
> >> code (possibly invoking some undefined behavior, using uninitialized
> >> variable, sensitivity to rounding errors, or something like that).
> >>
> >>
> > A data point apparently in favor of this suspect is that the "problem"
> > goes away if double is replaced everywhere with long double...
> >
> > Paolo.
>
>   Is this another case of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
> then?
>
> cheers,
>   DaveK


It is the same case. Fundamentally, this is not fixable by the compiler alone
without a significant performance penalty.
There are very few implementations [1] that are completely IEEE 754 conformant,
and making them so is often prohibitively expensive,
hence it's not done, or at least not by default.
So whenever you're writing floating-point code,
you need to be aware of the caveats of the particular implementation.
Besides x86's well-known extended precision issue,
other processors have things like flush-denormal-inputs-to-zero,
or a multiply-add instruction that is not equivalent to a separate multiply and add.

I think it's not fair to expect gcc to somehow "fix" this whole mess alone.
Of course, whenever there's a reasonable workaround for a particular issue,
I'm sure gcc developers will try to accommodate it,
but IMHO this one (bug 323) isn't such a case.

[1] by implementation, I mean the combination of:
microprocessor, OS, compiler and runtime libraries.
--
#pragma ident "Seongbae Park, compiler, http://seongbae.blogspot.com";


Re: Usage of -ftrapv

2006-06-19 Thread Eric Botcazou
> By the way, -ftrapv only works on integral types.

When it works.  Last time I took a look, it was easily wiped out by
optimization.

-- 
Eric Botcazou
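
Since -ftrapv covers only integral types and may not survive optimization,
an explicit check is one alternative.  The following is a minimal sketch in
portable C, not what -ftrapv itself generates; the helper names
add_would_overflow and checked_add are hypothetical:

```c
#include <limits.h>
#include <stdlib.h>

/* Return 1 if a + b would overflow int, without performing the addition. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow when b > 0 */
    return a < INT_MIN - b;       /* INT_MIN - b cannot overflow when b <= 0 */
}

/* Add two ints, aborting on overflow (similar in spirit to a trap). */
int checked_add(int a, int b)
{
    if (add_would_overflow(a, b))
        abort();
    return a + b;
}
```

Because the test is ordinary defined-behavior arithmetic, an optimizer has no
license to remove it, unlike a check that relies on the overflow having
already happened.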


Re: gcc-4.1.0 cross-compile for MIPS

2006-06-19 Thread Kai Ruottu

David Daney wrote:

> kernel coder wrote:
>
>> hi,
>>    I'm trying to cross compile gcc-4.1.0 for mipsel
>> platform.Following is the sequence of commands which i'm using
>>
>> ../gcc-4.1.0/configure --target=mipsel --without-headres
>> --prefix=/home/shahzad/install/ --with-newlib --enable-languages=c
>
> Perhaps you should try to disable libssp.  Try adding (untested)
> --disable-libmudflap  --disable-libssp

I tried the 'mipsel-elf' target (to which the bare 'mipsel' leads) with
gcc-4.1.1, using '--with-newlib --enable-languages=c,c++ --disable-shared'.
The last option was (maybe) required because earlier builds with other '-elf'
targets stopped when trying to check for the existence of 'libgcc_s.so'.
But no '--without-headers' was used; instead I copied the generic newlib
headers into the $tooldir ($prefix/$target).

After that everything succeeded: 'gcc' and 'libiberty', 'libstdc++-v3'
and 'libssp' for the target.  So disabling libssp is unnecessary.  There was
no libmudflap build...

So, apart from that '--disable-shared', the build worked just as it did
with the earlier GCC versions!  And 'kernel coder' using:

../gcc-4.1.0/configure --target=mipsel --prefix=/home/shahzad/install  \
--with-newlib --enable-languages=c,c++

should have worked after copying those newlib headers so that they are
ready for the fixinc, limits.h check etc. that the GCC build tries to do
with them.




Re: Optimize flag breaks code on many versions of gcc (not all)

2006-06-19 Thread Richard Guenther

On 6/19/06, Seongbae Park <[EMAIL PROTECTED]> wrote:

> On 6/19/06, Dave Korn <[EMAIL PROTECTED]> wrote:
> > On 19 June 2006 00:04, Paolo Carlini wrote:
> >
> > > Zdenek Dvorak wrote:
> > >
> > >> ... I suspect there is something wrong with your
> > >> code (possibly invoking some undefined behavior, using uninitialized
> > >> variable, sensitivity to rounding errors, or something like that).
> > >>
> > >>
> > > A data point apparently in favor of this suspect is that the "problem"
> > > goes away if double is replaced everywhere with long double...
> > >
> > > Paolo.
> >
> >   Is this another case of http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
> > then?
> >
> > cheers,
> >   DaveK
>
> It is the same case. Fundamentally, this is not fixable by the compiler alone
> without a significant performance penalty.
> There are very few implementations [1] that are completely IEEE 754 conformant,
> and making them so is often prohibitively expensive,
> hence it's not done, or at least not by default.
> So whenever you're writing floating-point code,
> you need to be aware of the caveats of the particular implementation.
> Besides x86's well-known extended precision issue,
> other processors have things like flush-denormal-inputs-to-zero,
> or a multiply-add instruction that is not equivalent to a separate multiply and add.
>
> I think it's not fair to expect gcc to somehow "fix" this whole mess alone.
> Of course, whenever there's a reasonable workaround for a particular issue,
> I'm sure gcc developers will try to accommodate it,
> but IMHO this one (bug 323) isn't such a case.
>
> [1] by implementation, I mean the combination of:
> microprocessor, OS, compiler and runtime libraries.


Using -mfpmath=sse -msse2 is a workaround if you have a processor that supports
SSE2 instructions.  As opposed to -ffloat-store, it works reliably and
with no performance impact.

Richard.
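
To make the extended precision issue concrete, here is a small sketch of the
kind of comparison that can misbehave on x87 and of a local way to force
rounding.  It is an illustration only, not the test case from bug 323; the
function names are invented, and using volatile to force a store to memory is
an assumption about a per-variable workaround, not a recommendation from this
thread:

```c
/* On x87 the recomputed quotient may still sit in an 80-bit register while
   the stored copy was rounded to a 64-bit double, so this comparison may
   yield 0 there.  With SSE2 arithmetic both sides are plain doubles. */
int quotient_matches_stored_copy(double x, double y)
{
    volatile double stored = x / y;   /* the store rounds to double */
    return stored == x / y;
}

/* Both quotients are forced through memory before the comparison, so both
   are rounded to double the same way on either kind of FPU. */
int quotient_matches_after_rounding(double x, double y)
{
    volatile double first = x / y;
    volatile double second = x / y;
    return first == second;
}
```

Whether the first function actually returns 0 depends on the target, the
compiler version and the optimization level, which is exactly why the
original symptom shows up on some gcc versions and not others.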


Re: addressability checks in the gimplifier

2006-06-19 Thread Olivier Hainque
Hello,

As a followup to my previous message enquiring about the intent
underlying various addressability checks in the gimplifier, attached
is an example patch which addresses the issues we're observing.

For instance, it fixes an ICE in expand_expr_addr_expr_1 on the
testcase below:

   procedure P5 is

      type Long_Message is record
         Data : String (1 .. 16);
      end record;

      type Short_Message is record
         B : Boolean;
         Data : String (1 .. 4);
      end record;
      pragma Pack (Short_Message);

      procedure Process (LM : Long_Message; Size : Natural) is
         SM : Short_Message;
      begin
         SM.Data (1 .. Size) := LM.Data (1 .. Size);
      end;

   begin
      null;
   end;

which is the one producing the tree excerpt quoted in the previous
message (for SM.Data (1 .. Size) in Process).

The patch bootstraps fine with languages="all,ada" on i686-pc-linux-gnu,
and introduces no new regression.

Regarding gimple predicates typically not recursing down trees (in
accordance with the grammar), as I said

<< I'm pretty sure I'm missing implicit assumptions and/or bits of design
   intents in various places, so would appreciate input on the case and
   puzzles described above.
>>
 
So this patch is posted here primarily for discussion purposes.  I'd
welcome suggestions on better ways to address this, if the approach is
indeed considered inappropriate.

Thanks in advance for your help,

With Kind Regards,

Olivier








2006-06-19  Olivier Hainque  <[EMAIL PROTECTED]>

* tree-gimple.c (is_gimple_lvalue, is_gimple_addressable): Account for
possibly nested bitfield component refs, not addressable while still
valid lvalues.

*** tree-gimple.c.ori   Tue May 30 15:55:07 2006
--- tree-gimple.c   Mon Jun 19 16:50:38 2006
*** rhs_predicate_for (tree lhs)
*** 139,149 
  bool
  is_gimple_lvalue (tree t)
  {
!   return (is_gimple_addressable (t)
! || TREE_CODE (t) == WITH_SIZE_EXPR
! /* These are complex lvalues, but don't have addresses, so they
!go here.  */
! || TREE_CODE (t) == BIT_FIELD_REF);
  }
  
  /*  Return true if T is a GIMPLE condition.  */
--- 139,148 
  bool
  is_gimple_lvalue (tree t)
  {
!   return (TREE_CODE (t) == WITH_SIZE_EXPR
! || INDIRECT_REF_P (t)
! || handled_component_p (t)
! || is_gimple_variable (t));
  }
  
  /*  Return true if T is a GIMPLE condition.  */
*** is_gimple_condexpr (tree t)
*** 159,166 
  bool
  is_gimple_addressable (tree t)
  {
!   return (is_gimple_id (t) || handled_component_p (t)
! || INDIRECT_REF_P (t));
  }
  
  /* Return true if T is function invariant.  Or rather a restricted
--- 158,181 
  bool
  is_gimple_addressable (tree t)
  {
!   if (is_gimple_id (t) || INDIRECT_REF_P (t))
! return true;
! 
!   switch (TREE_CODE (t))
! {
! case COMPONENT_REF:
!   return
!   !DECL_BIT_FIELD (TREE_OPERAND (t, 1))
!   && is_gimple_addressable (TREE_OPERAND (t, 0));
! 
! case VIEW_CONVERT_EXPR:
! case ARRAY_REF:   case ARRAY_RANGE_REF:
! case REALPART_EXPR:   case IMAGPART_EXPR:
!   return is_gimple_addressable (TREE_OPERAND (t, 0));
! 
! default:
!   return false;
! }
  }
  
  /* Return true if T is function invariant.  Or rather a restricted
*** gimplify.c.ori  Tue May 30 15:54:59 2006
--- gimplify.c  Mon Jun 19 16:55:00 2006
*** gimplify_modify_expr (tree *expr_p, tree
*** 3422,3430 
  return ret;
  
/* If we've got a variable sized assignment between two lvalues (i.e. does
!  not involve a call), then we can make things a bit more straightforward
!  by converting the assignment to memcpy or memset.  */
!   if (TREE_CODE (*from_p) == WITH_SIZE_EXPR)
  {
tree from = TREE_OPERAND (*from_p, 0);
tree size = TREE_OPERAND (*from_p, 1);
--- 3422,3431 
  return ret;
  
/* If we've got a variable sized assignment between two lvalues (i.e. does
!  not involve a call), we can make things a bit more straightforward by
!  converting the assignment to memcpy or memset as soon as both operands
!  can have their address taken.  */
!   if (TREE_CODE (*from_p) == WITH_SIZE_EXPR && is_gimple_addressable (*to_p))
  {
tree from = TREE_OPERAND (*from_p, 0);
tree size = TREE_OPERAND (*from_p, 1);


Re: gcc port based on MIPS

2006-06-19 Thread Ian Lance Taylor
"kernel coder" <[EMAIL PROTECTED]> writes:

> But when I issued the make all-gcc command, the following error occurred:
> 
> ../../gcc-4.1.0/gcc/config/ABC/ABC.md: unknown mode `V2SF'
> 
> Would you please explain why this error is being generated? A bit of
> explanation of the 'V2SF' mode would also be helpful.

V2SF should normally be defined by ABC/ABC-modes.def.  It is normally
found by these lines in config.gcc when you run configure:

if test -f ${srcdir}/config/${cpu_type}/${cpu_type}-modes.def
then
extra_modes=${cpu_type}/${cpu_type}-modes.def
fi

V2SF will be created by the line
VECTOR_MODES (FLOAT, 8);
in ABC-modes.def (as copied from mips-modes.def).

> Then I removed the 'V2SF' mode from the patterns in the ABC.md file, but now
> the following error was generated.
> 
> ../../gcc-4.1.0/gcc/config/ABC/ABC.md:228: unknown value
> `' for `mode' attribute
> ../../gcc-4.1.0/gcc/config/ABC/ABC.md:228: unknown value
> `' for `mode' attribute
> ../../gcc-4.1.0/gcc/config/ABC/ABC.md:228: unknown value `'
> for `mode' attribute
> ../../gcc-4.1.0/gcc/config/ABC/ABC.md:228: unknown value `'
> for `mode' attribute
> 
> 
> Would you please tell me why this error is being generated?

Hard to say without knowing what you changed.  Did you simply delete
the ANYF macro, or forget to remove the V2SF case from it?

MD file macros are documented here:
http://gcc.gnu.org/onlinedocs/gccint/Macros.html

Ian
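
As background on the name itself: V2SF denotes a vector of two SFmode
(single-precision, 4-byte) elements, so VECTOR_MODES (FLOAT, 8) asks for every
float vector mode that is 8 bytes wide.  The naming rule can be sketched in
plain C.  This is only an illustration and not code from GCC's generator
programs: the helper vector_mode_name and its parameters are hypothetical, and
the rule that a vector needs at least two elements is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: build the name of a vector mode that is
   vector_bytes wide and made of elem_mode elements of elem_bytes each.
   Returns 1 and fills buf on success, 0 if no such vector mode makes sense. */
int vector_mode_name(const char *elem_mode, unsigned elem_bytes,
                     unsigned vector_bytes, char *buf, size_t buflen)
{
    unsigned count;

    if (elem_bytes == 0 || vector_bytes % elem_bytes != 0)
        return 0;
    count = vector_bytes / elem_bytes;
    if (count < 2)                 /* assumed: one element is not a vector */
        return 0;
    snprintf(buf, buflen, "V%u%s", count, elem_mode);
    return 1;
}
```

With an 8-byte width, the 4-byte SF element gives "V2SF", while the 8-byte DF
element gives nothing, since a single element would not form a vector.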


Re: gcc port based on MIPS

2006-06-19 Thread kernel coder

> V2SF will be created by the line
> VECTOR_MODES (FLOAT, 8);


Yes, you are absolutely right. When I changed the name of the file
ABC-modes.def to 1ABC-modes.def, I got the following error:

make[1]: *** No rule to make target
`../../gcc-4.1.0/gcc/config/ABC/ABC-modes.def', needed by
`build/genmodes.o'.  Stop.
This shows that ABC-modes.def is being used, and it has the required macro

VECTOR_MODES (FLOAT, 8);

Then why is the following error still being generated?


> ../../gcc-4.1.0/gcc/config/ABC/ABC.md: unknown mode `V2SF'


As far as my changes to the ABC.md file are concerned, they are as follows:

(define_mode_macro ANYF [(SF "TARGET_HARD_FLOAT")
(DF "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT")])
;;   (V2SF "TARGET_PAIRED_SINGLE_FLOAT")])

- - - - - -- - - - - - - - - -
-- - - - - - - - - -- - - - -
(define_mode_attr divide_condition
 [DF (SF "!TARGET_FIX_SB1 || flag_unsafe_math_optimizations")])
;;   (V2SF "TARGET_SB1 && (!TARGET_FIX_SB1 ||
flag_unsafe_math_optimizations)")])


As you can see, I just omitted the entries for V2SF.





Re: gcc port based on MIPS

2006-06-19 Thread Ian Lance Taylor
"kernel coder" <[EMAIL PROTECTED]> writes:

> > V2SF will be created by the line
> > VECTOR_MODES (FLOAT, 8);
> 
> Yes, you are absolutely right. When I changed the name of the file
> ABC-modes.def to 1ABC-modes.def, I got the following error:
> 
> make[1]: *** No rule to make target
> `../../gcc-4.1.0/gcc/config/ABC/ABC-modes.def', needed by
> `build/genmodes.o'.  Stop.
> This shows that ABC-modes.def is being used, and it has the required macro
> 
> VECTOR_MODES (FLOAT, 8);
> 
> Then why is the following error still being generated?
> 
> > > ../../gcc-4.1.0/gcc/config/ABC/ABC.md: unknown mode `V2SF'

I don't know.  You'll have to debug it.


> As far as my changes to the ABC.md file are concerned, they are as follows:
> 
> (define_mode_macro ANYF [(SF "TARGET_HARD_FLOAT")
>  (DF "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT")])
> ;;   (V2SF "TARGET_PAIRED_SINGLE_FLOAT")])
> 
> - - - - - -- - - - - - - - - -
>  -- - - - - - - - - -- - - - -
> (define_mode_attr divide_condition
>   [DF (SF "!TARGET_FIX_SB1 || flag_unsafe_math_optimizations")])
> ;;   (V2SF "TARGET_SB1 && (!TARGET_FIX_SB1 ||
> flag_unsafe_math_optimizations)")])
> 
> 
> As you can see, I just omitted the entries for V2SF.

I hope that isn't really what you did, since that would comment out
the "])" closing brackets in each case.

Ian


Re: gcc port based on MIPS

2006-06-19 Thread Andrew Pinski
> 
> "kernel coder" <[EMAIL PROTECTED]> writes:
> 
> > (define_mode_attr divide_condition
> >   [DF (SF "!TARGET_FIX_SB1 || flag_unsafe_math_optimizations")])
> > ;;   (V2SF "TARGET_SB1 && (!TARGET_FIX_SB1 ||
> > flag_unsafe_math_optimizations)")])
> > 
> > 
> > As you can see, I just omitted the entries for V2SF.
> 
> I hope that isn't really what you did, since that would comment out
> the "])" closing brackets in each case.

Except the above line has ]) also :).

-- Pinski



Question regarding the "Clean up how cse works" project

2006-06-19 Thread Steven Bosscher
Hi,

I have a question about the "Clean up how cse works" project on
http://gcc.gnu.org/projects/optimize.html

Let me first explain what I am trying to do.  I have seen Vlad's patch
to make CSE path following remember its state at the end of a path, so
that when a new path is followed, a re-scan is unnecessary for all the
insns up to the point where a state is stored.  I believe that in the
long run we want CSE to work on extended basic blocks, i.e. more like
a tree walk, without path following as we know it.  And IMHO
that special hash table implementation in cse.c shouldn't be necessary
so I'd like to replace it with libiberty's hashtab (which seems to do
Just Fine for e.g. cselib).

So far I've mostly used Vlad's code for learning but it seems to me that
his code is not easily adapted to my "do CSE on extended basic blocks"
idea, and I have also found out that with his patch the order of the
elements in the hash table is not restored properly.  This even causes
some test suite failures for me with gcc 3.2 (the last version that the
patch will apply to without too much effort).
I don't really see an easy way to efficiently implement a scoped hash
table the cse.c way, such that you can invalidate and roll back while
maintaining the order of the linked list of equal-valued exprs.  And the
cse.c hash table is also simply too slow (fixed number of buckets, so
potentially quadratic if you record lots of expressions).
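
To illustrate what "scoped" means here, the following is a minimal sketch of
a chained hash table whose insertions can be rolled back to a saved mark.  It
is neither the cse.c table nor libiberty's hashtab API; every name in it
(scoped_table, st_insert, st_mark, st_rollback) is hypothetical, it uses a
fixed bucket count only to stay short, and it covers only rollback, not
invalidation:

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 31

struct entry {
    const char *key;
    int value;
    struct entry *next_in_bucket;   /* older entries in the same bucket */
    struct entry *prev_inserted;    /* undo chain: previously inserted entry */
};

struct scoped_table {
    struct entry *buckets[NBUCKETS];
    struct entry *last_inserted;
};

static unsigned hash_key(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31 + (unsigned char) *s++;
    return h % NBUCKETS;
}

/* New entries go to the head of their bucket, shadowing older ones. */
void st_insert(struct scoped_table *t, const char *key, int value)
{
    unsigned h = hash_key(key);
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        abort();
    e->key = key;
    e->value = value;
    e->next_in_bucket = t->buckets[h];
    e->prev_inserted = t->last_inserted;
    t->buckets[h] = e;
    t->last_inserted = e;
}

const int *st_lookup(const struct scoped_table *t, const char *key)
{
    const struct entry *e;
    for (e = t->buckets[hash_key(key)]; e != NULL; e = e->next_in_bucket)
        if (strcmp(e->key, key) == 0)
            return &e->value;
    return NULL;
}

/* A mark is simply the most recent insertion at the time of the call. */
struct entry *st_mark(const struct scoped_table *t)
{
    return t->last_inserted;
}

/* Undo insertions newest-first; the newest entry overall is always the
   head of its bucket, so unlinking restores the earlier chain order. */
void st_rollback(struct scoped_table *t, struct entry *mark)
{
    while (t->last_inserted != mark) {
        struct entry *e = t->last_inserted;
        t->buckets[hash_key(e->key)] = e->next_in_bucket;
        t->last_inserted = e->prev_inserted;
        free(e);
    }
}
```

Entering an extended basic block would correspond to taking a mark and
leaving it to a rollback.  The hard part described above, invalidating
entries in the middle of a chain while keeping the order of equal-valued
expressions, is exactly what this sketch leaves out.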

The first problem I immediately ran into while trying to figure out how
to make cse.c use libiberty's hashtab is that we seem to use different
"equivalent" checks depending on how strict we want to be.  For some
lookups, apparently if we only want to find the first_same_value element
in the hash table, we lookup without validating in exp_equiv_p.  For other
lookups we call exp_equiv_p with validate set to true.  The most obvious
example is lookup_as_function.

In the projects page "Clean up how cse works" there is a scheme described
to make cse.c work without first_same_value and next_same_value. Apparently,
at some point someone decided that we should not _have_ this whole thing
with  multiple expressions describing the same value.  I couldn't agree
more.

But I am _still_ not sufficiently familiar with cse.c to fully understand
what it can do (other than sending email).  E.g. I have been trying to
figure out why we record multiple expressions with the same value in the
hash table.  I would like to know what the benefit is, and whether we would
lose optimizations if I make it go away.

It turns out that we sometimes record widely different expressions that get
the same value (due to canonicalization and so on).  Usually the different
expression with the same value comes from a REG_EQUAL note.  When you look
from gdb what we are recording, you get things like:

2: debug_rtx (elt->exp) = (reg:SI 91)
void
1: dump_class (classp) = Equivalence chain for (reg:SI 82):
(reg:SI 82)
(plus:SI (reg:SI 81)
(reg:SI 59 [ D.6710 ]))
(mult:SI (reg:SI 59 [ D.6710 ])
(const_int 3 [0x3]))

2: debug_rtx (elt->exp) = (reg:SI 92)
void
1: dump_class (classp) = Equivalence chain for (reg:SI 83):
(reg:SI 83)
(ashift:SI (reg:SI 82)
(const_int 2 [0x2]))
(mult:SI (reg:SI 59 [ D.6710 ])
(const_int 12 [0xc]))

2: debug_rtx (elt->exp) = (reg:QI 184 [ D.5910 ])
void
1: dump_class (classp) = Equivalence chain for (reg:QI 385):
(reg:QI 385)
(eq:QI (reg/v:SI 136 [ spec_long ])
(const_int 0 [0x0]))
(eq:QI (reg:CCZ 17 flags)
(const_int 0 [0x0]))

In my collection of cc1-i files (half a million lines of preprocessed
code), at -O2, we record multiple expressions with the same value in
4460 cases.  My guess is that in most of these cases we record a
SET_SRC and a REG_EQUAL note.  I of course still need to make sure
that assumption is correct ;-)
In all cases where the value leader is a constant, we can apparently
fold_rtx the expression to that constant so those are not interesting
expressions to count as dups.  That still leaves more than 3000 cases.

I suspect we may benefit from recording these different expressions
in e.g. find_best_addr.  For some machines, the ashift may be better
and for others the mult is cheaper.  So to know that these expressions
have the same value is very important.

That brings me back to the CSE project on the projects page.  Assume
we'd be looking at the first case again, which comes from the following
insns:

(insn 54 53 55 6 (parallel [
(set (reg:SI 82)
(plus:SI (reg:SI 81)
(reg:SI 59 [ D.6710 ])))
(clobber (reg:CC 17 flags))
]) 208 {*addsi_1} (nil)
(expr_list:REG_EQUAL (mult:SI (reg:SI 59 [ D.6710 ])
(const_int 3 [0x3]))
(nil)))

(insn 77 76 78 8 (set (reg:SI 91)
(reg:SI 82)) 40 {*movsi_1} (nil)
(expr_list:REG_EQUAL (mult:SI (reg:SI 59 [ D.6710 ])
(const_int 3 [0x3]))
(nil)))

With the current cse, we simply brute-force record the REG_EQUAL note

Re: Output of contrib/compare_tests

2006-06-19 Thread Mike Stump

On Jun 18, 2006, at 2:35 PM, Mike Stein wrote:

> Is someone else interested in the daily output


Or while (1) do, if you have the bandwidth... :-)

But, please, just email the results to yourself and try that for a  
week.  :-)


This will help shake out the trivial things.  You'll also need to add  
in some other information, like revision numbers of things under test  
and platform, so that people can know what it is you're reporting.


Also, be sure to care for and feed the system...


Re: MIPS RDHWR instruction reordering

2006-06-19 Thread Daniel Jacobowitz
On Fri, Jun 16, 2006 at 02:12:29PM -0700, Ian Lance Taylor wrote:
> The computation of the address of x was moved outside the
> conditional--that is, both the rdhwr and the addu moved.  You'll have
> to figure out why.  gcc shouldn't move instructions outside of a
> conditional unless they are cheap and don't trap.  This instruction
> doesn't trap, but it's not cheap.

What metric gets used for this - rtx_cost?

-- 
Daniel Jacobowitz
CodeSourcery


Re: MIPS RDHWR instruction reordering

2006-06-19 Thread Ian Lance Taylor
Daniel Jacobowitz <[EMAIL PROTECTED]> writes:

> On Fri, Jun 16, 2006 at 02:12:29PM -0700, Ian Lance Taylor wrote:
> > The computation of the address of x was moved outside the
> > conditional--that is, both the rdhwr and the addu moved.  You'll have
> > to figure out why.  gcc shouldn't move instructions outside of a
> > conditional unless they are cheap and don't trap.  This instruction
> > doesn't trap, but it's not cheap.
> 
> What metric gets used for this - rtx_cost?

I'm not sure, because I'm not sure what is hoisting the instruction.

I tried recreating this, but I couldn't.  I get this:

foo:
        .frame  $sp,0,$31   # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x,0
        .fmask  0x,0
        .set    noreorder
        .cpload $25
        .set    reorder
        .set    noreorder
        .set    nomacro
        beq     $4,$0,$L7
        .set    push
        .set    mips32r2
        rdhwr   $3,$29
        .set    pop
        .set    macro
        .set    reorder

        lw      $2,%gottprel(x)($28)
        addu    $2,$2,$3
        lw      $2,0($2)
        j       $31
$L7:
        .set    noreorder
        .set    nomacro
        j       $31
        move    $2,$0
        .set    macro
        .set    reorder

This of course is not ideal, since it unconditionally executes the
rdhwr instruction.  But it is not the same as what the OP reported.

This case happens because reorg.c ignores the cost of the instruction
in fill_slots_from_thread.  I believe that reorg.c should not move an
expensive instruction which is only conditionally executed into a
delay slot.  That is probably a bug.

We can see a similar case with this:

int foo(int arg, int x)
{
  if (arg)
    return x * x;
  return 0;
}

which yields this:

foo:
        .frame  $sp,0,$31   # vars= 0, regs= 0/0, args= 0, gp= 0
        .mask   0x,0
        .fmask  0x,0
        .set    noreorder
        .cpload $25
        .set    reorder
        .set    noreorder
        .set    nomacro
        beq     $4,$0,$L7
        mult    $5,$5
        .set    macro
        .set    reorder

        mflo    $2
        j       $31
$L7:
        .set    noreorder
        .set    nomacro
        j       $31
        move    $2,$0
        .set    macro
        .set    reorder

which executes the "mult" instruction unconditionally which is
probably not desirable since it will tie up the multiplication
pipeline.

Ian
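
The source for the first listing (the rdhwr case) is not shown in the message.
Judging only from the assembly, which loads a thread-local x when the argument
is non-zero and returns 0 otherwise, it was presumably something along these
lines.  This is a guess for illustration, not the actual test case; __thread
is the GCC extension for thread-local storage:

```c
/* Assumed reconstruction: x is a thread-local variable, so reading it
   needs the thread pointer, which MIPS obtains with rdhwr $3,$29. */
__thread int x;

int foo(int arg)
{
    if (arg)
        return x;     /* only this path needs the thread pointer */
    return 0;
}
```

Only the arg != 0 path needs the thread pointer, which is why placing rdhwr in
the delay slot of the branch makes the other path pay for it as well.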


Re: Optimize flag breaks code on many versions of gcc (not all)

2006-06-19 Thread tbp

On 6/19/06, Richard Guenther <[EMAIL PROTECTED]> wrote:

> Using -mfpmath=sse -msse2 is a workaround if you have a processor that supports
> SSE2 instructions.  As opposed to -ffloat-store, it works reliably and
> with no performance impact.

Such a slab test can be turned into a branchless sequence of SSE
min/max, even for filtering infinities around dir ~= 0; it's much
simpler and more efficient to intersect 4 rays against one box at once
though.
Without intrinsics a NaN-oblivious version would be like:

static float minf(const float a, const float b) { return (a < b) ? a : b; }
static float maxf(const float a, const float b) { return (a > b) ? a : b; }

bool_t intersect_ray_box(const aabb_t &box, const rt::mono::ray_t &ray,
                         float &lmin, float &lmax)
{
    float
        l1  = (box.min.x - ray.pos.x) * ray.inv_dir.x,
        l2  = (box.max.x - ray.pos.x) * ray.inv_dir.x;
    lmin = minf(l1,l2);
    lmax = maxf(l1,l2);

    l1   = (box.min.y - ray.pos.y) * ray.inv_dir.y;
    l2   = (box.max.y - ray.pos.y) * ray.inv_dir.y;
    lmin = maxf(minf(l1,l2), lmin);
    lmax = minf(maxf(l1,l2), lmax);

    l1   = (box.min.z - ray.pos.z) * ray.inv_dir.z;
    l2   = (box.max.z - ray.pos.z) * ray.inv_dir.z;
    lmin = maxf(minf(l1,l2), lmin);
    lmax = minf(maxf(l1,l2), lmax);

    return (lmax >= lmin) & (lmax >= 0.f);
}
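
The snippet above depends on types (aabb_t, ray_t, bool_t) that are not shown.
For anyone who wants to try it, here is a self-contained sketch of the same
slab test in plain C.  The struct layouts and the make_ray helper are
assumptions made for illustration; only the min/max logic follows the code
above.

```c
/* Hypothetical stand-ins for the types used in the original snippet. */
struct vec3 { float x, y, z; };
struct box  { struct vec3 min, max; };
struct ray  { struct vec3 pos, inv_dir; };

static float minf(float a, float b) { return (a < b) ? a : b; }
static float maxf(float a, float b) { return (a > b) ? a : b; }

/* Precompute 1/dir per axis; a zero component becomes an IEEE infinity. */
struct ray make_ray(struct vec3 pos, struct vec3 dir)
{
    struct ray r;
    r.pos = pos;
    r.inv_dir.x = 1.0f / dir.x;
    r.inv_dir.y = 1.0f / dir.y;
    r.inv_dir.z = 1.0f / dir.z;
    return r;
}

/* Returns 1 on a hit and stores the entry/exit distances in lmin/lmax. */
int intersect_ray_box(const struct box *b, const struct ray *r,
                      float *lmin, float *lmax)
{
    float l1, l2;

    l1 = (b->min.x - r->pos.x) * r->inv_dir.x;
    l2 = (b->max.x - r->pos.x) * r->inv_dir.x;
    *lmin = minf(l1, l2);
    *lmax = maxf(l1, l2);

    l1 = (b->min.y - r->pos.y) * r->inv_dir.y;
    l2 = (b->max.y - r->pos.y) * r->inv_dir.y;
    *lmin = maxf(minf(l1, l2), *lmin);
    *lmax = minf(maxf(l1, l2), *lmax);

    l1 = (b->min.z - r->pos.z) * r->inv_dir.z;
    l2 = (b->max.z - r->pos.z) * r->inv_dir.z;
    *lmin = maxf(minf(l1, l2), *lmin);
    *lmax = minf(maxf(l1, l2), *lmax);

    return (*lmax >= *lmin) & (*lmax >= 0.0f);
}
```

For an axis-aligned ray, the axes with a zero direction component produce
infinite slab distances, and the comparison-based minf/maxf let those
infinities fall out of the final interval without any branch on dir.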