Regarding code portability across different gcc/g++ versions

2010-09-29 Thread #SINHA SHARAD#
Hi,
 
I had a big piece of code that ran smoothly when built with GCC 3.2.2. For some reason, 
I had to start using that code on a machine with GCC 4.2.1. Now it fails with 
segmentation faults (invalid free pointer, etc.) and the program aborts. I presume 
this happens because the glibc with GCC 4.2.1 is smarter than the one with GCC 
3.2.2. Hence, what was missed during execution with 3.2.2 is caught with 4.2.1.
 
While it is great to catch as many errors as possible, would it not be 
better to provide execution support for code that ran on earlier versions? 
Maybe what was missed in earlier versions could be flagged as 
"error with the current gcc version" or something like that, without 
aborting the program, so that execution continues and the developer has the 
option to fix the error later.
 
Since the code base in my case is very big and the original developer is no 
longer available to support it, it is extremely difficult to resolve this issue.
 
Regards,
Sharad Sinha
 

Research Scholar,
Center for High Performance Embedded Systems,
Level 3, Border X-Block,
Research Techno Plaza,
Nanyang Technological University,
Singapore-637553



Re: Regarding code portability across different gcc/g++ versions

2010-09-29 Thread Andrew Haley
On 09/29/2010 08:07 AM, #SINHA SHARAD# wrote:
> Hi,
>  
> I had a big piece of code that ran smoothly on gcc 3.2.2. For
> some reason, I had to start using that code on a machine with GCC
> 4.2.1. Now, it would throw segmentation faults (invalid free pointer
> etc) and abort the program. I presume this happens because the glibc
> with gcc 4.2.1 is smarter than the one with gcc 3.2.2. Hence, what
> was missed during execution with 3.2.2 was caught in 4.2.1

Maybe; it's hard to say without more investigation.

> While it is great to catch as many errors as possible, will it
> not be better that execution support for code running on earlier
> versions was provided?

That's not generally possible, because we don't know all the crazy
things programmers do.

> May be what was missed in earlier versions should be flagged as
> "error with the current gcc version" or something like that and it
> does not abort the program thus continuing its execution leaving the
> developer with the option to fix the error later.

We don't deliberately generate code that segfaults, I assure you.

> Since, the code size in my case is very big and the original
> developer is not there to support, it is extremely difficult to
> resolve this issue.

I suggest you start with Valgrind's memory checker.

Andrew.
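
As an illustration of the "flag it and keep running" behaviour asked about above: a program can approximate it for its own allocations, even though the compiler cannot do so in general. The following is a minimal sketch, not anything GCC or glibc provides. The names tracked_malloc, tracked_free and MAX_LIVE are hypothetical, and the fixed-size table is an assumption made to keep the example short. It only rejects pointers that are not currently tracked; a memory checker such as Valgrind remains the way to find the underlying bug.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LIVE 1024           /* hypothetical limit on tracked blocks */

static void *live[MAX_LIVE];    /* pointers handed out and not yet freed */

/* Allocate a block and remember its address.
   Returns NULL if malloc fails or the table is full. */
void *tracked_malloc(size_t size)
{
    assert(size > 0);           /* zero-size requests are outside this sketch */
    void *p = malloc(size);
    if (p == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == NULL) {
            live[i] = p;
            return p;
        }
    }
    free(p);                    /* table full: give the block back and fail */
    return NULL;
}

/* Free p only if it is a tracked block.
   Returns 0 on success, -1 (after logging) for an untracked pointer. */
int tracked_free(void *p)
{
    if (p == NULL)
        return 0;               /* like free(NULL), a no-op */
    for (size_t i = 0; i < MAX_LIVE; i++) {
        if (live[i] == p) {
            live[i] = NULL;
            free(p);
            return 0;
        }
    }
    fprintf(stderr, "tracked_free: %p is not a live tracked block\n", p);
    return -1;
}
```

Calling tracked_free on the address of a local variable, for example, logs a message and returns -1 instead of handing the pointer to free. The linear scan is deliberately simple; it is meant for narrowing down a bug, not for production use.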


Re: Bugzilla not whining [was Re: Bugzilla outage Thursday, September 23, 18:00GMT-21:00GMT]

2010-09-29 Thread Dave Korn
On 28/09/2010 22:24, Frédéric Buclin wrote:
> Le 28. 09. 10 11:25, Dave Korn a écrit :
>> I'm no longer
>> receiving my nightly emails that the whine is supposed to be sending me.
> 
> This should be fixed now. Let me know if you still don't get nightly emails.
> 
> Frédéric

  Working fine now, thank you.

cheers,
  DaveK


Re: Regarding code portability across different gcc/g++ versions

2010-09-29 Thread Manuel López-Ibáñez
On 29 September 2010 10:29, Andrew Haley  wrote:
> On 09/29/2010 08:07 AM, #SINHA SHARAD# wrote:
>> Hi,
>>
>>     I had a big piece of code that ran smoothly on gcc 3.2.2. For
>> some reason, I had to start using that code on a machine with GCC
>> 4.2.1. Now, it would throw segmentation faults (invalid free pointer
>> etc) and abort the program. I presume this happens because the glibc
>> with gcc 4.2.1 is smarter than the one with gcc 3.2.2. Hence, what
>> was missed during execution with 3.2.2 was caught in 4.2.1
>
> Maybe; it's hard to say without more investigation.
>
>>     While it is great to catch as many errors as possible, will it
>> not be better that execution support for code running on earlier
>> versions was provided?
>
> That's not generally possible, because we don't know all the crazy
> things programmers do.
>
>> May be what was missed in earlier versions should be flagged as
>> "error with the current gcc version" or something like that and it
>> does not abort the program thus continuing its execution leaving the
>> developer with the option to fix the error later.
>
> We don't deliberately generate code that segfaults, I assure you.
>
>> Since, the code size in my case is very big and the original
>> developer is not there to support, it is extremely difficult to
>> resolve this issue.
>
> I suggest you start with Valgrind's memory checker.
>

This should be in the FAQ.

http://gcc.gnu.org/wiki/FAQ

And it should mention: http://gcc.gnu.org/bugs/#upgrading

Cheers,

Manuel.


Worse code generated by PRE

2010-09-29 Thread Bingfeng Mei
Hello, 
I have been examining a significant performance regression 
between 4.5 and 4.4 in our port. I found that Partial Redundancy
Elimination introduced in 4.5 causes the issue. The following
pseudo code explains the problem:

BB 3:
r118 <- r114 + 2

BB 4:
r114 <- r114 + 2
...
conditional jump to BB 4

After PRE:

BB 3:
r123 <- r114 + 2
r118 <- r123

BB 4:
r114 <- r123
conditional jump to BB 5

BB 5:
r123 <- r114 + 2
jump to BB 4


A simple loop (BB 4) is divided into two basic blocks (BB 4 & 5),
and an extra jump instruction is introduced. On some targets, this
jump can be removed by the bb-reorder pass. On our target, the blocks cannot
be reordered because of the complex doloop_end pattern we generate later.
Additionally, since bb-reorder runs in a very late phase, the code
misses some optimization opportunities such as auto_inc_dec. I don't
see any benefit in doing PRE like this. Maybe we should exclude
such cases in the first place? I read the relevant text in
"Advanced Compiler Design and Implementation"; the example used there has a
linear CFG, and the text doesn't mention how to handle the loop case.
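
The transformation described above can be sketched at the source level. This is a hand-written C analogue, assuming a simple store loop over 16-bit elements; fill_before and fill_after are hypothetical names, and the register comments map onto the pseudo code only loosely (the "+ 2" there is in bytes, here it is one element).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Loop shape before PRE: a single block that bumps the pointer in place. */
void fill_before(int16_t *p, int16_t val, size_t n)
{
    assert(p != NULL);
    if (n == 0)
        return;
    int16_t *end = p + n;
    do {                        /* BB 4 */
        *p = val;
        p = p + 1;              /* r114 <- r114 + 2 */
    } while (p != end);         /* conditional jump to BB 4 */
}

/* Loop shape after PRE: the increment is computed into a temporary in the
   preheader and again in a new latch block; the body only copies it. */
void fill_after(int16_t *p, int16_t val, size_t n)
{
    assert(p != NULL);
    if (n == 0)
        return;
    int16_t *end = p + n;
    int16_t *next = p + 1;      /* BB 3: r123 <- r114 + 2 */
    for (;;) {
        *p = val;               /* BB 4 */
        p = next;               /* r114 <- r123 */
        if (p == end)
            break;              /* loop exit */
        next = p + 1;           /* BB 5 (latch): r123 <- r114 + 2 */
    }                           /* jump to BB 4 */
}
```

Both functions store the same values; the second one simply carries an extra temporary and an extra block on the back edge, which is the cost being discussed.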

Any suggestion is greatly appreciated. 

Thanks,
Bingfeng Mei
 








Re: Worse code generated by PRE

2010-09-29 Thread Richard Guenther
On Wed, Sep 29, 2010 at 2:16 PM, Bingfeng Mei  wrote:
> Hello,
> I have been examining a significant performance regression
> between 4.5 and 4.4 in our port. I found that Partial Redundancy
> Elimination introduced in 4.5 causes the issue. The following
> pseudo code explains the problem:
>
> BB 3:
> r118 <-  r114 + 2
>
> BB 4:
> R114 <-  r114 + 2
> ...
> Conditional jump to BB 4
>
> After PRE
>
> BB 3:
> r123 <-  r114 + 2
> r118 <-  r123
>
> BB 4:
> r114 <- r123
> conditional jump to BB 5
>
> BB5:
> r123 <- r114 + 2
> jump to BB 4
>
>
> A simple loop (BB 4) is divided into two basic blocks (BB 4 & 5).
> An extra jump instruction is introduced. On some targets, this
> jump can be removed by bb-reorder pass. On our target, it cannot
> be reordered due to complex doloop_end pattern we generate later.
> Additionally, since bb-reorder is done in very late phase, the code
> miss some optimization opportunity such as auto_inc_dec. I don't
> see any benefit here to do PRE like this. Maybe we should exclude
> such case in the first place? I read the relevant text in
> "Advanced Compiler Design Implementation", the example used is linear
> CFG and it doesn't mention how to handle loop case.

PRE basically sinks the computation into the latch block (possibly
creating that).  Note that without a testcase it's hard to tell whether
this is ok in general.  PRE tries to avoid generation of new induction
variables and cross-iteration data-dependences, see insert_into_preds_of_block.
Note that 4.4 in principle performs the same optimization (you might
figure that PRE in 4.4 is generally disabled for -Os but enabled in 4.5,
but only for hot execution traces following existing practice to tune
code-size/performance on a fine-grained basis).

Richard.

> Any suggestion is greatly appreciated.
>
> Thanks,
> Bingfeng Mei


RE: Worse code generated by PRE

2010-09-29 Thread Bingfeng Mei
Richard,
Here is the test code. 
typedef short int16_t;
typedef unsigned short uint16_t;

void MemSet16(
        int16_t         *pBuf,  /* Buffer address */
        int16_t         Val,    /* Value to be set */
        uint16_t        Bytes   /* Total size in bytes */
        )

{
        uint16_t Idx;

        for(Idx=0; Idx<(Bytes>>1); Idx++)
                *pBuf++ = Val;
}

I grepped for insert_into_preds_of_block and found that it is used only in
tree-ssa-pre.c. I am actually referring to the RTL PRE pass in gcse.c
and lcm.c.


Before PRE: 


;; Start of basic block ( 2) -> 3
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u9(55){ }u10(57){ }u11(62){ }}
;; lr  in        55 [r55] 57 [r57] 62 [__arg_pointer_register__] 113 114 115
;; lr  use       55 [r55] 57 [r57] 62 [__arg_pointer_register__] 113 114
;; lr  def       110 118 119 120 121
;; live  in      55 [r55] 57 [r57] 62 [__arg_pointer_register__] 113 114 115
;; live  gen     110 118 119 120 121
;; live  kill

;; Pred edge  2 [91.0%]  (fallthru)
(note 34 33 35 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn 35 34 36 3 tst.c:4 (set (reg/f:SI 118)
(plus:SI (reg/v/f:SI 114 [ pBuf ])
(const_int 2 [0x2]))) 273 {addsi3} (nil))

(insn 36 35 37 3 tst.c:4 (set (reg:HI 119)
(plus:HI (reg:HI 113 [ D.3441 ])
(const_int -1 [0x]))) 276 {addhi3} 
(expr_list:REG_DEAD (reg:HI 113 [ D.3441 ])
(nil)))

(insn 37 36 38 3 tst.c:4 (set (reg:SI 120)
(zero_extend:SI (reg:HI 119))) 1056 {zero_extendhisi2} 
(expr_list:REG_DEAD (reg:HI 119)
(nil)))

(insn 38 37 39 3 tst.c:4 (set (reg:SI 121)
(ashift:SI (reg:SI 120)
(const_int 1 [0x1]))) 389 {ashlsi3} (expr_list:REG_DEAD (reg:SI 120)
(nil)))

(insn 39 38 43 3 tst.c:4 (set (reg/f:SI 110 [ D.3464 ])
(plus:SI (reg/f:SI 118)
(reg:SI 121))) 273 {addsi3} (expr_list:REG_DEAD (reg:SI 121)
(expr_list:REG_DEAD (reg/f:SI 118)
(nil))))
;; End of basic block 3 -> ( 4)
;; lr  out   55 [r55] 57 [r57] 62 [__arg_pointer_register__] 110 114 115
;; live  out 55 [r55] 57 [r57] 62 [__arg_pointer_register__] 110 114 115


;; Succ edge  4 [100.0%]  (fallthru)

;; Start of basic block ( 4 3) -> 4
;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u18(55){ }u19(57){ }u20(62){ }}
;; lr  in        55 [r55] 57 [r57] 62 [__arg_pointer_register__] 110 114 115
;; lr  use       55 [r55] 57 [r57] 62 [__arg_pointer_register__] 110 114 115
;; lr  def       114 122
;; live  in      55 [r55] 57 [r57] 62 [__arg_pointer_register__] 110 114 115
;; live  gen     114 122
;; live  kill

;; Pred edge  4 [91.0%] 
;; Pred edge  3 [100.0%]  (fallthru)
(code_label 43 39 40 4 3 "" [1 uses])

(note 40 43 41 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

(insn 41 40 42 4 tst.c:14 (set (mem:HI (reg/v/f:SI 114 [ pBuf ]) [2 *pBuf+0 S2 A16])
(reg/v:HI 115 [ Val ])) 236 {*movhhi} (nil))

(insn 42 41 44 4 tst.c:14 (set (reg/v/f:SI 114 [ pBuf ])
(plus:SI (reg/v/f:SI 114 [ pBuf ])
(const_int 2 [0x2]))) 273 {addsi3} (nil))

(insn 44 42 45 4 tst.c:13 (set (reg:BI 122)
(ne:BI (reg/v/f:SI 114 [ pBuf ])
(reg/f:SI 110 [ D.3464 ]))) 1006 {cmp_simode} (nil))

(jump_insn 45 44 48 4 tst.c:13 (set (pc)
(if_then_else (ne (reg:BI 122)
(const_int 0 [0x0]))
(label_ref 43)
(pc))) 1085 {cbranchbi4} (expr_list:REG_DEAD (reg:BI 122)
(expr_list:REG_BR_PROB (const_int 9100 [0x238c])
(expr_list:REG_PRED_WIDTH (const_int 4 [0x4])
(nil))))
 -> 43)
;; End of basic block 4 -> ( 4 5)
;; lr  out   55 [r55] 57 [r57] 62 [__arg_pointer_register__] 110 114 115
;; live  out 55 [r55] 57 [r57] 62 [__arg_pointer_register__] 110 114 115


After PRE:

;; Start of basic block ( 2) -> 3
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u9(55){ }u10(57){ }u11(62){ }}
;; lr  in        55 [r55] 57 [r57] 62 [__arg_pointer_register__] 113 114 115
;; lr  use       55 [r55] 57 [r57] 62 [__arg_pointer_register__] 113 114
;; lr  def       110 118 119 120 121
;; live  in      55 [r55] 57 [r57] 62 [__arg_pointer_register__] 113 114 115
;; live  gen     110 118 119 120 121
;; live  kill

;; Pred edge  2 [91.0%]  (fallthru)
(note 34 33 35 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn 35 34 53 3 tst.c:4 (set (reg/f:SI 123 [ pBuf ])
(plus:SI (reg/v/f:SI 114 [ pBuf ])
(const_int 2 [0x2]))) 273 {addsi3} (nil))

(insn 53 35 36 3 tst.c:4 (set (reg/f:SI 118)
(reg/f:SI 123 [ pBuf ])) -1 (nil))

(insn 36 53 37 3 tst.c:4 (set (reg:HI 119)
(plus:HI (reg:HI 113 [ D.3441 ])
(const_int -1 [0x]))) 276 {addhi3} 
(expr_list:REG_DEAD (reg:HI 113 [ D.3441 ])
(nil)))

(insn 37 36 38 3 tst.c:4 (set (reg:SI 120)
(zero_extend:SI (reg:HI 119))) 1056 {zero_extendhisi2} 
(expr_list:REG_DEAD (reg:HI 119)
(nil)))

(insn 38 37 39 3 tst.c:4 (set (reg:SI 121)

Re: Worse code generated by PRE

2010-09-29 Thread Xinliang David Li
The optimization does look bad -- splitting the backedge to allow
expression hoisting rarely removes any redundancy -- unless the loop
has a really short trip count. Besides, it introduces an extra copy and
a jump instruction, and increases register pressure.

David

On Wed, Sep 29, 2010 at 5:55 AM, Bingfeng Mei  wrote:
> Richard,
> Here is the test code.
> typedef short int16_t;
> typedef unsigned short uint16_t;
>
> void MemSet16(
>                int16_t         *pBuf,  /* Buffer address */
>                int16_t         Val,    /* Value to be set */
>                uint16_t        Bytes   /* Total size in bytes */
>                )
>
> {
>        uint16_t Idx;
>
>        for(Idx=0; Idx<(Bytes>>1); Idx++)
>                *pBuf++ = Val;
> }
>
> I grepped insert_into_preds_of_block and found it is called only by
> tree-ssa-pre.c. Actually, I am referring to RTL PRE pass in gcse.c
> and lcm.c.

Re: Clarification on who can approve Objective-C/Objective-C++ parser patches

2010-09-29 Thread Nicola Pero
Thanks Joseph

Is it confirmed that this is the opinion of the C++ FE maintainers as well ?

Can we get that clarified ?  Do they want to review Objective-C++ patches ?

(I'm still personally of the opinion that the Objective-C++ maintainer should
approve Objective-C++ patches, but Mike tells me he's been told he can't
approve any changes inside gcc/cp, not even if they are Objective-C++-only,
so I'm asking again)

Thanks!

-----Original Message-----
From: "Joseph S. Myers" 
Sent: Thursday, 23 September, 2010 17:05
To: "Nicola Pero" 
Cc: "g...@gnu.org" 
Subject: Re: Clarification on who can approve Objective-C/Objective-C++ parser patches

On Thu, 23 Sep 2010, Nicola Pero wrote:

> For example, if I post a patch that changes a piece of code in 
> gcc/c-parser.c which is only ever used if (c_dialect_objc ()), then I 
> assume that it is part of the Objective-C front-end, and the 
> Objective-C/Objective-C++ maintainers are in charge of approving it.  
> Once they approve it, I can commit.
> 
> Is that correct ?

Yes.  I generally expect ObjC maintainers to review changes to those parts 
of c-parser.c.

-- 
Joseph S. Myers
jos...@codesourcery.com




check_cxa_atexit_available

2010-09-29 Thread Richard Henderson
The test program in target-supports.exp is broken, since
it doesn't preclude the use of cleanups instead.  Indeed,
the init/cleanup3.C seems to be essentially identical to
the target-supports test.

Any suggestions that don't essentially reverse this
situation?  I.e. I could switch the target-supports test
to grep the assembly for __cxa_atexit, but I suspect that
would more or less automatically cause the cleanup3.C test
to pass.


r~


Re: Regarding code portability across different gcc/g++ versions

2010-09-29 Thread Jonathan Wakely
On 29 September 2010 08:07, #SINHA SHARAD# wrote:
> Hi,
>
>   I presume this happens because the glibc with gcc 4.2.1 is smarter than the 
> one with gcc 3.2.2. Hence, what was missed during execution with 3.2.2 was 
> caught in 4.2.1

N.B. glibc does not come with GCC, you can generally use a new GCC on
a machine with an old glibc and vice versa - they are separate
projects and are released independently.
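
One way to see that the two are independent is to report both versions from the same program. A minimal sketch, assuming a glibc-based system: gnu_get_libc_version() from <gnu/libc-version.h> reports the C library found at run time, while __GNUC__ and __GNUC_MINOR__ are fixed by the compiler at build time. The helper names libc_runtime_version and compiler_version are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>              /* also pulls in the __GLIBC__ macro on glibc */
#ifdef __GLIBC__
#include <gnu/libc-version.h>   /* gnu_get_libc_version() */
#endif

/* Version of the C library the program is running against,
   or "unknown" when it was not built against glibc. */
const char *libc_runtime_version(void)
{
#ifdef __GLIBC__
    return gnu_get_libc_version();
#else
    return "unknown";
#endif
}

/* Write the compiler version into buf and return buf.
   Other compilers that define __GNUC__ report their GCC-compatibility
   level here rather than their own version. */
const char *compiler_version(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
#ifdef __GNUC__
    snprintf(buf, len, "GCC %d.%d", __GNUC__, __GNUC_MINOR__);
#else
    snprintf(buf, len, "unknown");
#endif
    return buf;
}
```

Running the same binary on machines with different glibc installations changes the first value but not the second, which is a quick way to tell which component actually differs between two setups.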


Handling NaNs in FP comparisons

2010-09-29 Thread Michael Eager

Hi --

I'm working with a processor which sets the condition
bits when a NaN is used as an operand in a compare
in a way which is the same as a valid ordered compare.
There is a flag bit which is set for a NaN compare,
but it may also be set in a non-NaN compare.

float a = 1.0, b = 2.0, x = NaN;
(a < b) generates the same condition flags as (a < x).

IEEE std requires all comparisons involving a NaN to fail
(or trap).

Are there other processors which do this?  How do they
handle generating IEEE std compliant code?

A related problem is that CSE will optimize FP comparisons
and garble the result.  (This doesn't happen with soft-fp.)

  int r = 0, s = 0;
  float a = 1.0, x = NaN;

  r = (a <= x);
  s = (a > x);

should result in r == s == 0.  CSE translates this (more
or less) into

  r = (a <= x);
  s = !r;

Is there a way to prevent CSE from optimizing FP comparisons?
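
The claim that r and s must both be 0 can be checked directly. A minimal sketch, assuming IEEE arithmetic and default compiler flags (no -ffinite-math-only or -ffast-math); the function names are hypothetical. It also shows one NaN-safe way to derive the second comparison from the first, using isunordered from <math.h>.

```c
#include <assert.h>
#include <math.h>               /* NAN, isunordered */

/* The two comparisons exactly as written in the source. */
int cmp_le(float a, float x) { return a <= x; }
int cmp_gt(float a, float x) { return a > x; }

/* Deriving "a > x" from "a <= x" is only valid when the operands are
   ordered, so the unordered case has to be excluded explicitly. */
int gt_from_le(float a, float x)
{
    int r = cmp_le(a, x);
    return !r && !isunordered(a, x);
}

/* With a NaN operand both comparisons are false, so s is not !r. */
void nan_compare_demo(void)
{
    float a = 1.0f, x = NAN;
    int r = cmp_le(a, x);
    int s = cmp_gt(a, x);
    assert(r == 0 && s == 0);
    assert(s != !r);
}
```

For ordered operands gt_from_le agrees with cmp_gt, so the rewrite is only wrong in the NaN case, which is why it is easy to miss in testing.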


--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: Handling NaNs in FP comparisons

2010-09-29 Thread Richard Henderson
On 09/29/2010 04:31 PM, Michael Eager wrote:
> float a = 1.0, b = 2.0, x = NaN;
> (a < b) generates the same condition flags as (a < x).
...
> Are there other processors which do this?  How do they
> handle generating IEEE std compliant code?

It looks like there is a bunch of code under config
that's conditionalized on flag_finite_math_only,
which disables support for NaN and Inf.

At a glance, rs6000_generate_compare may be relevant.

> 
> A related problem is that CSE will optimize FP comparisons
> and garble the result.  (This doesn't happen with soft-fp.)
> 
>   int r = 0, s = 0;
>   float a = 1.0, x = NaN;
> 
>   r = (a <= x);
>   s = (a > x);
> 
> should result in r == s == 0.  CSE translates this (more
> or less) into
> 
>   r = (a <= x);
>   s = !r;
> 
> Is there a way to prevent CSE from optimizing FP comparisons?

Add the missing check vs HONOR_NANS.  This is clearly a bug.


r~