writing a new pass: association with an option string

2006-12-04 Thread Andrea Callia D'Iddio

Dear all,
I wrote a new pass for gcc. Currently the pass is always executed, but
I'd like to execute it only if I specify an option from the shell (e.g.
gcc --mypass pippo.c). How can I do this?

Thanks to all

Andrea


Re: writing a new pass: association with an option string

2006-12-04 Thread Revital1 Eres


[EMAIL PROTECTED] wrote on 04/12/2006 10:48:25:

> Dear all,
> I wrote a new pass for gcc. Currently the pass is always executed, but
> I'd like to execute it only if I specify an option from the shell (e.g.
> gcc --mypass pippo.c). How can I do this?

Maybe add a gate to your pass which is controlled by a new flag;
for example, see pass_vectorize in passes.c
and gate_tree_vectorize in tree-ssa-loop.c.

Revital



Re: [C/C++] same warning/error, different text

2006-12-04 Thread Gabriel Dos Reis
Mark Mitchell <[EMAIL PROTECTED]> writes:

| Manuel López-Ibáñez wrote:
| > The message for the following error:
| > 
| > enum e {  E3 = 1 / 0 };
| > 
| > is in C: error: enumerator value for 'E3' not integer constant
| > and in C++: error: enumerator value for 'E3' is not an integer constant
| > 
| > Is there someone against fixing this? What would be the preferred message?
| 
| I slightly prefer the more-grammatical C++ version, but, if there's any
| controversy at all, I'm perfectly happy with the C version too, and it's
| certainly a good thing to use the same message in both languages.

Indeed; my preference also goes to the C++ diagnostic.

-- Gaby


Reload Problem in delete_output_reload

2006-12-04 Thread Unruh, Erwin
Hello,

I have a problem with delete_output_reload.  It sometimes deletes
instructions which are needed.  Here is an analysis of a recent case
(in a private version of the S390 port).  The original S390 port shows
almost the same reloads, but chooses different registers.

Before reload we have

(insn 1597 1697 1598 0 x.c:238 (set (reg:DI 1393)
(ashift:DI (reg:DI 1391)
(const_int 8 [0x8]))) 349 {*ashldi3_31}
(insn_list:REG_DEP_TRUE 1596 (nil))
(nil))

(insn 1598 1597 1623 0 x.c:238 (parallel [
(set (reg:DI 1393)
(plus:DI (reg:DI 1391)
(reg:DI 1393)))
(clobber (reg:CC 33 %cc))
]) 177 {*adddi3_31} (insn_list:REG_DEP_OUTPUT 1594
(insn_list:REG_DEP_TRUE 1597 (insn_list:REG_DEP_TRUE 1596 (nil
(expr_list:REG_DEAD (reg:DI 1391)
(expr_list:REG_UNUSED (reg:CC 33 %cc)
(expr_list:REG_EQUAL (mult:DI (reg:DI 1384 [+114 ])
(const_int 9934259357961 [0x9090909]))
(nil)

Both registers 1391 and 1393 will be put on the stack.  The offset is
more than 7000, so we need a secondary reload.  The report in *.greg is

Reloads for insn # 1597
Reload 0: reload_in (SI) = (const_int 4080 [0xff0])
ADDR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0)
reload_in_reg: (const_int 4080 [0xff0])
reload_reg_rtx: (reg:SI 4 4)
Reload 1: reload_in (SI) = (const_int 4080 [0xff0])
ADDR_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0)
reload_in_reg: (const_int 4080 [0xff0])
reload_reg_rtx: (reg:SI 4 4)
Reload 2: reload_in (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
(const_int
4080 [0xff0]))
(const_int 3144
[0xc48])) [0 S8 A8])
reload_out (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
(const_int
4080 [0xff0]))
(const_int 3136
[0xc40])) [0 S8 A8])
GENERAL_REGS, RELOAD_OTHER (opnum = 0), can't combine
reload_in_reg: (reg:DI 1391)
reload_out_reg: (reg:DI 1393)
reload_reg_rtx: (reg:DI 2 2)

Reloads for insn # 1598
Reload 0: reload_in (SI) = (const_int 4080 [0xff0])
ADDR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0)
reload_in_reg: (const_int 4080 [0xff0])
reload_reg_rtx: (reg:SI 2 2)
Reload 1: reload_in (SI) = (const_int 4080 [0xff0])
ADDR_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0)
reload_in_reg: (const_int 4080 [0xff0])
reload_reg_rtx: (reg:SI 2 2)
Reload 2: reload_in (SI) = (const_int 4080 [0xff0])
ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2)
reload_in_reg: (const_int 4080 [0xff0])
reload_reg_rtx: (reg:SI 2 2)
Reload 3: reload_in (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
(const_int
4080 [0xff0]))
(const_int 3144
[0xc48])) [0 S8 A8])
reload_out (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
(const_int
4080 [0xff0]))
(const_int 3136
[0xc40])) [0 S8 A8])
GENERAL_REGS, RELOAD_OTHER (opnum = 0), can't combine
reload_in_reg: (reg:DI 1391)
reload_out_reg: (reg:DI 1393)
reload_reg_rtx: (reg:DI 0 0)
Reload 4: ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2), can't
combine, secondary_reload_p
reload_reg_rtx: (reg:SI 3 3)
Reload 5: reload_in (SI) = (plus:SI (plus:SI (reg/f:SI 15 15)
(const_int 4080
[0xff0]))
(const_int 3136
[0xc40]))
ADDR_REGS, RELOAD_FOR_INPUT (opnum = 2), inc by 8
reload_in_reg: (plus:SI (plus:SI (reg/f:SI 15 15)
(const_int 4080
[0xff0]))
(const_int 3136
[0xc40]))
reload_reg_rtx: (reg:SI 2 2)
secondary_in_reload = 4
secondary_in_icode = reload_insi

These reloads are ok.  In do_output_reload it is noted that both
insn_1597.Reload_2 and insn_1598.Reload_3 write to the same stack slot.
So the compiler decides to remove the first reload and use register
(reg:DI 2) directly.  In this analysis it misses the fact that
(reg:SI 2) is used for input reloads of insn 1598.  After reload the
generated instructions are:

(insn 1597 16833 16836 0 x.c:238 (set (reg:DI 2 2)
(ashift:DI (reg:DI 2 2)
(const_int 8 [0x8]))) 349 {*ashldi3_31}
(insn_list:REG_DEP_TRUE 1596 (nil))
(nil))

(insn 16836 1597 16838 0 x.c:238 (set (reg:SI 2 2)
(const_int 4080 [0xff0])) 56 {*movsi_esa} (nil)
(nil))

(insn 16838 16836 16837 0 x.c:238 (set (reg:DI 0 0)
(mem/c:DI

Re: block reordering at GIMPLE level?

2006-12-04 Thread Roberto COSTA

Hello,
The CLI back-end uses the GIMPLE representation (to be more precise, a
subset of GIMPLE; the code undergoes a CLI-specific GIMPLE lowering at
the end of the middle-end passes) and just emits bytecode in a
CLI-specific assembler generation pass.
Because of that, we (I mean the CLI back-end project) wouldn't even have
to redefine our CFG; we already use the CFG for GIMPLE.
I think it's interesting for us to check whether the existing RTL
reordering pass can be reused with little or no modification and, if
not, to see whether it can be made more IL-independent.


Cheers,
Roberto


Jan Hubicka wrote:

Hi,
I know little about CLI, but assuming that your backend is nonstandard
enough that it makes sense to replace the RTL bits, I guess it would
make sense to make bb-reorder run at the GIMPLE level too, while
keeping bb-reorder at the RTL level for the common compilation path.
This is an example of a pass that has very little dependency on the
particular IL, so our CFG manipulation abstraction can probably be
extended rather easily to make it practically IL-independent.

The reason why it is run late is that the RTL backend modifies the CFG
enough to make this important.  CLI might have the same property.  What
you might want to consider is simply porting our CFG code to the CLI IL
representation, whatever it is, and sharing the pass.

The tracer pass, very similar to bb-reorder in nature, has been ported
to work on GIMPLE, but the implementation is not in mainline yet.  You
might want to take a look at the changes necessary, as bb-reorder
should be about the same (minus the SSA updating, since you probably
want to run bb-reorder after leaving SSA form).

Honza

Hello,

While working on our CLI port, I realized that we were missing, among 
others, the block reordering pass. This is because we emit CLI code 
before the RTL passes are reached.
Looking at the backend optimizations, it is clear that some modify the 
CFG. But my understanding is that loop optimizations and unrolling are 
also being moved to GIMPLE. I do not know about others.


Could it be that sometime all the optimizations that modify the CFG are 
run on GIMPLE?
Is there any plan/interest in having a block layout pass running at 
GIMPLE level?


Cheers,

--
Erven.


Re: block reordering at GIMPLE level?

2006-12-04 Thread Jan Hubicka
> Hello,
> The CLI back-end uses the GIMPLE representation (to be more precise, a
> subset of GIMPLE; the code undergoes a CLI-specific GIMPLE lowering at
> the end of the middle-end passes) and just emits bytecode in a
> CLI-specific assembler generation pass.
> Because of that, we (I mean the CLI back-end project) wouldn't even have
> to redefine our CFG; we already use the CFG for GIMPLE.
> I think it's interesting for us to check whether the existing RTL
> reordering pass can be reused with little or no modification and, if
> not, to see whether it can be made more IL-independent.

The BB reordering pass has some IL-specific parts for hot/cold function
splitting, but the rest should just work fine with small changes and
cleanups.  (The main algorithm basically duplicates blocks via an
already virtualized interface and constructs the new ordering via
bb->aux pointers.  It then relies on RTL-specific cfglayout code to do
the actual reordering, which you could do on GIMPLE with little effort
since GIMPLE BBs are easily reorderable.)

Do CLI's conditional jumps have one destination plus fallthrough, or
two destinations?  If you have no natural fallthrough edges, reordering
blocks is easy.  If you do have fallthrough after a conditional jump,
you will need to imitate the cfglayout code that inserts unconditional
jumps into edges.

Honza
> 
> Cheers,
> Roberto


Re: block reordering at GIMPLE level?

2006-12-04 Thread Roberto COSTA

Jan Hubicka wrote:

Hello,
The CLI back-end uses the GIMPLE representation (to be more precise, a
subset of GIMPLE; the code undergoes a CLI-specific GIMPLE lowering at
the end of the middle-end passes) and just emits bytecode in a
CLI-specific assembler generation pass.
Because of that, we (I mean the CLI back-end project) wouldn't even have
to redefine our CFG; we already use the CFG for GIMPLE.
I think it's interesting for us to check whether the existing RTL
reordering pass can be reused with little or no modification and, if
not, to see whether it can be made more IL-independent.


The BB reordering pass has some IL-specific parts for hot/cold function
splitting, but the rest should just work fine with small changes and
cleanups.  (The main algorithm basically duplicates blocks via an
already virtualized interface and constructs the new ordering via
bb->aux pointers.  It then relies on RTL-specific cfglayout code to do
the actual reordering, which you could do on GIMPLE with little effort
since GIMPLE BBs are easily reorderable.)


Good!


Do CLI's conditional jumps have one destination plus fallthrough, or
two destinations?  If you have no natural fallthrough edges, reordering
blocks is easy.  If you do have fallthrough after a conditional jump,
you will need to imitate the cfglayout code that inserts unconditional
jumps into edges.


CLI conditionals (except the switch statement) have one destination and
fall through.
In the current CLI emission code, an unconditional jump is generated for
the ELSE operand of a COND_EXPR statement, unless the target basic block
is the one that follows in the current layout.


Cheers,
Roberto


Re: writing a new pass: association with an option string

2006-12-04 Thread Diego Novillo

Andrea Callia D'Iddio wrote on 12/04/06 03:48:

Dear all,
I wrote a new pass for gcc. Currently the pass is always executed, but
I'd like to execute it only if I specify an option from the shell (e.g.
gcc --mypass pippo.c). How can I do this?

Create a new flag in common.opt and read its value in the gate function
of your pass.  I *believe* this is documented somewhere in the internals
manual, but I'm not sure.


You can check how other passes do it.  See, for instance, 
flag_tree_vectorize in common.opt and in the vectorizer's gating predicate.
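As a rough illustration of what Diego describes, the common.opt entry and gate might look like the sketch below, modeled on the existing flag_tree_vectorize/gate_tree_vectorize pair.  The option name -ftree-mypass and all "mypass" identifiers are made up for illustration; the remaining tree_opt_pass fields are omitted here and would follow the pattern of other passes in passes.c.

```
/* In common.opt: defines -ftree-mypass and the variable flag_tree_mypass.  */
ftree-mypass
Common Report Var(flag_tree_mypass)
Enable my experimental pass

/* In the file implementing the pass: the gate simply tests the flag.  */
static bool
gate_tree_mypass (void)
{
  return flag_tree_mypass != 0;
}

struct tree_opt_pass pass_mypass =
{
  "mypass",          /* name (used for dump-file suffixes)  */
  gate_tree_mypass,  /* gate: pass runs only when this returns true  */
  execute_mypass,    /* execute  */
  /* ... remaining fields as in other passes in passes.c ...  */
};
```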


Re: writing a new pass: association with an option string

2006-12-04 Thread Revital1 Eres


> Create a new flag in common.opt and read its value in the gate function
> of your pass.  I *believe* this is documented somewhere in the internals
> manual, but I'm not sure.

See also -
http://gcc.gnu.org/wiki/WritingANewPass



setting up stack frame, regression (gcc 4.1.0) ?

2006-12-04 Thread Kimmo Fredriksson

Hi,

Consider this (simplified) test case:

#include <string.h>  /* for memcpy */

int bar (int a)
{
   int b;
   memcpy (&b, &a, sizeof (int));
   return b;
}

gcc 3.4.3 compiles this to (-O3 -fomit-frame-pointer)

bar:
   movl4(%esp), %eax
   ret

But gcc 4.1.0 (and gcc 4.0.0 as well) generates:

bar:
   subl$16, %esp
   movl20(%esp), %eax
   addl$16, %esp
   ret

The local variable is optimized away, so what's the point of adjusting esp?

If I don't use memcpy but a simple assignment, then both versions
produce the same (good) code.  Host and target are i386-redhat-linux.

Kimmo


Re: writing a new pass: association with an option string

2006-12-04 Thread Dorit Nuzman
See slides 18-35 in
http://gcc.gnu.org/wiki/OptimizationCourse?action=AttachFile&do=get&target=ExampleGCC_middle.pdf
(linked from  http://gcc.gnu.org/wiki/OptimizationCourse).

dorit

> Dear all,
> I wrote a new pass for gcc. Currently the pass is always executed, but
> I'd like to execute it only if I specify an option from the shell (e.g.
> gcc --mypass pippo.c). How can I do this?
>
> Thanks to all
>
> Andrea



Re: [PATCH]: Require MPFR 2.2.1

2006-12-04 Thread Richard Guenther

On 12/3/06, Kaveh R. GHAZI <[EMAIL PROTECTED]> wrote:

This patch updates configure to require MPFR 2.2.1 as promised here:
http://gcc.gnu.org/ml/gcc/2006-12/msg00054.html

Tested on sparc-sun-solaris2.10 using mpfr-2.2.1, mpfr-2.2.0 and an older
mpfr included with gmp-4.1.4.  Only 2.2.1 passed (as expected).

I'd like to give everyone enough time to update their personal
installations and regression testers before installing this.  Does one
week sound okay?  If there are no objections, that's what I'd like to do.


Please don't.  It'll be a hassle for us again and will cause automatic
testers to again miss some days or weeks during stage1 (given the
Christmas holiday season is near).  Rather, please defer to the start
of stage3.

Thanks,
Richard.


Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Richard Guenther

On 12/3/06, Toon Moene <[EMAIL PROTECTED]> wrote:

Richard,

Somewhere, in a mail lost in my large e-mail clash with my ISP
(verizon), you said that gfortran couldn't profit from the pow(x, 1./3.)
-> cbrt(x) conversion because gfortran didn't "know" of cbrt.

Could you be more explicit about this - I'd like to repair this
deficiency, if at all possible.

Thanks in advance !


It's a matter of making the cbrt builtin available - I have a patch for
this, but wondered whether the Fortran front end can rely on the cbrt
library call being available, or available in a fast variant rather than
a fallback implementation in libgfortran that does pow (x, 1./3.), which
would then of course pessimize the pow (x, 2./3.) -> tmp = cbrt(x);
tmp * tmp expansion.

Richard.


Re: MPFR precision when FLT_RADIX != 2

2006-12-04 Thread Geert Bosch


On Dec 3, 2006, at 12:44, Kaveh R. GHAZI wrote:


In case i370 support is revived or a format not using base==2 is
introduced, I could proactively fix the MPFR precision setting for any
base that is a power of 2 by multiplying the target float precision by
log2(base).  In the i370 case I would multiply by log2(16) which is 4.
When base==2, then log2(2) is 1, so the multiplication simplifies to
the current existing behavior.


That would not be correct, as the actual precision in bits of
a base==16 floating-point number depends on the magnitude of the number.
The gaps between adjacent hexadecimal floating-point numbers can be
2, 4 or 8 times that of binary floats with a mantissa of the same
number of bits (including any implicit leading 1's).

Example: In a floating-point format with 24 binary digits, the even
integers 0x1000002 through 0x100000e would be representable, while on
a system with 6 hexadecimal digits there would be a gap between
0x1000000 and 0x1000010.

So, while the sum 16777216.0 + 2.0 does not depend on rounding
direction in IEEE single precision math, it would depend on
rounding direction for the IBM 370's single precision type.

For GCC's purpose, it seems that hexadecimal floating-point systems
can be regarded as a historical curiosity and adding significant
complexity for supporting some optimizations for them seems not
worth the distributed cost of maintenance. For IBM 370 math, we
should always either:
  - call library functions for evaluation
  - convert to IEEE, operate, convert back

  -Geert



Re: [PATCH]: Require MPFR 2.2.1

2006-12-04 Thread Mike Stump

On Dec 4, 2006, at 8:23 AM, Richard Guenther wrote:

On 12/3/06, Kaveh R. GHAZI <[EMAIL PROTECTED]> wrote:

This patch updates configure to require MPFR 2.2.1 as promised here:
http://gcc.gnu.org/ml/gcc/2006-12/msg00054.html

Tested on sparc-sun-solaris2.10 using mpfr-2.2.1, mpfr-2.2.0 and an
older mpfr included with gmp-4.1.4.  Only 2.2.1 passed (as expected).

I'd like to give everyone enough time to update their personal
installations and regression testers before installing this.  Does one
week sound okay?  If there are no objections, that's what I'd like to do.


Please don't.  It'll be a hassle for us again and will cause automatic
testers to again miss some days or weeks during stage1.


I agree, please don't, if it is at all possible to avoid it.  If you
want to update, let's update once late in the stage3 cycle.


Re: MPFR precision when FLT_RADIX != 2

2006-12-04 Thread Steve Kargl
On Sun, Dec 03, 2006 at 12:44:11PM -0500, Kaveh R. GHAZI wrote:
> 
> I'm not sure if these issues come up for fortran in prior releases.  I
> think i370 was removed before 4.0/f95 and decimal floats were added in
> 4.2, which is not yet released.
> 

This issue hasn't come up yet.  In gfc_arith_init_1(), you'll find

  /* These are the numbers that are actually representable by the
 target.  For bases other than two, this needs to be changed.  */
  if (int_info->radix != 2)
gfc_internal_error ("Fix min_int calculation");

which prevents radix != 2 real kinds.

-- 
Steve


Re: [PATCH]: Require MPFR 2.2.1

2006-12-04 Thread Diego Novillo

Richard Guenther wrote on 12/04/06 11:23:

On 12/3/06, Kaveh R. GHAZI <[EMAIL PROTECTED]> wrote:


I'd like to give everyone enough time to update their personal
installations and regression testers before installing this.  Does one
week sound okay?  If there are no objections, that's what I'd like to do.


Please don't.  It'll be a hassle for us again and will cause automatic
testers to again miss some days or weeks during stage1 (given the
Christmas holiday season is near).  Rather, please defer to the start
of stage3.



Agreed, please don't.  The whole MPFR thing is already fairly annoying.
I have just updated all my machines with a special RPM I got from Jakub.
I don't want to go through that again so soon.


Re: [PATCH]: Require MPFR 2.2.1

2006-12-04 Thread Andrew MacLeod
On Mon, 2006-12-04 at 17:23 +0100, Richard Guenther wrote:
> On 12/3/06, Kaveh R. GHAZI <[EMAIL PROTECTED]> wrote:
> > This patch updates configure to require MPFR 2.2.1 as promised here:
> > http://gcc.gnu.org/ml/gcc/2006-12/msg00054.html
> >
> > Tested on sparc-sun-solaris2.10 using mpfr-2.2.1, mpfr-2.2.0 and an older
> > mpfr included with gmp-4.1.4.  Only 2.2.1 passed (as expected).
> >
> > I'd like to give everyone enough time to update their personal
> > installations and regression testers before installing this.  Does one
> > week sound okay?  If there are no objections, that's what I'd like to do.
> 
> Please don't.  It'll be a hassle for us again and will cause automatic testers
> to again miss some days or weeks during stage1 (given christmas
> holiday season is near).
> Rather defer to the start of stage3 please.
> 

Yes, please leave this for a while.  2.2.1 is only required to fix a few
bugs, not to build and run the compiler.  I think it's enough to just
update the recommended version for now.

Those that care enough about those bugs can then go and get the new
version, the rest of us can happily plod along without having to touch
every machine again.

Andrew



Richard Guenther appointed middle-end maintainer

2006-12-04 Thread David Edelsohn
I am pleased to announce that the GCC Steering Committee has
appointed Richard Guenther as non-algorithmic middle-end maintainer.

Please join me in congratulating Richi on his new role.  Richi,
please update your listings in the MAINTAINERS file.

Happy hacking!
David



Re: Richard Guenther appointed middle-end maintainer

2006-12-04 Thread Paolo Carlini

David Edelsohn wrote:


I am pleased to announce that the GCC Steering Committee has
appointed Richard Guenther as non-algorithmic middle-end maintainer.

Please join me in congratulating Richi on his new role.


Yes, congratulations Richard!

Paolo.


Re: Announce: MPFR 2.2.1 is released

2006-12-04 Thread Joe Buck
On Sat, Dec 02, 2006 at 12:01:45PM -0500, Kaveh R. GHAZI wrote:
> Hi Vincent, thanks for making this release.  Since this version of mpfr
> fixes important bugs encountered by GCC, I've updated the gcc
> documentation and error messages to refer to version 2.2.1.
> 
> I have NOT (yet) updated gcc's configure to force the issue.  I'll wait a
> little while to let people upgrade.

Kaveh,

IMHO, you should *never* update gcc's configure to force the issue.  To do
so would be unprecedented.

configure doesn't refuse to build gcc with older binutils versions, even
though those versions cause some tests to fail that pass with newer
versions.  Similarly, people aren't forced to upgrade their glibc
because some tests fail with older versions.

In my view, the only time configure should fail because of an
old library version is if going ahead with the build would produce a
completely nonfunctional compiler.  I wouldn't care if a warning message
is generated.




Re: setting up stack frame, regression (gcc 4.1.0) ?

2006-12-04 Thread Andrew Pinski
On Mon, 2006-12-04 at 15:47 +0200, Kimmo Fredriksson wrote:
> Hi,
> 
> Consider this (simplified) test case:
> 
> int bar (int a)
> {
> int b;
> memcpy (&b, &a, sizeof (int));
> return b;
> }
> The local variable is optimized away, so what's the point of adjusting esp?

This testcase has already been fixed in 4.2.0.  There are other related
memcpy cases which have not been fixed yet, though.

The reason the stack-pointer adjustment happens is that the local
variable is not really optimized away until after it has already been
placed on the stack.

Thanks,
Andrew Pinski



Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Howard Hinnant

On Dec 4, 2006, at 11:27 AM, Richard Guenther wrote:


On 12/3/06, Toon Moene <[EMAIL PROTECTED]> wrote:

Richard,

Somewhere, in a mail lost in my large e-mail clash with my ISP
(verizon), you said that gfortran couldn't profit from the
pow(x, 1./3.) -> cbrt(x) conversion because gfortran didn't "know"
of cbrt.

Could you be more explicit about this - I'd like to repair this
deficiency, if at all possible.

Thanks in advance !


It's a matter of making the cbrt builtin available - I have a patch for
this, but wondered whether the Fortran front end can rely on the cbrt
library call being available, or available in a fast variant rather than
a fallback implementation in libgfortran that does pow (x, 1./3.), which
would then of course pessimize the pow (x, 2./3.) -> tmp = cbrt(x);
tmp * tmp expansion.


Is pow(x, 1./3.) == cbrt(x)?

My handheld calculator says (imagining a 3-decimal-digit machine):

pow(64.0, .333) == 3.99

In other words, can pow assume that if it sees .333, the client
actually meant the non-representable 1/3?  Or must pow assume that
.333 means .333?  My inclination is that if pow(x, 1./3.) (computed
correctly to the last bit) ever differs from cbrt(x) (computed
correctly to the last bit), then this substitution should not be done.


-Howard



Re: Reload Problem in delete_output_reload

2006-12-04 Thread Ulrich Weigand
Erwin Unruh wrote:

> I have a problem with delete_output_reload.  It sometimes deletes
> instructions which are needed.  Here is an analysis of a recent case
> (in a private version of the S390 port).  The original S390 port shows
> almost the same reloads, but chooses different registers.

What GCC version is your compiler based on?

> Reloads for insn # 1598
> Reload 0: reload_in (SI) = (const_int 4080 [0xff0])
>   ADDR_REGS, RELOAD_FOR_OUTPUT_ADDRESS (opnum = 0)
>   reload_in_reg: (const_int 4080 [0xff0])
>   reload_reg_rtx: (reg:SI 2 2)
> Reload 1: reload_in (SI) = (const_int 4080 [0xff0])
>   ADDR_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0)
>   reload_in_reg: (const_int 4080 [0xff0])
>   reload_reg_rtx: (reg:SI 2 2)
> Reload 2: reload_in (SI) = (const_int 4080 [0xff0])
>   ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2)
>   reload_in_reg: (const_int 4080 [0xff0])
>   reload_reg_rtx: (reg:SI 2 2)
> Reload 3: reload_in (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
> (const_int
> 4080 [0xff0]))
> (const_int 3144
> [0xc48])) [0 S8 A8])
>   reload_out (DI) = (mem/c:DI (plus:SI (plus:SI (reg/f:SI 15 15)
> (const_int
> 4080 [0xff0]))
> (const_int 3136
> [0xc40])) [0 S8 A8])
>   GENERAL_REGS, RELOAD_OTHER (opnum = 0), can't combine
>   reload_in_reg: (reg:DI 1391)
>   reload_out_reg: (reg:DI 1393)
>   reload_reg_rtx: (reg:DI 0 0)
> Reload 4: ADDR_REGS, RELOAD_FOR_INPUT_ADDRESS (opnum = 2), can't
> combine, secondary_reload_p
>   reload_reg_rtx: (reg:SI 3 3)
> Reload 5: reload_in (SI) = (plus:SI (plus:SI (reg/f:SI 15 15)
> (const_int 4080
> [0xff0]))
> (const_int 3136
> [0xc40]))
>   ADDR_REGS, RELOAD_FOR_INPUT (opnum = 2), inc by 8
>   reload_in_reg: (plus:SI (plus:SI (reg/f:SI 15 15)
> (const_int 4080
> [0xff0]))
> (const_int 3136
> [0xc40]))
>   reload_reg_rtx: (reg:SI 2 2)
>   secondary_in_reload = 4
>   secondary_in_icode = reload_insi
> 
> These reloads are ok. In do_output_reload it is noted that both
> insn_1597.Reload_2 and insn_1598.Reload_3 write to the same stack slot.
> So the compiler decides to remove the first reload and use register
> (reg:DI 2) directly.  In this analysis it misses the fact that (reg:SI 2)
> is used for input reloads of insn 1598.

This should actually be caught by the free_for_value_p check in
choose_reload_regs.  You cannot inherit a value for a RELOAD_OTHER
reload (3) in a register that is already in use for a
RELOAD_FOR_INPUT_ADDRESS reload (2).

Could you try to find out why this doesn't work correctly?

> One critical point is the timing on the variables reg_reloaded_valid and
> spill_reg_store.
> Within the function emit_reload_insns they are first checked (within
> do_output_reload) and later updated (after the reload instructions are
> written).
> So they reflect the state before the "reload sequence".  Not all usages
> reflect this semantics.  In particular, the check within
> delete_output_reload is not correct.

I'm not sure how delete_output_reload comes into play here.  The decision
to inherit was already made long ago, in choose_reload_regs, and that is
already incorrect.  Even if the output reload for insn 1597 were not 
deleted at this point, the code would still be incorrect.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  [EMAIL PROTECTED]


Re: Optimizing a 16-bit * 8-bit -> 24-bit multiplication

2006-12-04 Thread David Nicol

Here's an ignorant, naive, and very likely wrong attempt:

What happens if you mask off the high and low bytes of the larger
number, do two 8,8->16 multiplies, left-shift the result of the higher
one, and add, as a macro?

#define _mul8x16(c,s)  (  \
  (long int) ((c) * (unsigned char) ( (s) & 0x00FF) ) \
  + \
  (long int) ( \
    (long int) ( (c) * (unsigned char) ( ( (s) & 0xFF00 ) >> 8)) \
    << 8 \
  ) \
)

What would that do?  I don't know which, if any, of the casts are
needed, or what exactly you have to do to suppress promoting the
internal representations to 32 bits until the left shift and the add;
one would expect that multiplying a char by a short on this platform
would produce that code, and the avr-gcc-list would be the right place
to find someone who could make that happen.

Since you know the endianness of your machine, you could reasonably pull
chars out of the long directly instead of shifting and masking; you
could also store the to-be-shifted result directly into an address one
byte off from the address of the integer that you will add the low
result to -- that's what you're proposing unions for, right?


On 12/1/06, Shaun Jackman <[EMAIL PROTECTED]> wrote:

I would like to multiply a 16-bit number by an 8-bit number and
produce a 24-bit result on the AVR. The AVR has a hardware 8-bit *
8-bit -> 16-bit multiplier.

If I multiply a 16-bit number by a 16-bit number, it produces a 16-bit
result, which isn't wide enough to hold the full product.

If I cast one of the operands to 32-bit and multiply a 32-bit number
by a 16-bit number, GCC generates a call to __mulsi3, which is the
routine to multiply a 32-bit number by a 32-bit number and produce a
32-bit result and requires ten 8-bit * 8-bit multiplications.

A 16-bit * 8-bit -> 24-bit multiplication only requires two 8-bit *
8-bit multiplications. A 16-bit * 16-bit -> 32-bit multiplication
requires four 8-bit * 8-bit multiplications.

I could write a mul24_16_8 (16-bit * 8-bit -> 24-bit) function using
unions and 8-bit * 8-bit -> 16-bit multiplications, but before I go
down that path, is there any way to coerce GCC into generating the
code I desire?

Cheers,
Shaun




--
perl -le'1while(1x++$_)=~/^(11+)\1+$/||print'


Re: Announce: MPFR 2.2.1 is released

2006-12-04 Thread Richard Kenner
> IMHO, you should *never* update gcc's configure to force the issue.  To do
> so would be unprecedented.

I'm not in favor of this either, but aren't there precedents with either
automake, autoconf, or both?


Re: Announce: MPFR 2.2.1 is released

2006-12-04 Thread Joe Buck
On Mon, Dec 04, 2006 at 02:09:25PM -0500, Richard Kenner wrote:
> > IMHO, you should *never* update gcc's configure to force the issue.  To do
> > so would be unprecedented.
> 
> I'm not in favor of this either, but aren't there precedents with either
> automake, autoconf, or both?

The ordinary user who builds gcc from source does not need *any* version
of automake, autoconf, etc., so any strict requirements that are imposed
on these tools have an impact only on gcc developers.




Re: how to test multiple warnings?

2006-12-04 Thread Manuel López-Ibáñez

Dear Janis,

I am having problems implementing your proposal.

The following test case should fail with current mainline for every
dg-bogus directive.  It actually passes perfectly :-(.  I have tried
removing the dg-warning tests, but then only the first dg-bogus fails,
while the other dg-bogus tests pass.  The results are also unexpected
if you remove only one or two of the dg-warning directives.

Any idea what is going on?

Cheers,

Manuel.


/* { dg-do compile } */
/* { dg-options "-std=c99 -Woverflow" } */

#include <limits.h>

/* Test for duplicated warnings.  */
int
g (void)
{
 return - - - - -INT_MIN; /* { dg-warning "warning: integer overflow
in expression" } */
 /* { dg-bogus "integer overflow in expression.*integer overflow in
expression" "duplicate" { target *-*-* } 10 } */
}

int
g1 (void)
{
 return 2 - (0 * (INT_MAX + 1)); /* { dg-warning "warning: integer
overflow in expression" } */
 /* { dg-bogus "integer overflow in expression.*integer overflow in
expression" "duplicate" { target *-*-* } 17 } */
}

int
g2 (void)
{
 return ((INT_MAX + 1) * 0) - 2; /* { dg-warning "warning: integer
overflow in expression" } */
 /* { dg-bogus "integer overflow in expression.*integer overflow in
expression" "duplicate" { target *-*-* } 24 } */
}



On 01/12/06, Janis Johnson <[EMAIL PROTECTED]> wrote:

On Thu, Nov 30, 2006 at 07:25:47PM +, Manuel López-Ibáñez wrote:
> Hi,
>
> PR19978 reports that some overflow warnings are emitted multiple
> times. Like for example,
>
> test.c:6: warning: integer overflow in expression
> test.c:6: warning: integer overflow in expression
> test.c:6: warning: integer overflow in expression
>
> The current testsuite will match any number of those to a single {
> dg-warning }. I don't know whether this is a known limitation, a bug
> on the testsuite or it just needs some magic.

As discussed on IRC, processing of dg-warning and dg-error is done in
code that's part of the DejaGnu project, and it matches all regular
expressions on the line.

> How could I test that exactly one warning was emitted?

Here's a way to treat duplicate messages as errors; the first test case
fails because it has duplicate messages, the second passes.

Janis

/* { dg-do compile } */
#include <limits.h>
int
f (void)
{
  return INT_MAX + 1 - INT_MAX;   /* { dg-bogus "integer overflow in expression.*integer 
overflow in expression" "duplicate" } */
}


/* { dg-do compile } */
#include <limits.h>
int
f (void)
{
  ;   /* { dg-bogus "control reaches end.*control reaches end" "duplicate" } */
}



Re: Announce: MPFR 2.2.1 is released

2006-12-04 Thread DJ Delorie

Joe Buck <[EMAIL PROTECTED]> writes:
> The ordinary user who builds gcc from source does not need *any*
> version of automake, autoconf, etc., so any strict requirements that
> are imposed on these tools have an impact only on gcc developers.

I wish we could have similar requirements for GMP and MPFR, rather
than requiring the user to pre-install them on pretty much EVERY
computer.


Re: Richard Guenther appointed middle-end maintainer

2006-12-04 Thread Eric Botcazou
> Yes, congratulations Richard!

Auf Deutsch, bitte! :-)

-- 
Eric Botcazou


Re: Richard Guenther appointed middle-end maintainer

2006-12-04 Thread Paolo Carlini

Eric Botcazou wrote:

Yes, congratulations Richard!


Auf Deutsch, bitte! :-)
  

Mais oui, mon ami! ;)

Paolo.



words of caution and tuples merge update

2006-12-04 Thread Aldy Hernandez
I have finished a merge from mainline @ 119445 into the tuples branch.  It
bootstraps and exhibits only one regression in libjava:

FAIL: PR27908 -O3 execution - source compiled test

This has been around forever, I've just been lazy and hadn't figured out
how to run java tests individually.  Tom Tromey showed me how, so I should
have this wrapped up shortly.

The plan is to run benchmark tests tonight/tomorrow, post them, and merge
into MAINLINE tomorrow.

For those of you who haven't been keeping up (ahem, most everyone): after
the merge into mainline, it is now invalid to generate MODIFY_EXPRs after
gimplification has run, GIMPLE_MODIFY_STMT is the new thing and its
operands can be accessed with GIMPLE_STMT_OPERAND.  GIMPLE_MODIFY_STMTs
have no type and no TREE_CHAIN.  Otherwise, everything should work as
it does now.

Cheerios.
Aldy





Bootstrap failures with build/genextract

2006-12-04 Thread Art Haas
Hi.

I've been seeing bootstrap failures in my last few attempts to build 
mainline GCC. My last successful build was around November 26. I'm
running on i686-pc-linux-gnu using Debian unstable. My builds are
configured with the following short script:

$ cat gcc_conf.sh
#!/bin/sh

params="-march=pentium2"

CPPFLAGS="-DNDEBUG" CFLAGS="-O2 ${params}" CXXFLAGS="-O2 ${params} 
-fno-check-new" /mnt/src/gcc/configure --prefix=/opt/gnu --enable-shared 
--enable-threads=posix --enable-__cxa_atexit --enable-languages="c,c++,objc" 
--disable-checking --with-system-zlib --with-gc=page --disable-libstdcxx-pch

export _POSIX2_VERSION=199209
make bootstrap-lean BOOT_CFLAGS="-O2 ${params}" CXXFLAGS="-O2 ${params} 
-fno-check-new" > make_out 2>&1

# make -k check > check_out 2>&1
$

The output of my last successful build:

$ gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: /mnt/src/gcc/configure --prefix=/opt/gnu
--enable-shared --enable-threads=posix --enable-__cxa_atexit
--enable-languages=c,c++,objc --disable-checking --with-system-zlib
--with-gc=page --disable-libstdcxx-pch
Thread model: posix
gcc version 4.3.0 20061127 (experimental)

Here's the snippet of the build log where things fail:

[ ... snip ... ]
/usr/src/gcc_svn/objdir-1204/./prev-gcc/xgcc 
-B/usr/src/gcc_svn/objdir-1204/./prev-gcc/ -B/opt/gnu/i686-pc-linux-gnu/bin/   
-O2 -march=pentium2 -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes 
-Wmissing-prototypes -pedantic -Wno-long-long -Wno-variadic-macros 
-Wno-overlength-strings -Wold-style-definition -Wmissing-format-attribute 
-Werror -DHAVE_CONFIG_H -DGENERATOR_FILE  -o build/genextract \
build/genextract.o build/rtl.o build/read-rtl.o build/ggc-none.o 
build/vec.o build/min-insn-modes.o build/gensupport.o build/print-rtl.o 
build/errors.o .././libiberty/libiberty.a
build/genextract /mnt/src/gcc/gcc/config/i386/i386.md \
  insn-conditions.md > tmp-extract.c
/bin/sh: line 1: 20227 Segmentation fault  build/genextract 
/mnt/src/gcc/gcc/config/i386/i386.md insn-conditions.md >tmp-extract.c
make[3]: *** [s-extract] Error 139
make[3]: Leaving directory `/usr/src/gcc_svn/objdir-1204/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/usr/src/gcc_svn/objdir-1204'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory `/usr/src/gcc_svn/objdir-1204'
make: *** [bootstrap-lean] Error 2

I also am seeing similar failures when building on a SMP PIII machine
running Redhat Rawhide, but on an Athlon machine running FC5 the builds
have been completing successfully. The Athlon box has 1G of physical
memory, where the rawhide machine has 256M with 1G swap and my Debian
box has 128M with 100M swap. My most recent build attempt was last
night on the Debian box, and the failure above comes from that build.

Anyone else seeing similar failures? Any ideas as to why things go boom?

Thanks.

Art Haas
-- 
Man once surrendering his reason, has no remaining guard against absurdities
the most monstrous, and like a ship without rudder, is the sport of every wind.

-Thomas Jefferson to James Smith, 1822


Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Richard Guenther

On 12/4/06, Howard Hinnant <[EMAIL PROTECTED]> wrote:

On Dec 4, 2006, at 11:27 AM, Richard Guenther wrote:

> On 12/3/06, Toon Moene <[EMAIL PROTECTED]> wrote:
>> Richard,
>>
>> Somewhere, in a mail lost in my large e-mail clash with my ISP
>> (verizon), you said that gfortran couldn't profit from the pow(x,
>> 1./3.)
>> -> cbrt(x) conversion because gfortran didn't "know" of cbrt.
>>
>> Could you be more explicit about this - I'd like to repair this
>> deficiency, if at all possible.
>>
>> Thanks in advance !
>
> It's a matter of making the cbrt builtin available - I have a patch
> for this,
> but wondered if the fortran frontend can rely on the cbrt library
> call being
> available?  Or available in a fast variant, not a fallback
> implementation in
> libgfortran which does pow (x, 1./3.) which will then of course
> pessimize
> pow (x, 2./3.) -> tmp = cbrt(x); tmp * tmp  expansion.

Is pow(x, 1./3.) == cbrt(x) ?

My handheld calculator says (imagining a 3 decimal digit machine):

pow(64.0, .333) == 3.99

In other words, can pow assume that if it sees .333, that the client
actually meant the non-representable 1/3?  Or must pow assume that
.333 means .333?


It certainly will _not_ recognize 0.333 as 1/3 (or 0.33 as used in
Polyhedron aermod).  Instead it will require the exponent to be
exactly equal to the result of 1./3. in the precision of the exponent,
with correct rounding.


My inclination is that if pow(x, 1./3.) (computed
correctly to the last bit) ever differs from cbrt(x) (computed
correctly to the last bit) then this substitution should not be done.


C99 7.12.7.1 says "The cbrt functions compute the real cube root
of x." and "The cbrt functions return x**1/3".  So it looks to me
that cbrt is required to return the same as pow(x, 1/3).

Richard.


Re: Richard Guenther appointed middle-end maintainer

2006-12-04 Thread Richard Guenther

On 12/4/06, David Edelsohn <[EMAIL PROTECTED]> wrote:

I am pleased to announce that the GCC Steering Committee has
appointed Richard Guenther as non-algorithmic middle-end maintainer.

Please join me in congratulating Richi on his new role.  Richi,
please update your listings in the MAINTAINERS file.


Thanks!  Any objections to add a new section in the MAINTAINERS file
like below?

Richard.

2006-12-04  Richard Guenther  <[EMAIL PROTECTED]>

   * MAINTAINERS (Non-Algorithmic Maintainers): New section.
   (Non-Algorithmic Maintainers): Move over non-algorithmic
   loop optimizer maintainers, add myself as a non-algorithmic
   middle-end maintainer.




Re: Richard Guenther appointed middle-end maintainer

2006-12-04 Thread Richard Guenther

On 12/4/06, Richard Guenther <[EMAIL PROTECTED]> wrote:

On 12/4/06, David Edelsohn <[EMAIL PROTECTED]> wrote:
> I am pleased to announce that the GCC Steering Committee has
> appointed Richard Guenther as non-algorithmic middle-end maintainer.
>
> Please join me in congratulating Richi on his new role.  Richi,
> please update your listings in the MAINTAINERS file.

Thanks!  Any objections to add a new section in the MAINTAINERS file
like below?


Committed after "ok" on IRC.

Richard.


2006-12-04  Richard Guenther  <[EMAIL PROTECTED]>

* MAINTAINERS (Non-Algorithmic Maintainers): New section.
(Non-Algorithmic Maintainers): Move over non-algorithmic
loop optimizer maintainers, add myself as a non-algorithmic
middle-end maintainer.





Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Howard Hinnant

On Dec 4, 2006, at 4:57 PM, Richard Guenther wrote:




My inclination is that if pow(x, 1./3.) (computed
correctly to the last bit) ever differs from cbrt(x) (computed
correctly to the last bit) then this substitution should not be done.


C99 7.12.7.1 says "The cbrt functions compute the real cube root
of x." and "The cbrt functions return x**1/3".  So it looks to me
that cbrt is required to return the same as pow(x, 1/3).


 For me, this:

#include <stdio.h>
#include <math.h>

int main()
{
printf("pow(1000000., 1./3.) = %a\ncbrt(1000000.)   = %a\n",
pow(1000000., 1./3.), cbrt(1000000.));

}

prints out:

pow(1000000., 1./3.) = 0x1.8ffffffffffffep+6
cbrt(1000000.)       = 0x1.9p+6

I suspect that both are correct, rounded to the nearest least  
significant bit.  Admittedly I haven't checked the computation by  
hand for pow(100., 1./3.).  But I did the computation using gcc  
4.0 on both PPC and Intel Mac hardware, and on PPC using CodeWarrior  
math libs, and got the same results on all three platforms.


The pow function is not raising 1,000,000 to the power of 1/3.  It is  
raising 1,000,000 to some power which is very close to, but not equal  
to 1/3.


Perhaps I've misunderstood and you have some way of exactly  
representing the fraction 1/3 in Gfortran.  In C and C++ we have no  
way to exactly represent that fraction except implicitly using cbrt.  
(or with a user-defined rational type)


-Howard



Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Richard Guenther

On 12/4/06, Howard Hinnant <[EMAIL PROTECTED]> wrote:

On Dec 4, 2006, at 4:57 PM, Richard Guenther wrote:

>
>> My inclination is that if pow(x, 1./3.) (computed
>> correctly to the last bit) ever differs from cbrt(x) (computed
>> correctly to the last bit) then this substitution should not be done.
>
> C99 7.12.7.1 says "The cbrt functions compute the real cube root
> of x." and "The cbrt functions return x**1/3".  So it looks to me
> that cbrt is required to return the same as pow(x, 1/3).

 For me, this:

#include <stdio.h>
#include <math.h>

int main()
{
 printf("pow(1000000., 1./3.) = %a\ncbrt(1000000.)   = %a\n",
pow(1000000., 1./3.), cbrt(1000000.));
}

prints out:

pow(1000000., 1./3.) = 0x1.8ffffffffffffep+6
cbrt(1000000.)       = 0x1.9p+6

I suspect that both are correct, rounded to the nearest least
significant bit.  Admittedly I haven't checked the computation by
hand for pow(100., 1./3.).  But I did the computation using gcc
4.0 on both PPC and Intel Mac hardware, and on PPC using CodeWarrior
math libs, and got the same results on all three platforms.

The pow function is not raising 1,000,000 to the power of 1/3.  It is
raising 1,000,000 to some power which is very close to, but not equal
to 1/3.

Perhaps I've misunderstood and you have some way of exactly
representing the fraction 1/3 in Gfortran.  In C and C++ we have no
way to exactly represent that fraction except implicitly using cbrt.
(or with a user-defined rational type)


1./3. is represented (round-to-nearest) as 0x1.5555555555555p-2
and pow (x, 1./3.) is of course not the same as the cube-root of x with
exact arithmetic.  cbrt as defined by C99 suggests that an
approximation by pow (x, 1./3.) fulfils the requirements.  The question
is whether a correctly rounded "exact" cbrt differs from the pow
replacement by more than 1ulp - it looks like this is not the case.

For cbrt (1000000.) I get the same result as from pow (1000000., nextafter (1./3., 1))
for example.  So, instead of only recognizing 1./3. rounded to nearest
we should recognize both representable numbers that are nearest to
1./3..

Richard.


Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Joseph S. Myers
On Tue, 5 Dec 2006, Richard Guenther wrote:

> is whether a correctly rounded "exact" cbrt differs from the pow
> replacement by more than 1ulp - it looks like this is not the case.

They fairly obviously differ for negative arguments, which are valid for 
cbrt but not for pow (raising to a fraction with even denominator).  (The 
optimization from pow to cbrt is valid if you don't care about no longer 
getting a NaN from a negative argument.  Converting the other way (cbrt to 
pow) is only OK if you don't care about negative arguments to cbrt at 
all.)

-- 
Joseph S. Myers
[EMAIL PROTECTED]


Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Richard Guenther

On 12/5/06, Joseph S. Myers <[EMAIL PROTECTED]> wrote:

On Tue, 5 Dec 2006, Richard Guenther wrote:

> is whether a correctly rounded "exact" cbrt differs from the pow
> replacement by more than 1ulp - it looks like this is not the case.

They fairly obviously differ for negative arguments, which are valid for
cbrt but not for pow (raising to a fraction with even denominator).  (The
optimization from pow to cbrt is valid if you don't care about no longer
getting a NaN from a negative argument.  Converting the other way (cbrt to
pow) is only OK if you don't care about negative arguments to cbrt at
all.)


True, F.9.4.4 says "pow (x, y) returns a NaN and raises the invalid
floating-point
exception for finite x < 0 and finite non-integer y".  I'll adjust the expander
to cover these cases by conditionalizing on tree_nonnegative_p or
HONOR_NANS.  I will probably also require flag_unsafe_math_optimizations
as we else will optimize cbrt (x) - pow (x, 1./3.) to zero.  Or even
pow (x, 1./3.) - pow (x, nextafter (1./3, 1)).

Richard.


const and strict aliasing rules

2006-12-04 Thread John L. Kulp
I have a situation where I have a class that internally maintains a 
container object of non-const objects that it wants to publish to 
clients as a const container of const objects, that is, clients can't 
modify the list or the items in the list.  The data member wants to be 
non-const because the managing class needs to create and delete the foo 
objects.


data member:
   PtrList listOfFoo;

publishing member functions:
const PtrList &GetFooList () const {
   return reinterpret_cast&>(listOfFoo);
}

This results in the warning:
"type punning to incomplete type might break strict aliasing rules"
because the types aren't exactly the same with regard to cv qualifiers.  
While I understand the general issue of aliasing incompatible types, 
simply adding const qualifiers shouldn't provoke aliasing warnings, as I 
believe it is a reasonable thing to want to do, at least in the 
direction of adding const.  I understand it is probably hard to detect 
this in situations like the above.  Perhaps there is a need for the 
inverse to the const_cast construct, namely, add const rather than 
remove it.


You can work around this by using union's of pointers of both non-const 
and const types, but the reinterpret_cast solution would be more attractive.


Thoughts?






Re: const and strict aliasing rules

2006-12-04 Thread Andrew Pinski
> You can work around this by using union's of pointers of both non-const 
> and const types, but the reinterpret_cast solution would be more attractive.
> 
> Thoughts?

This has nothing to do with const vs non-const but rather
a and a are two separate types which are not related in any way.
The C++ standard defines these two types as separate types and are not
compatible
in any way for aliasing.

-- Pinski


Re: const and strict aliasing rules

2006-12-04 Thread Gabriel Dos Reis
Andrew Pinski <[EMAIL PROTECTED]> writes:

| > You can work around this by using union's of pointers of both non-const 
| > and const types, but the reinterpret_cast solution would be more attractive.
| > 
| > Thoughts?
| 
| This has nothing to do with const vs non-const but rather
| a and a are two separate types which are not related in any
way.
| The C++ standard defines these two types as separate types and are not
compatible
| in any way for aliasing.

???

3.10/15:

   If a program attempts to access the stored value of an object
   through an lvalue of other than one of the following types the
   behavior is undefined48): 
   -- the dynamic type of the object,

   -- *a cv-qualified version of the dynamic type of the object*,

   -- a type that is the signed or unsigned type corresponding to the
  dynamic type of the object,

   -- a type that is the signed or unsigned type corresponding to a
  cv-qualified version of the dynamic type of the object,

   -- an aggregate or union type that includes one of the
  aforementioned types among its members (including,
  recursively, a member of a subaggregate or contained union),

   -- a type that is a (possibly cv-qualified) base class type of the
  dynamic type of the object,

   -- a char or unsigned char type.

-- Gaby


Re: const and strict aliasing rules

2006-12-04 Thread Andrew Pinski
> 
> Andrew Pinski <[EMAIL PROTECTED]> writes:
> 
> | > You can work around this by using union's of pointers of both non-const 
> | > and const types, but the reinterpret_cast solution would be more 
> attractive.
> | > 
> | > Thoughts?
> | 
> | This has nothing to do with const vs non-const but rather
> | a and a are two separate types which are not related in
> any way.
> | The C++ standard defines these two types as separate types and are not
> compatible
> | in any way for aliasing.
> 
> ???
> 
> 3.10/15:

And the template type a is distinct from a because the template
arguments are different, and therefore the qualifier part of 3.10/15 does not
apply.
If it was const a and a then it would apply.

-- Pinski


Re: [PATCH] Canonical types (1/3)

2006-12-04 Thread Mark Mitchell
Doug Gregor wrote:
> This patch introduces canonical types into GCC, which allow us to
> compare two types very efficiently and results in an overall
> compile-time performance improvement. I have been seeing 3-5%
> improvements in compile time on the G++ and libstdc++ test suites,
> 5-10% on template-heavy (but realistic) code in Boost, and up to 85%
> improvement for extremely template-heavy metaprogramming.

The new macros in tree.h (TYPE_CANONICAL and TYPE_STRUCTURAL_EQUALITY)
need documentation, at least in tree.h, and, ideally, in the ill-named
c-tree.texi as well.

I want to make sure I understand this idiom, in
build_pointer_type_for_mode, and elsewhere:

+  if (TYPE_CANONICAL (to_type) != to_type)
+TYPE_CANONICAL (t) =
+  build_pointer_type_for_mode (TYPE_CANONICAL (to_type),
+  mode, can_alias_all);

If there was already a pointer type to the canonical type of to_type,
then the call build_pointer_type_for_mode will return it.  If there
wasn't, then we will build a new canonical type for that pointer type.
We can't use the pointer type we're building now (i.e., "T") as the
canonical pointer type because we would have no way to find it in
future, when creating another pointer type for the canonical version of
to_type.

So, we are actually creating more type nodes in this case.  That seems
unfortunate, though I fully understand we're intentionally trading space
for speed just by adding the new type fields.  A more dramatic version
of your change would be to put the new pointer type on the
TYPE_POINTER_TO list for the canonical to_type, make it the canonical
pointer type, and then have the build_pointer_type_for_mode always go to
the canonical to_type to search TYPE_POINTER_TO, considering types to be
an exact match only if they had more fields in common (like, TYPE_NAME
and TYPE_CONTEXT, say).  Anyhow, your approach is fine, at least for now.

+  TYPE_STRUCTURAL_EQUALITY (t) = TYPE_STRUCTURAL_EQUALITY (to_type);

Does it ever make sense to have both TYPE_CANONICAL and
TYPE_STRUCTURAL_EQUALITY set?  If we have to do the structural equality
test, then it seems to me that the canonical type isn't useful, and we
might as well not construct it.

> +  type = build_variant_type_copy (orig_type);
>TYPE_ALIGN (type) = boundary;
> +  TYPE_CANONICAL (type) = TYPE_CANONICAL (orig_type);

Eek.  So, despite having different alignments, we consider these types
"the same"?  If that's what we already do, then it's OK to preserve that
behavior, but it sure seems worrisome.

I'm going to review patch 2/3 here too, since I don't think we should
add the fields in patch 1 until we have something that can actually take
advantage of them; otherwise, we'd just be wasting (more) memory.

+  else if (strict == COMPARE_STRUCTURAL)
+return structural_comptypes (t1, t2, COMPARE_STRICT);

Why do we ever want the explicit COMPARE_STRUCTURAL?

+static hashval_t
+cplus_array_hash (const void* k)
+{
+  hashval_t hash;
+  tree t = (tree) k;
+
+  hash = (htab_hash_pointer (TREE_TYPE (t))
+ ^ htab_hash_pointer (TYPE_DOMAIN (t)));
+
+  return hash;
+}

Since this hash function is dependent on pointer values, we'll get
different results on different hosts.  I was worried that will lead to
differences in generated debug information, perhaps due to different
TYPE_UIDs -- but it looks like there is only ever one matching entry in
the table, so we never have to worry about the compiler "randomly"
choosing between two equally good choices?

Have you tested with flag_check_canonical_types on, and verified that
you get no warnings?  (I agree that a --param for that would be better;
if a user ever has to turn this on, we broke the compiler.)

Thanks,

-- 
Mark Mitchell
CodeSourcery
[EMAIL PROTECTED]
(650) 331-3385 x713


Bizarre inlining type promotion effect

2006-12-04 Thread Shaun Jackman

In the code snippet below, the function mul_8_8 compiles to use
exactly one `mul' instruction on the AVR. The function mul_16_8 calls
mul_8_8 twice. If mul_8_8 is a static inline function and inlined in
mul_16_8, each call generates three `mul' instructions! Why does
inlining mul_8_8 cause each 8x8 multiplication to be promoted to a
16x16 multiplication?

It seems that the inlining mechanism has a real bug if inlining can
cause such a major change in the code generated for a given function.

Cheers,
Shaun

$ avr-gcc --version |head -1
avr-gcc (GCC) 4.1.0
$ cat mul.c
#include <stdint.h>

static uint16_t mul_8_8(uint8_t a, uint8_t b)
{
return a * b;
}

uint32_t mul_16_8(uint16_t a, uint8_t b)
{
uint8_t a0 = a, a1 = a >> 8;
return ((uint32_t)mul_8_8(a1, b) << 8) + mul_8_8(a0, b);
}
$ avr-gcc -c -g -O2 -mmcu=avr4 mul.c
$ avr-objdump -d mul.o

mul.o: file format elf32-avr

Disassembly of section .text:

00000000 <mul_8_8>:
  0:86 9f   mul r24, r22
  2:c0 01   movwr24, r0
  4:11 24   eor r1, r1
  6:08 95   ret

00000008 <mul_16_8>:
  8:bf 92   pushr11
  a:cf 92   pushr12
  c:df 92   pushr13
  e:ef 92   pushr14
 10:ff 92   pushr15
 12:0f 93   pushr16
 14:1f 93   pushr17
 16:6c 01   movwr12, r24
 18:b6 2e   mov r11, r22
 1a:8d 2d   mov r24, r13
 1c:99 27   eor r25, r25
 1e:f0 df   rcall   .-32; 0x0 <mul_8_8>
 20:7c 01   movwr14, r24
 22:00 27   eor r16, r16
 24:11 27   eor r17, r17
 26:10 2f   mov r17, r16
 28:0f 2d   mov r16, r15
 2a:fe 2c   mov r15, r14
 2c:ee 24   eor r14, r14
 2e:6b 2d   mov r22, r11
 30:8c 2d   mov r24, r12
 32:e6 df   rcall   .-52; 0x0 <mul_8_8>
 34:aa 27   eor r26, r26
 36:bb 27   eor r27, r27
 38:e8 0e   add r14, r24
 3a:f9 1e   adc r15, r25
 3c:0a 1f   adc r16, r26
 3e:1b 1f   adc r17, r27
 40:c8 01   movwr24, r16
 42:b7 01   movwr22, r14
 44:1f 91   pop r17
 46:0f 91   pop r16
 48:ff 90   pop r15
 4a:ef 90   pop r14
 4c:df 90   pop r13
 4e:cf 90   pop r12
 50:bf 90   pop r11
 52:08 95   ret
$ sed -i 's/static/& inline/' mul.c
$ avr-gcc -c -g -O2 -mmcu=avr4 mul.c
$ avr-objdump -d mul.o

mul.o: file format elf32-avr

Disassembly of section .text:

00000000 <mul_16_8>:
  0:ac 01   movwr20, r24
  2:26 2f   mov r18, r22
  4:33 27   eor r19, r19
  6:89 2f   mov r24, r25
  8:99 27   eor r25, r25
  a:82 9f   mul r24, r18
  c:b0 01   movwr22, r0
  e:83 9f   mul r24, r19
 10:70 0d   add r23, r0
 12:92 9f   mul r25, r18
 14:70 0d   add r23, r0
 16:11 24   eor r1, r1
 18:88 27   eor r24, r24
 1a:99 27   eor r25, r25
 1c:98 2f   mov r25, r24
 1e:87 2f   mov r24, r23
 20:76 2f   mov r23, r22
 22:66 27   eor r22, r22
 24:55 27   eor r21, r21
 26:f9 01   movwr30, r18
 28:e4 9f   mul r30, r20
 2a:90 01   movwr18, r0
 2c:e5 9f   mul r30, r21
 2e:30 0d   add r19, r0
 30:f4 9f   mul r31, r20
 32:30 0d   add r19, r0
 34:11 24   eor r1, r1
 36:44 27   eor r20, r20
 38:55 27   eor r21, r21
 3a:62 0f   add r22, r18
 3c:73 1f   adc r23, r19
 3e:84 1f   adc r24, r20
 40:95 1f   adc r25, r21
 42:08 95   ret


Re: Bizarre inlining type promotion effect

2006-12-04 Thread Shaun Jackman

On 12/4/06, Shaun Jackman <[EMAIL PROTECTED]> wrote:

In the code snippet below, the function mul_8_8 compiles to use
exactly one `mul' instruction on the AVR. The function mul_16_8 calls
mul_8_8 twice. If mul_8_8 is a static inline function and inlined in

...

For comparison, a hand-coded 16x8 multiply function requires 11 instructions.

Cheers,
Shaun

mul_16_8:
mul r25, r22
mov r23, r0
mov r25, r1
mul r24, r22
eor r24, r24
mov r22, r0
add r23, r1
adc r24, r25
eor r25, r25
eor r1, r1
ret


Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Howard Hinnant

On Dec 4, 2006, at 6:08 PM, Richard Guenther wrote:


The question
is whether a correctly rounded "exact" cbrt differs from the pow
replacement by more than 1ulp - it looks like this is not the case.


If that is the question, I'm afraid your answer is not accurate.  In  
the example I showed the difference is 2 ulp.  The difference appears  
to grow with the magnitude of the argument.  On my systems, when the  
argument is DBL_MAX, the difference is 75 ulp.


pow(DBL_MAX, 1./3.) = 0x1.428a2f98d7240p+341
cbrt(DBL_MAX)   = 0x1.428a2f98d728bp+341

And yes, I agree with you about the C99 standard.  It allows the  
vendor to compute pretty much any answer it wants from either pow or  
cbrt.  Accuracy is not mandated.  And I'm not trying to mandate  
accuracy for Gfortran either.  I just had a knee jerk reaction when I  
read that pow(x, 1./3.) could be optimized to cbrt(x) (and on re- 
reading, perhaps I inferred too much right there).  This isn't just  
an optimization.  It is also an approximation.  Perhaps that is  
acceptable.  I'm only highlighting the fact in case it might be  
important but not recognized.


-Howard



Re: const and strict aliasing rules

2006-12-04 Thread Gabriel Dos Reis
Andrew Pinski <[EMAIL PROTECTED]> writes:

| > 
| > Andrew Pinski <[EMAIL PROTECTED]> writes:
| > 
| > | > You can work around this by using union's of pointers of both non-const 
| > | > and const types, but the reinterpret_cast solution would be more 
attractive.
| > | > 
| > | > Thoughts?
| > | 
| > | This has nothing to do with const vs non-const but rather
| > | a and a are two seperate types which are not related in 
any way.
| > | The C++ standard defines these two types as seperate types and are not 
compatiable
| > | in any way for aliasing.
| > 
| > ???
| > 
| > 3.10/15:
| 
| And the template type a is distinct from a because the
template
| arguments are different, and therefore the qualifier part of 3.10/15 does not
apply.
| If it was const a and a then it would apply.

Yes, you're right -- somehow I missed the angle brackets.  Sorry.

-- Gaby


Re: Announce: MPFR 2.2.1 is released

2006-12-04 Thread Vincent Lefevre
On 2006-12-04 15:34:32 -0500, DJ Delorie wrote:
> Joe Buck <[EMAIL PROTECTED]> writes:
> > The ordinary user who builds gcc from source does not need *any*
> > version of automake, autoconf, etc., so any strict requirements that
> > are imposed on these tools have an impact only on gcc developers.
> 
> I wish we could have similar requirements for GMP and MPFR, rather
> than requiring the user to pre-install them on pretty much EVERY
> computer.

Do you mean that gcc should be distributed with GMP and MPFR libraries
in the tarball? (There had been a discussion about including them or
not in the Subversion repository, but I haven't seen one concerning
the tarballs.)

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)


Re: Announce: MPFR 2.2.1 is released

2006-12-04 Thread DJ Delorie

> > I wish we could have similar requirements for GMP and MPFR, rather
> > than requiring the user to pre-install them on pretty much EVERY
> > computer.
> 
> Do you mean that gcc should be distributed with GMP and MPFR libraries
> in the tarball? (There had been a discussion about including them or
> not in the Subversion repository, but I haven't seen one concerning
> the tarballs.)

Personally?  At this point, I'm thinking it's the only sane thing to
do.  There is currently NO platform that ships with the right
libraries to build ANY gcc language.  Do we have a guarantee that MPFR
will build with all native toolchains?  Do we really want to impose
this requirement on *all* users, not just maintainers?  GCC used to be
one of the things you could install first, at least for C, now it's
way down the list.

At the very least, we should be configured so that we *could* have an
in-tree mpfr, should vendors choose to add it.  Saving customers the
misery of figuring out how to build and install gmp/mpfr is the type
of value add they'd appreciate.


Re: Announce: MPFR 2.2.1 is released

2006-12-04 Thread Kaveh R. GHAZI
On Mon, 4 Dec 2006, Joe Buck wrote:

> On Sat, Dec 02, 2006 at 12:01:45PM -0500, Kaveh R. GHAZI wrote:
> > Hi Vincent, thanks for making this release.  Since this version of mpfr
> > fixes important bugs encountered by GCC, I've updated the gcc
> > documentation and error messages to refer to version 2.2.1.
> >
> > I have NOT (yet) updated gcc's configure to force the issue.  I'll wait a
> > little while to let people upgrade.
>
> Kaveh,
>
> IMHO, you should *never* update gcc's configure to force the issue.  To do
> so would be unprecedented.
>
> configure doesn't refuse to build gcc with older binutils versions, even
> though those versions cause some tests to fail that pass with newer
> versions.  Similarly, people aren't forced to upgrade their glibc
> because some tests fail with older versions.
>
> In my view, the only time configure should fail because of an
> old library version is if going ahead with the build would produce a
> completely nonfunctional compiler.  I wouldn't care if a warning message
> is generated.

Some people have argued we should wait until stage3 because upgrading
gmp/mpfr on a lot of machines is a pain in the butt.  I sympathize and
agree, however I worry that if we wait until then and then see that
mpfr-2.2.1 introduces a problem we won't find out until very late in the
release process.  My philosophy is we should test ASAP (now) what we
intend to ship later on.

OTOH, Joe you're arguing we should never require people to upgrade.  Well
I think that's unfair to people who rely on gcc to produce correct code.

Yes, I know *all* compilers have bugs.  But these are known, fixed bugs (in
mpfr) that you're essentially saying we shouldn't ever "fix" by raising the
minimum required library version.  I think it's unworkable to freeze our
gmp/mpfr requirements for all time.  Once more in stage3 might be
acceptable, but frozen forever is too extreme IMHO.

With all modesty in mind, I foresaw these problems when I started this
project.  My initial gut instinct was to include the gmp/mpfr sources in
the tree and have the top level configure build them.  That way we could
ship gcc with the latest greatest sources for these libraries and avoid
pain for anyone (gcc developers or users) who want to build gcc.  We could
import fixes to the libs without disrupting gcc work.  No one would have
to propagate or install the libraries on their test machines.

That idea got nixed, but I think it's time to revisit it.  Paolo has
worked out the kinks in the configury and we should apply his patch and
import the gmp/mpfr sources, IMHO.

I believe then these problems go away.

--Kaveh
--
Kaveh R. Ghazi  [EMAIL PROTECTED]





Re: Announce: MPFR 2.2.1 is released

2006-12-04 Thread Joe Buck
On Mon, Dec 04, 2006 at 09:32:19PM -0500, Kaveh R. GHAZI wrote:
> OTOH, Joe you're arguing we should never require people to upgrade.  Well
> I think that's unfair to people who rely on gcc to produce correct code.

So, should we detect old binutils versions and refuse to build as well?
How about old C library versions?

I suggest warning that there will be bugs, and proceeding.



Re: Gfortran and using C99 cbrt for X ** (1./3.)

2006-12-04 Thread Brooks Moses

Howard Hinnant wrote:

On Dec 4, 2006, at 6:08 PM, Richard Guenther wrote:


The question
is whether a correctly rounded "exact" cbrt differs from the pow
replacement by more than 1ulp - it looks like this is not the case.


If that is the question, I'm afraid your answer is not accurate.  In
the example I showed, the difference is 2 ulp.  The difference appears
to grow with the magnitude of the argument.  On my systems, when the
argument is DBL_MAX, the difference is 75 ulp.


pow(DBL_MAX, 1./3.) = 0x1.428a2f98d7240p+341
cbrt(DBL_MAX)   = 0x1.428a2f98d728bp+341


Another relevant question is whether it's monotonic, in the sense that 
cbrt(DBL_MAX) is less than pow(DBL_MAX, 1./3. + 1ulp).  If it's not, 
that could potentially be trouble.


And yes, I agree with you about the C99 standard.  It allows the  
vendor to compute pretty much any answer it wants from either pow or  
cbrt.  Accuracy is not mandated.  And I'm not trying to mandate  
accuracy for Gfortran either.  I just had a knee jerk reaction when I  
read that pow(x, 1./3.) could be optimized to cbrt(x) (and on
re-reading, perhaps I inferred too much right there).  This isn't just
an optimization.  It is also an approximation.  Perhaps that is  
acceptable.  I'm only highlighting the fact in case it might be  
important but not recognized.


I think it's important (at least when -fwrong-math ... er, I mean, 
-ffast-math is not specified) that pow(x, 1./3.) return the same value 
as pow(x, y) when y has been set to 1./3. via some process at runtime.


However, this approximation is probably reasonable for -ffast-math.

- Brooks