Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?

2008-05-10 Thread Martin Guy
On 5/9/08, Paolo Bonzini <[EMAIL PROTECTED]> wrote:
>  The idea is to use integer arithmetic to compute the right exponent, and
> the lookup table to estimate the mantissa.  I used something like this for
> square root:
>
>  1) shift the entire FP number by 1 to the right (logical right shift)
>  2) sum 0x2000 so that the exponent is still offset by 64
>  3) extract the 8 bits from 14 to 22 and look them up in a 256-entry, 32-bit
> table
>  4) sum the value (as a 32-bit integer!) with the content of the table
>  5) perform 2 Newton-Raphson iterations as necessary

It normally turns out to be faster to use the magic integer sqrt
algorithm, even when you have multiplication and division in hardware:

unsigned long
isqrt(unsigned long x)
{
    unsigned long op, res, one;

    op = x;
    res = 0;

    /* "one" starts at the highest power of four <= the argument
       (this assumes the argument fits in 32 bits). */
    one = 1UL << 30;  /* second-to-top bit set */
    while (one > op) one >>= 2;

    while (one != 0) {
        if (op >= res + one) {
            op = op - (res + one);
            res = res + 2 * one;
        }
        res >>= 1;
        one >>= 2;
    }
    return res;
}

The current soft-fp routine in libm seems to use a variant of this,
but of course it may be faster if implemented using the Maverick's
64-bit add/sub/cmp.

M


Re: RFH: Building and testing gimple-tuples-branch

2008-05-10 Thread Diego Novillo

On 5/9/08 4:32 PM, Kaz Kojima wrote:


> *  config/sh/sh.c (sh_gimplify_va_arg_expr): Change pre_p and
> post_p types to gimple_seq *.


Thanks.  This is certainly OK.


Diego.


Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?

2008-05-10 Thread Andrew Haley
Paolo Bonzini wrote:
> 
>> I'd like to implement something similar for MaverickCrunch, using the
>> integer 32-bit MAC functions, but there is no reciprocal estimate
>> function on the MaverickCrunch.  I guess a lookup table could be
>> implemented, but how many entries will need to be generated, and how
>> accurate will it have to be to be IEEE754 compliant (in the swdiv routine)?
> 
> I think sh does something like that.  It is quite a mess, as it has half
> a dozen ways to implement division.
> 
> The idea is to use integer arithmetic to compute the right exponent, and
> the lookup table to estimate the mantissa.  I used something like this
> for square root:
> 
> 1) shift the entire FP number by 1 to the right (logical right shift)
> 2) sum 0x2000 so that the exponent is still offset by 64
> 3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
> 32-bit table
> 4) sum the value (as a 32-bit integer!) with the content of the table
> 5) perform 2 Newton-Raphson iterations as necessary

To avoid the lookup table, calculate

   x = (a/2) + (8^(1/4) - 1)^2

which gives relative errors less than 0.036 over the range 1/2 <= a <= 2
at a cost of one shift and one addition.  The errors after 1, 2, 3, and 4
iterations of Heron's rule are 0.64E-3, 0.204E-6, 0.211E-13, and 0.222E-27.

So, this requires one more iteration but avoids the use of a table and the
corresponding memory hit.

Source: Computer Approximations, Hart et al.
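
As a rough illustration of the scheme above (a table-free square-root
seed followed by Heron's rule), here is a minimal C sketch.  The names
sqrt_seed, heron_sqrt and SEED_OFFSET are made up for this example, the
constant is (8^(1/4) - 1)^2 rounded to eight decimal places, and it uses
plain double arithmetic rather than anything MaverickCrunch-specific:

```c
#include <assert.h>

/* (8^(1/4) - 1)^2, rounded to eight decimals (illustrative constant). */
#define SEED_OFFSET 0.46484146

/* Initial estimate x0 = a/2 + SEED_OFFSET, intended for 1/2 <= a <= 2. */
static double sqrt_seed(double a)
{
    assert(a >= 0.5 && a <= 2.0);   /* the range quoted for the error bound */
    return 0.5 * a + SEED_OFFSET;
}

/* Refine the seed with n steps of Heron's rule: x <- (x + a/x) / 2. */
static double heron_sqrt(double a, int n)
{
    double x = sqrt_seed(a);
    int i;

    for (i = 0; i < n; i++)
        x = 0.5 * (x + a / x);
    return x;
}
```

With four iterations, heron_sqrt(2.0, 4) should agree with sqrt(2) to
about the limit of double precision.  Arguments outside [1/2, 2] would
first need their exponent reduced into that range, which this sketch
does not attempt.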

Andrew.


Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?

2008-05-10 Thread Andrew Haley
Andrew Haley wrote:
> Paolo Bonzini wrote:
>>> I'd like to implement something similar for MaverickCrunch, using the
>>> integer 32-bit MAC functions, but there is no reciprocal estimate
>>> function on the MaverickCrunch.  I guess a lookup table could be
>>> implemented, but how many entries will need to be generated, and how
>>> accurate will it have to be to be IEEE754 compliant (in the swdiv routine)?
>> I think sh does something like that.  It is quite a mess, as it has half
>> a dozen ways to implement division.
>>
>> The idea is to use integer arithmetic to compute the right exponent, and
>> the lookup table to estimate the mantissa.  I used something like this
>> for square root:
>>
>> 1) shift the entire FP number by 1 to the right (logical right shift)
>> 2) sum 0x2000 so that the exponent is still offset by 64
>> 3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
>> 32-bit table
>> 4) sum the value (as a 32-bit integer!) with the content of the table
>> 5) perform 2 Newton-Raphson iterations as necessary
> 
> To avoid the lookup table, calculate
> 
>    x = (a/2) + (8^(1/4) - 1)^2
> 
> which gives relative errors less than 0.036 over the range 1/2 <= a <= 2
> at a cost of one shift and one addition.  The errors after 1,2,3, and 4
> iterations of Heron's rule are 0.64E-3, 0.204E-6, 0.211E-13, and 0.222E-27.
> 
> So, this requires one more iteration but avoids the use of a table and the
> corresponding memory hit.
> 
> Source: Computer Approximations, Hart et al.

Sorry, a context switch: this is for sqrt, not division.  Brain fade.

Andrew.


How do I add target specific tests?

2008-05-10 Thread Andy H

I want to add target specific tests for AVR.

These would be testcases for PRs that fail because of AVR back end
problems, rather than testcases for generic PRs.


Do I just add them to directory testsuite/gcc.target/avr? Or are there 
some other configuration steps needed?


Andy



Re: Deprecation?!

2008-05-10 Thread Alexandre Oliva
On May  9, 2008, Dave Higginbotham <[EMAIL PROTECTED]> wrote:

> I'm getting a " warning: deprecated conversion from string constant to
> ‘char*’" message in g++ (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7).

> I've always understood there is no such thing as deprecation in C++ (and
> have been proud of this concept). What gives?

In pre-standard versions of C++, (narrow) string literals had type
char[n], as in C, so they could decay to char*.  As of the first C++
standard, such string literals have type char const[n], so they decay
to char const*.  For backward compatibility, [conv.array]/2 in C++98
specifies a deprecated implicit conversion from string literals to
char*.

> I've always understood there is no such thing as deprecation in C++

Annex D in the C++ Standard specifies a few other deprecated
pre-standard language (mis?)features, defining deprecated as
"Normative for the current edition of the Standard, but not guaranteed
to be part of the Standard in future revisions."

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}
FSFLA Board Member   ¡Sé Libre! => http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}


Re: How do I add target specific tests?

2008-05-10 Thread Alexandre Oliva
On May 10, 2008, Andy H <[EMAIL PROTECTED]> wrote:

> These would be testcases for PRs that fail because of AVR back end
> problems, rather than testcases for generic PRs.

> Do I just add them to directory testsuite/gcc.target/avr? Or are there
> some other configuration steps needed?

You'll want to create a .exp file that exits right away if the target
is not what you want, and that runs the testing machinery otherwise.
Look for `istarget' in testsuite/gcc.target/*/*.exp for various
examples.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
Free Software Evangelist  [EMAIL PROTECTED], gnu.org}
FSFLA Board Member   ¡Sé Libre! => http://www.fsfla.org/
Red Hat Compiler Engineer   [EMAIL PROTECTED], gcc.gnu.org}


inline assembly question (memory side-effects)

2008-05-10 Thread Till Straumann

Hi.

What is the proper way to tell gcc that an
inline assembly statement either modifies
a particular area of memory or needs it
to be updated/in-sync because the assembly
reads from it?

E.g., assume I have a

struct blah {
   int sum;
 ...
};

which is accessed by my assembly code.

int my_chksum(struct blah *blah_p)
{
  blah_p->sum = 0;
  /* assembly code computes checksum, writes to blah_p->sum */
  asm volatile("..."::"b"(blah_p));
  return blah_p->sum;
}

How can I tell gcc that the object *blah_p
is accessed (read and/or written to) by the
asm, without declaring the object volatile or
using a general "memory" clobber?
(If I don't take any measures then gcc won't know
that blah_p->sum is modified by the assembly and
will optimize the load operation away and always
return 0.)

A while ago (see this thread
http://gcc.gnu.org/ml/gcc/2008-03/msg00976.html)
I was told that simply adding the affected
memory area in question as memory ('m') output-
or input-operands is not appropriate -- note that
this is in contradiction to what the gcc info page
suggests (section about 'extended asm':

 "If you know how large the accessed memory
  is, you can add it as input or output..."

along with an example).
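
For reference, the form that manual passage describes looks roughly
like the sketch below.  It only illustrates the syntax under
discussion; whether it is sufficient is exactly the open question.
The empty asm template, the "payload" member and the generic "r"
constraint (in place of "b") are placeholders chosen so that the
fragment compiles on any target:

```c
#include <assert.h>

struct blah {
    int sum;
    int payload[4];   /* hypothetical stand-in for the elided members */
};

int my_chksum(struct blah *blah_p)
{
    assert(blah_p != 0);
    blah_p->sum = 0;
    /* "+m"(*blah_p) names the whole object as a memory operand that the
       asm both reads and writes, so the compiler may not assume that
       blah_p->sum still holds 0 afterwards.  The template is empty here;
       real checksum code would go in its place. */
    __asm__ volatile("" : "+m"(*blah_p) : "r"(blah_p));
    return blah_p->sum;
}
```

Because the placeholder asm does nothing, this version simply returns
the 0 it stored; the point is only how the operand is spelled.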

Could somebody please clarify?  (Please don't suggest how
this particular example could be improved, but try to
answer the fundamental question.)

Thanks
-- Till

PS: please CC me as I'm not subscribed to the gcc list


[Windows] Fixing fprintf errors breaking bootstrap?

2008-05-10 Thread Aaron W. LaFramboise
I am still seeing errors such as this when bootstrapping trunk with
-Werror.  I thought all of this was supposed to be resolved?


../../svn/gcc/bt-load.c: In function 'migrate_btr_defs':
../../svn/gcc/bt-load.c:1415: error: ISO C does not support the 'I64' ms_printf length modifier


What needs to be fixed here?  Does HOST_WIDEST_INT_PRINT_DEC need to 
change, or something else?


Re: [Windows] Fixing fprintf errors breaking bootstrap?

2008-05-10 Thread Aaron W. LaFramboise

Aaron W. LaFramboise wrote:
> I am still seeing errors such as this when bootstrapping trunk with
> -Werror.  I thought all of this was supposed to be resolved?


OK, this is http://gcc.gnu.org/PR25502

Sorry for the noise.

Apparently the reason this has gone on so long is that most people are 
crossbuilding mingw32 or using --disable-werror.