Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?
On 5/9/08, Paolo Bonzini <[EMAIL PROTECTED]> wrote:
> The idea is to use integer arithmetic to compute the right exponent, and
> the lookup table to estimate the mantissa. I used something like this for
> square root:
>
> 1) shift the entire FP number by 1 to the right (logical right shift)
> 2) sum 0x2000 so that the exponent is still offset by 64
> 3) extract the 8 bits from 14 to 22 and look them up in a 256-entry, 32-bit
>    table
> 4) sum the value (as a 32-bit integer!) with the content of the table
> 5) perform 2 Newton-Raphson iterations as necessary

It normally turns out to be faster to use the magic integer sqrt algorithm,
even when you have multiplication and division in hardware:

    unsigned long
    isqrt(unsigned long x)
    {
        register unsigned long op, res, one;

        op = x;      /* remainder still to be accounted for */
        res = 0;     /* result, built up one bit per iteration */

        /* "one" starts at the highest power of four <= the argument. */
        one = 1UL << 30;  /* second-to-top bit of a 32-bit word set */
        while (one > op) one >>= 2;

        while (one != 0) {
            if (op >= res + one) {
                op = op - (res + one);
                res = res + 2 * one;
            }
            res >>= 1;
            one >>= 2;
        }
        return res;
    }

The current soft-fp routine in libm seems to use a variant of this, but of
course it may be faster if implemented using the Maverick's 64-bit
add/sub/cmp.

M
Re: RFH: Building and testing gimple-tuples-branch
On 5/9/08 4:32 PM, Kaz Kojima wrote:
> * config/sh/sh.c (sh_gimplify_va_arg_expr): Change pre_p and
> post_p types to gimple_seq *.

Thanks. This is certainly OK.

Diego.
Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?
Paolo Bonzini wrote:
>
>> I'd like to implement something similar for MaverickCrunch, using the
>> integer 32-bit MAC functions, but there is no reciprocal estimate
>> function on the MaverickCrunch. I guess a lookup table could be
>> implemented, but how many entries will need to be generated, and how
>> accurate will it have to be IEEE754 compliant (in the swdiv routine)?
>
> I think sh does something like that. It is quite a mess, as it has half
> a dozen ways to implement division.
>
> The idea is to use integer arithmetic to compute the right exponent, and
> the lookup table to estimate the mantissa. I used something like this
> for square root:
>
> 1) shift the entire FP number by 1 to the right (logical right shift)
> 2) sum 0x2000 so that the exponent is still offset by 64
> 3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
>    32-bit table
> 4) sum the value (as a 32-bit integer!) with the content of the table
> 5) perform 2 Newton-Raphson iterations as necessary

To avoid the lookup table, calculate

    x = (a/2) + (8^(1/4) - 1)^2

which gives relative errors less than 0.036 over the range 1/2 <= a <= 2
at a cost of one shift and one addition. The errors after 1, 2, 3, and 4
iterations of Heron's rule are 0.64E-3, 0.204E-6, 0.211E-13, and 0.222E-27.

So, this requires one more iteration but avoids the use of a table and the
corresponding memory hit.

Source: Computer Approximations, Hart et al.

Andrew.
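A minimal sketch of the seed-plus-Heron scheme described above, read as a square-root approximation (Heron's rule is the square-root iteration). It uses doubles for readability, not the shift-and-add integer arithmetic the thread is aiming at, and the names sqrt_seed, heron_step and sqrt_heron are made up for this illustration.

```c
#include <assert.h>

/* 8^(1/4) = 2^(3/4), written out so the sketch needs no libm. */
static const double FOURTH_ROOT_OF_8 = 1.681792830507429;

/* Initial estimate x = a/2 + (8^(1/4) - 1)^2, intended for 1/2 <= a <= 2. */
double sqrt_seed(double a)
{
    const double c = (FOURTH_ROOT_OF_8 - 1.0) * (FOURTH_ROOT_OF_8 - 1.0);

    assert(a >= 0.5 && a <= 2.0);
    return 0.5 * a + c;
}

/* One step of Heron's rule: x' = (x + a/x) / 2. */
double heron_step(double a, double x)
{
    return 0.5 * (x + a / x);
}

/* Seed followed by a caller-chosen number of Heron iterations. */
double sqrt_heron(double a, int iterations)
{
    double x = sqrt_seed(a);
    int i;

    for (i = 0; i < iterations; i++)
        x = heron_step(a, x);
    return x;
}
```

Going by the error figures quoted above, something like sqrt_heron(a, 2) should be in the right range for single precision and sqrt_heron(a, 3) for double, provided the argument has first been scaled into [1/2, 2] by adjusting the exponent.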
Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?
Andrew Haley wrote:
> Paolo Bonzini wrote:
>>> I'd like to implement something similar for MaverickCrunch, using the
>>> integer 32-bit MAC functions, but there is no reciprocal estimate
>>> function on the MaverickCrunch. I guess a lookup table could be
>>> implemented, but how many entries will need to be generated, and how
>>> accurate will it have to be IEEE754 compliant (in the swdiv routine)?
>> I think sh does something like that. It is quite a mess, as it has half
>> a dozen ways to implement division.
>>
>> The idea is to use integer arithmetic to compute the right exponent, and
>> the lookup table to estimate the mantissa. I used something like this
>> for square root:
>>
>> 1) shift the entire FP number by 1 to the right (logical right shift)
>> 2) sum 0x2000 so that the exponent is still offset by 64
>> 3) extract the 8 bits from 14 to 22 and look them up in a 256-entry,
>>    32-bit table
>> 4) sum the value (as a 32-bit integer!) with the content of the table
>> 5) perform 2 Newton-Raphson iterations as necessary
>
> To avoid the lookup table, calculate
>
>     x = (a/2) + (8^(1/4) - 1)^2
>
> which gives relative errors less than 0.036 over the range 1/2 <= a <= 2
> at a cost of one shift and one addition. The errors after 1, 2, 3, and 4
> iterations of Heron's rule are 0.64E-3, 0.204E-6, 0.211E-13, and 0.222E-27.
>
> So, this requires one more iteration but avoids the use of a table and the
> corresponding memory hit.
>
> Source: Computer Approximations, Hart et al.

Sorry, a context switch: this is for sqrt, not division. Brain fade.

Andrew.
How do I add target specific tests?
I want to add target specific tests for AVR. These would be testcases for PR that fail related to AVR back end problems - rather than testcases for generic PR. Do I just add them to directory testsuite/gcc.target/avr? Or are there some other configuration steps needed? Andy
Re: Deprecation?!
On May 9, 2008, Dave Higginbotham <[EMAIL PROTECTED]> wrote:

> I'm getting a " warning: deprecated conversion from string constant to
> ‘char*’" message in g++ (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7).

> I've always understood there is no such thing as deprecation in C++ (and
> have been proud of this concept). What gives?

In pre-standard versions of C++, (narrow) string literals had type
char[n], as in C, so they could decay to char*. As of the first C++
standard, such string literals have type char const[n], so they decay
to char const*. For backward compatibility, [conv.array]/2 in C++98
specifies a deprecated implicit conversion from string literals to
char*.

> I've always understood there is no such thing as deprecation in C++

Annex D in the C++ Standard specifies a few other deprecated
pre-standard language (mis?)features, defining deprecated as
"Normative for the current edition of the Standard, but not guaranteed
to be part of the Standard in future revisions."

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
Free Software Evangelist [EMAIL PROTECTED], gnu.org}
FSFLA Board Member ¡Sé Libre! => http://www.fsfla.org/
Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org}
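As a rough illustration of how code can sidestep the deprecated conversion, the sketch below binds a string literal to a pointer-to-const and makes an explicit copy when the text has to be modified. It is written in C but should also compile as C++ without triggering the warning; tool_name and writable_copy are hypothetical names used only here.

```c
#include <assert.h>
#include <string.h>

/* Read-only access: the literal is only ever seen through a
   pointer-to-const, so no conversion to plain char* is needed. */
const char *tool_name(void)
{
    return "gcc";
}

/* Writable access: copy the literal into caller-owned storage and
   modify the copy, never the literal itself. */
void writable_copy(char *dst, size_t dst_size)
{
    const char *src = tool_name();

    assert(dst != NULL && dst_size > strlen(src));
    strcpy(dst, src);
    dst[0] = 'G';   /* fine: dst is an ordinary char array */
}
```

The warning in the original report would typically come from a declaration such as `char *p = "gcc";`, which relies on the deprecated conversion; changing the declared type to `const char *` is usually the smallest fix.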
Re: How do I add target specific tests?
On May 10, 2008, Andy H <[EMAIL PROTECTED]> wrote:

> These would be testcases for PR that fail related to AVR back end
> problems - rather than testcases for generic PR.

> Do I just add them to directory testsuite/gcc.target/avr? Or are there
> some other configuration steps needed?

You'll want to create a .exp file that exits right away if the target
is not what you want, and that runs the testing machinery otherwise.
Look for `istarget' in testsuite/gcc.target/*/*.exp for various
examples.

-- 
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
Free Software Evangelist [EMAIL PROTECTED], gnu.org}
FSFLA Board Member ¡Sé Libre! => http://www.fsfla.org/
Red Hat Compiler Engineer [EMAIL PROTECTED], gcc.gnu.org}
inline assembly question (memory side-effects)
Hi.

What is the proper way to tell gcc that an inline assembly statement either
modifies a particular area of memory or needs it to be up to date (in sync)
because the assembly reads from it?

E.g., assume I have a

    struct blah {
        int sum;
        ...
    };

which is accessed by my assembly code.

    int my_chksum(struct blah *blah_p)
    {
        blah_p->sum = 0;
        /* assembly code computes checksum, writes to blah_p->sum */
        asm volatile("..."::"b"(blah_p));
        return blah_p->sum;
    }

How can I tell gcc that the object *blah_p is accessed (read and/or written
to) by the asm (without declaring the object volatile or using a general
"memory" clobber)?

(If I don't take any measures then gcc won't know that blah_p->sum is
modified by the assembly and will optimize the load operation away and
always return 0.)

A while ago (see this thread http://gcc.gnu.org/ml/gcc/2008-03/msg00976.html)
I was told that simply adding the affected memory area in question as
memory ('m') output or input operands is not appropriate. Note that this
contradicts what the gcc info page suggests (section about 'extended asm':
"If you know how large the accessed memory is, you can add it as input or
output..." along with an example).

Could somebody please clarify? (Please don't suggest how this particular
example could be improved, but try to answer the fundamental question.)

Thanks
-- Till

PS: please CC me as I'm not subscribed to the gcc list
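For readers who have not seen the spelling, here is a sketch of what the info page's suggestion might look like when applied to the example above. It only illustrates the operand syntax; whether this is sufficient is exactly the open question of this message, so it should not be read as an answer. The payload member is invented (the original elides the rest of the struct), and the x86 instruction is a stand-in for the real checksum code, with an empty template on other targets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the original elides the remaining members. */
struct blah {
    int sum;
    int payload[4];
};

int my_chksum(struct blah *blah_p)
{
    assert(blah_p != NULL);
    blah_p->sum = 0;
#if defined(__i386__) || defined(__x86_64__)
    /* Stand-in for the checksum code: increments the int at the start
       of *blah_p, i.e. blah_p->sum.  "+m"(*blah_p) names the whole
       object as a read-write memory operand. */
    __asm__ volatile("incl %0" : "+m"(*blah_p));
#else
    /* Empty template: only declares *blah_p as read and written. */
    __asm__ volatile("" : "+m"(*blah_p));
#endif
    return blah_p->sum;
}
```

With the object listed as an in/out memory operand, the compiler is expected to store blah_p->sum before the asm and reload it afterwards, which is the behaviour the question asks for without resorting to volatile or a "memory" clobber.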
[Windows] Fixing fprintf errors breaking bootstrap?
I am still seeing errors such as this bootstrapping trunk with -Werror. I
thought all of this was supposed to be resolved?

../../svn/gcc/bt-load.c: In function 'migrate_btr_defs':
../../svn/gcc/bt-load.c:1415: error: ISO C does not support the 'I64' ms_printf length modifier

What needs to be fixed here? Does HOST_WIDEST_INT_PRINT_DEC need to change,
or something else?
Re: [Windows] Fixing fprintf errors breaking bootstrap?
Aaron W. LaFramboise wrote:
> I am still seeing errors such as this bootstrapping trunk with -Werror. I
> thought all of this was supposed to be resolved?

OK, this is http://gcc.gnu.org/PR25502

Sorry for the noise. Apparently the reason this has gone on so long is that
most people are crossbuilding mingw32 or using --disable-werror.