Re: wot to do with the Maverick Crunch patches?

2008-03-30 Thread Hasjim Williams
On Sun, 30 Mar 2008 13:45:30 +0100, "Martin Guy" <[EMAIL PROTECTED]>
said:
> Ok, so we all have dozens of these EP93xx ARM SoCs on cheap boards,
> with unusable floating point hardware.
> 
> What do we have to do to get the best-working GCC support for Maverick
> Crunch FPU?
> 
> Suggest: make open-source project with objective: "to get the
> best-working GCC support for Maverick Crunch FPU". Anyone wanna run
> one, create repositories, set up mailing list etc a la
> producingoss.com, or is the current infrastructure sufficient for a
> coordinated effort?

Honestly, there is little reward in setting up a new mailing list, since
no-one has really looked into the issue that much.  Traffic on this list
(linux-cirrus) is already so low that there is little reason to start a
new one.

Reading documentation and working out why things aren't working is a
much better use of time.  Running the full C and C++ tests (in gcc) on
the current MaverickCrunch gcc would be a good start.  

We need to figure out what bugs are actually introduced by
MaverickCrunch (in C, C++ and std libraries), and fix them.

There is also a floating point testsuite for glibc:

glibc-2.5/math/gen-libm-test.pl

There should be one for uclibc, too...

> As I understand it, mainline GCC with patches in various versions can
> give:

> futaris-4.1.2/-4.2.0: Can usually use floating point in hardware for C
> and C++, maybe problems with exception unwinding in C++. In generated
> ASM code, all conditional execution of instructions is disabled except
> for jump/branch. Loss of code speed/size: negligible.
> Passes most FP tests but does not produce a fully working glibc (I
> gather from the Maverick OpenEmbedded people)

This is probably mainly due to the lack of exception unwinding, i.e.
what is detailed in ARM IHI 0038A.  It could also be due to bugs in the
MaverickCrunch engine, which aren't easy to pick up / debug.  I suggest
disabling MaverickCrunch double instructions, and working from there. 
If you or someone can get it working 100% with glibc then we can move
forward from there.
 
> Thoughts on a postcard please... any further progress in OE land?

At the moment we only use MaverickCrunch in certain parts of our code, by
setting the relevant CFLAGS / CXXFLAGS.  We specifically only use float
rather than double, and it all works as it should.

Unfortunately, I haven't gotten glibc working with MaverickCrunch.  Lack
of time / motivation / documentation.

I think glibc EABI doesn't support MaverickCrunch, since no-one has
written "working" patches for this yet, e.g.
glibc-2.5/ports/sysdeps/arm/eabi

I'm pretty sure that the old patches that I did write, are incomplete:
http://files.futaris.org/glibc/glibc-crunch.patch

I don't think that ever ended up in the OE tree, since it was never
working correctly.  Additionally I don't think the binutils patch has
gone in.

> Cirrus also have a hand-coded Maverick libm that you can link with
> old-ABI binaries - can we incorporate this asm code in mainline?

Is this uClibc libm or glibc libm?

You might be able to use MaverickCrunch with uClibc but the patch
http://mail.busybox.net/lists/uclibc/2007-November/018723.html won't
compile a 100% working uClibc.  It compiles but doesn't work as
expected.  Not sure if it works correctly with OABI.

I don't think glibc compiles/runs if MaverickCrunch is enabled, because
of the lack of support in the glibc-2.5/ports/sysdeps/arm directory.

If someone can get iwmmxt support working in everything, then we might
be able to do the same for MaverickCrunch, since it is very similar work
to get one ARM coprocessor working as it is to get another working. 
Half of the support for the iWMMXt processor has already been written
and there is proper documentation.  Currently iwmmxt is only enabled in
certain applications in openembedded (rather than system-wide) because
of this reason.  I am not sure if it is only exception unwinding that
isn't working on iWMMXt, or if there is something else that also needs
to be written.


Re: wot to do with the Maverick Crunch patches?

2008-03-31 Thread Hasjim Williams
On Mon, 31 Mar 2008 11:31:01 +0000 (UTC), "Joseph S. Myers"
<[EMAIL PROTECTED]> said:
> On Mon, 31 Mar 2008, Hasjim Williams wrote:

Joseph, 

First of all thanks for replying to this e-mail.  You seem to be the one
who has written most of the ARM coprocessor patches in the past, so
logically you're the best person to ask for most of these questions.
 
> > If someone can get iwmmxt support working in everything, then we might
> > be able to do the same for MaverickCrunch, since it is very similar work
> > to get one ARM coprocessor working as it is to get another working. 
> > Half of the support for the iWMMXt processor has already been written
> > and there is proper documentation.  Currently iwmmxt is only enabled in
> > certain applications in openembedded (rather than system-wide) because
> > of this reason.  I am not sure if it is only exception unwinding that
> > isn't working on iWMMXt, or if there is something else that also needs
> > to be written.
> 
> iWMMXt unwind support has been in GCC since my patch 
> <http://gcc.gnu.org/ml/gcc-patches/2007-01/msg00049.html>.

Thanks for that.  I think I saw this patch a while ago, but of course
forgot that it wasn't merged into 4.2.2, only 4.3.0.
 
> That illustrates the sort of thing that needs changing to implement unwind
> support for a new coprocessor.  Obviously you need to get the unwind
> specification in the official ARM EABI documents first before implementing
> it in GCC

Any idea of who to contact and/or how to get this done?

> and binutils will also need to support generating correct 
> information given .save directives for the coprocessor registers.

http://files.futaris.org/gcc/binutils-crunch.patch almost covers this,
but I don't quite follow your binutils patch:
http://sourceware.org/ml/binutils/2006-08/msg00207.html

Am I reading this right (according to sec 9.3 from ARM IHI 0038A):

unwind.d /  unwind_vxwords.d patch tests:
d0c6c1c1 -> [d0] Pop VFP D[8]-D[8], [c6c1] Pop iWMMXt wR[10]-wR[11],
[c1] Pop iWMMXt wR[10]-wR[11]
b0b0c0c6|c1c1c6d0 -> Haven't decoded this, but it hasn't changed since
before your patch
c6c0b0b0 -> [c6c0] Pop iWMMXt wR[10]-wR[10], [b0] Finish, [b0] Finish
 
unwind.s is a simple test to save iWMMXt, and unwind.d does the unwind
for it?  Your patch adds code to test for the incorrect merging test?

gcc uses the code in unwind-arm.c et al. to call the functions
(create_unwind_entry, unwind_save_mv etc) in binutils gas/config/tc-arm.c
to do the frame unwinding, right?  To do the unwind parsing (of table 4
from 9.3 in IHI 0038A), what function in binutils gas/config/tc-arm.c is
called?

I think my problem was that I didn't know what opcode to use when
calling add_unwind_opcode from within s_arm_unwind_save_mv in the above
binutils-crunch.patch, so I incorrectly reused (i.e. copy/pasted) the
code that unwinds an mmxwr data reg.

Also should aebi_set_public_attributes in binutils/gas/config/tc-arm.c
be setting an EABI attr for MaverickCrunch?

Can we use gas or its testsuite to test each MaverickCrunch operation
easily?  I've disabled certain MaverickCrunch operations from gcc,
because I found them to be buggy.  I am not sure if this is because the
instruction is incorrectly encoded, or because of some inherent flaw with
MaverickCrunch.

> For setjmp/longjmp support in glibc you also need to get an HWCAP value 
> allocated in the kernel.

arch/arm/mach-ep93xx/core.c:
#ifdef CONFIG_CRUNCH
elf_hwcap |= HWCAP_CRUNCH;
#endif
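
For illustration only, here is a minimal sketch of how user space could test for that bit. It assumes that HWCAP_CRUNCH is bit 10 (as in the ARM kernel's asm/hwcap.h) and that the C library provides getauxval() (glibc 2.16 or later, so newer than the glibc-2.5 discussed here). The names hwcap_has_crunch and read_hwcap are made up for this sketch.

```c
#include <stdio.h>

/* Assumption: bit 10, as HWCAP_CRUNCH is defined in the ARM kernel headers. */
#define SKETCH_HWCAP_CRUNCH (1UL << 10)

/* Pure helper, so the bit test can be checked without the hardware. */
int hwcap_has_crunch(unsigned long hwcap)
{
    return (hwcap & SKETCH_HWCAP_CRUNCH) != 0;
}

#if defined(__linux__) && defined(__GLIBC__)
#include <sys/auxv.h>   /* getauxval(), glibc 2.16 or later */
/* Read the hardware capability word the kernel passed to this process. */
unsigned long read_hwcap(void) { return getauxval(AT_HWCAP); }
#else
/* Fallback for other systems: report no capabilities. */
unsigned long read_hwcap(void) { return 0; }
#endif
```

A caller would use hwcap_has_crunch(read_hwcap()) to decide whether the Crunch registers need saving. The result is only meaningful on an ARM kernel; on other architectures bit 10 means something else.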

The cirrus kernel patches from arm.cirrus.com add the necessary kernel
support, but I think it is also in mainstream linux.

Also, can I assume that running the testsuites (binutils, gcc and glibc)
is the best way to determine what is missing from the current
MaverickCrunch support?

binutils - http://sourceware.org/binutils/docs/
 - "make check" from gas build directory, eg:
binutils-cross-2.18-r1/binutils-2.18/build.i686-linux.arm-angstrom-linux-gnueabi/gas

gcc - http://gcc.gnu.org/testing/
- http://gcc.gnu.org/install/test.html 
- "make check-gcc" and "make check-gcc-c++"  from gcc build
directory, eg:
gcc-cross-4.1.2-r13/gcc-4.1.2/build.i686-linux.arm-angstrom-linux-gnueabi

glibc -
http://www.gnu.org/software/libc/manual/html_node/Configuring-and-compiling.html#Configuring-and-compiling
  - "make check" from glibc build directory, eg:
glibc-2.5-r9/build-arm-angstrom-linux-gnueabi

And of course your paths need to be setup along with dejagnu, etc...
e.g. http://lists.gnu.org/archive/html/dejagnu/2006-04/msg8.html


Re: wot to do with the Maverick Crunch patches?

2008-03-31 Thread Hasjim Williams

On Tue, 01 Apr 2008 12:10:54 +1000, "Hasjim Williams"
<[EMAIL PROTECTED]> said:
> gcc uses the code in unwind-arm.c et al. to call the functions
> (create_unwind_entry, unwind_save_mv etc) in binutils gas/config/tc-arm.c
> to do the frame unwinding, right?  To do the unwind parsing (of table 4
> from 9.3 in IHI 0038A), what function in binutils gas/config/tc-arm.c is
> called?

To answer my own question:

gcc/gcc/config/arm/pr-support.c -> __gnu_unwind_execute

uws is the GNU unwinding state as defined in unwind-arm.h

e.g. for VFP
gnu_Unwind_Save_VFP in libunwind.S called from unwind-arm.c /
_Unwind_VRS_Pop

I'm not sure at the moment, what regclass (UVRSC) MaverickCrunch
registers are being classed as.  I guess with my invalid
binutils-crunch.patch they would be classed as UVRSC_WMMXD...  Which
never "worked" (or even compiled) in gcc 4.2.2 or gcc 4.1.2 since
Joseph's patch hadn't been merged in, and so the opcode c6 or c1 etc
would fail.

I suppose we need a DEMAND_SAVE_MAVERICK like DEMAND_SAVE_VFP WMMXD etal
...


[ARM] Lack of __aeabi_fsqrt / __aeabi_dsqrt (sqrtsf2 / sqrtdf2) functions

2008-04-14 Thread Hasjim Williams
Hello all,

I've been working on MaverickCrunch support in gcc, and could never get
a completely working glibc (with-fp), since there is no soft-float sqrt
libcall function.  This is a big problem for MaverickCrunch as there are
no hard div or sqrt opcodes.

It seems that this is the only other thing that is missing to let glibc
be compiled with-fp for soft float arm, too.  

Is it possible to add these functions to ieee754-sf.S and ieee754-df.S
???  There is of course a c implementation in glibc/soft-fp but I don't
know of any arm assembly implementation...
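
To make the request concrete, here is a rough C sketch of a square root that needs neither a divide nor a sqrt opcode. It uses Newton-Raphson on the reciprocal square root, so it only adds, subtracts and multiplies. The name sketch_sqrt, the linear first guess and the iteration count are assumptions for illustration. It is not the glibc/soft-fp routine, and it is not correctly rounded.

```c
#include <assert.h>

/* Hypothetical helper: square root using only add, subtract and multiply.
   Assumes x is finite and non-negative; the result is close to, but not
   guaranteed to be, the correctly rounded value. */
double sketch_sqrt(double x)
{
    assert(x >= 0.0);
    if (x == 0.0)
        return 0.0;

    /* Range reduction: x = m * 4^k with m in [0.25, 1), so that
       sqrt(x) = sqrt(m) * 2^k.  scale accumulates 2^k. */
    double m = x, scale = 1.0;
    while (m >= 1.0) { m *= 0.25; scale *= 2.0; }
    while (m < 0.25) { m *= 4.0;  scale *= 0.5; }

    /* Linear first guess for 1/sqrt(m); relative error roughly 15% at worst. */
    double y = 2.2 - 1.2 * m;

    /* Newton-Raphson on f(y) = 1/y^2 - m:  y <- y * (1.5 - 0.5 * m * y^2).
       The relative error is roughly squared on every step. */
    for (int i = 0; i < 6; i++)
        y = y * (1.5 - 0.5 * m * y * y);

    return m * y * scale;   /* sqrt(m) = m * (1/sqrt(m)) */
}
```

An assembly version in ieee754-sf.S / ieee754-df.S would work on the integer mantissa instead, but the structure (reduce the range, guess, iterate, rescale) would be the same.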

I know that ARM IHI 0043A doesn't specify this as part of the EABI, but
perhaps it needs to be added?

Thanks


Re: [ARM] Lack of __aeabi_fsqrt / __aeabi_dsqrt (sqrtsf2 / sqrtdf2) functions

2008-04-14 Thread Hasjim Williams
On Mon, 14 Apr 2008 22:41:36 -0400, "Daniel Jacobowitz" <[EMAIL PROTECTED]>
said:
> On Tue, Apr 15, 2008 at 12:33:38PM +1000, Hasjim Williams wrote:
> > Hello all,
> > 
> > I've been working on MaverickCrunch support in gcc, and could never get
> > a completely working glibc (with-fp), since there is no soft-float sqrt
> > libcall function.  This is a big problem for MaverickCrunch as there are
> > no hard div or sqrt opcodes.
> > 
> Can you be more specific about the actual problem?  I don't see why
> there needs to be an __aeabi_sqrt; sqrt is a function in libm.

Both FPA and VFP coprocessors implement sqrt opcodes:

arm.md:

(define_expand "sqrtsf2"
  [(set (match_operand:SF 0 "s_register_operand" "")
        (sqrt:SF (match_operand:SF 1 "s_register_operand" "")))]
  "TARGET_32BIT && TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP)"
  "")

fpa.md:

(define_insn "*sqrtsf2_fpa"
  [(set (match_operand:SF 0 "s_register_operand" "=f")
        (sqrt:SF (match_operand:SF 1 "s_register_operand" "f")))]
  "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_FPA"
  "sqt%?s\\t%0, %1"
  [(set_attr "type" "float_em")
   (set_attr "predicable" "yes")]
)

vfp.md:


(define_insn "*sqrtsf2_vfp"
  [(set (match_operand:SF 0 "s_register_operand" "=t")
        (sqrt:SF (match_operand:SF 1 "s_register_operand" "t")))]
  "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP"
  "fsqrts%?\\t%0, %1"
  [(set_attr "predicable" "yes")
   (set_attr "type" "fdivs")]
)

Now, when you build glibc configured "--with-fp", you won't use the
generic glibc/soft-fp functions, only those in gcc, i.e. the above. 
Only if you configure glibc "--without-fp" will it not use the above
opcodes, but the soft-fp sqrt etc.


Re: [ARM] Lack of __aeabi_fsqrt / __aeabi_dsqrt (sqrtsf2 / sqrtdf2) functions

2008-04-14 Thread Hasjim Williams

On Mon, 14 Apr 2008 23:09:00 -0400, "Daniel Jacobowitz" <[EMAIL PROTECTED]>
said:
> On Tue, Apr 15, 2008 at 12:58:40PM +1000, Hasjim Williams wrote:
> > Both FPA and VFP coprocessors implement sqrt opcodes:
> 
> So?  Glibc does not rely on that.  I've been building soft-float
> versions of glibc for non-Crunch targets for scarily close to a decade
> now, so this is clearly not the problem :-)  Please say what actual
> error you've encountered.

Of course you can build glibc and have it work correctly, as you are
configuring glibc --without-fp.  Try building glibc --with-fp and seeing
whether it works.

Suffice to say, it will compile, but when you try to run it, and your
program tries to do the libcall to the sqrt function it will segfault,
because there is no libcall sqrt defined.

As far as I can tell, sqrt and div are the only major opcodes that are
needed (for ieee754 glibc --with-fp) that aren't implemented in
MaverickCrunch.



Re: [ARM] Lack of __aeabi_fsqrt / __aeabi_dsqrt (sqrtsf2 / sqrtdf2) functions

2008-04-15 Thread Hasjim Williams

On Tue, 15 Apr 2008 09:15:45 -0400, "Daniel Jacobowitz" <[EMAIL PROTECTED]>
said:
> 
> I'm going to keep asking until I get something we can work
> with... you're reporting a bug in the compiler, so we need a test case
> and the exact error message.  What is generating any kind of sqrt
> libcall?  There is nothing in GCC to call __aeabi_sqrt, which AFAIK
> does not exist.

gcc shouldn't call __aeabi_sqrt, since it doesn't exist.  It could
however try to do a libcall through the optab to sqrtsf2, since
sqrt_optab does exist in optabs.h/c and genopinit.c ...  

Ok, to try and test this out, I added this patch to gcc, and recompiled
gcc & glibc --with-fp for crunch:

--- gcc-4.2.2/gcc/config/arm/arm.md 2008-04-15 17:59:35.0 +1000
+++ gcc-4.2.2/gcc/config/arm/arm.md 2008-04-15 18:02:08.0 +1000
@@ -3100,6 +3100,22 @@
   "TARGET_ARM && TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP)"
   "")
 
+(define_expand "sqrtsf2_soft_insn"
+  [(set (match_operand:SF 0 "s_register_operand" "")
+   (sqrt:SF (match_operand:SF 1 "s_register_operand" "")))]
+  "TARGET_ARM && !(TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP))"
+  "{
+  FAIL;
+  }")
+
+(define_expand "sqrtdf2_soft_insn"
+  [(set (match_operand:DF 0 "s_register_operand" "")
+   (sqrt:DF (match_operand:DF 1 "s_register_operand" "")))]
+  "TARGET_ARM && !(TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP))"
+  "{
+  FAIL;
+  }")
+
 (define_insn_and_split "one_cmpldi2"
   [(set (match_operand:DI 0 "s_register_operand" "=&r,&r")
(not:DI (match_operand:DI 1 "s_register_operand" "?r,0")))]

I was expecting it to fail, since gcc should no longer be able to expand
that operation.  I double-checked glibc's config.status, and it has
with_fp=yes.  So I'm not sure how glibc is doing its sqrt routines on arm.
I now suspect that glibc doesn't call sqrt for VFP or FPA, and that you
were right in your first e-mail.

glibc uses its own internal sqrt function, rather than the
sqrtsf2/sqrtdf2 opcode, even on FPA or VFP.

In any case if you want a test, then something like
gcc/testsuite/gcc.dg/arm-vfp1.c is enough (of course with different
dejagnu lines, e.g. dg-final scan-assembler lines) ...

I still think something like the above patch needs to be added to gcc,
just to ensure that the sqrtsf2 / sqrtdf2 libcall never happens, and an
error happens at compile time rather than runtime.  In the future glibc,
uclibc or some standalone function could make use of the "gcc" sqrt,
rather than the glibc sqrt.

Anyway, I misunderstood how the toolchain gets assembled and I would
appreciate it if someone could point out where the sqrt function is in
glibc, on arm.
Does arm use glibc/sysdeps/ieee754/dbl-64/mpsqrt.c ???


Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?

2008-05-08 Thread Hasjim Williams
Hi all,

I was looking for ways to improve the MaverickCrunch division routine on
ARM ep93xx, and noticed that there are a few other architectures that
don't have a hardware divide.

IA-64 has a "frcpa" instruction that returns an estimate of the
reciprocal of a float or double.
Likewise, RS-6000 has a "fres" that also returns an estimate of the
reciprocal of a float or double.
x86 seems to have something similar with SSE - called "rcpps" - that
also returns the estimated reciprocal.

They all seem to make use of FMAC / FNMAC instructions to calculate the
correct answer for x/y, through Newton-Raphson iteration built from MAC
instructions.  The algorithms they use in GCC differ because the accuracy
of the reciprocal estimate differs.

http://en.wikipedia.org/wiki/N-th_root_algorithm
http://en.wikipedia.org/wiki/Multiply-accumulate
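
As a rough sketch of the idea (not the code any of those ports actually use), division by Newton-Raphson can be written in C as below. The name sketch_div, the linear first guess (standing in for a hardware reciprocal estimate) and the fixed iteration count are assumptions. Each update x * (2 - m * x) is one multiply-subtract followed by one multiply, which is where an FMAC / FNMAC pair would be used.

```c
#include <assert.h>

/* Hypothetical helper: n / d using only add, subtract and multiply.
   Assumes d is finite and non-zero; the result is close to, but not
   guaranteed to be, the correctly rounded IEEE 754 quotient. */
double sketch_div(double n, double d)
{
    assert(d != 0.0);

    double sign = 1.0;
    if (d < 0.0) { d = -d; sign = -1.0; }

    /* Range reduction: d = m * 2^k with m in [0.5, 1); scale holds 2^-k. */
    double m = d, scale = 1.0;
    while (m >= 1.0) { m *= 0.5; scale *= 0.5; }
    while (m < 0.5)  { m *= 2.0; scale *= 2.0; }

    /* Linear first guess for 1/m: 48/17 - (32/17) * m, written as
       literals so no division is needed.  Relative error at most 1/17. */
    double x = 2.8235294117647058 - 1.8823529411764706 * m;

    /* Newton-Raphson on f(x) = 1/x - m:  x <- x * (2 - m * x).
       The error squares each step: 1/17, then about 2^-8, 2^-16,
       2^-32 and 2^-65, so four steps cover a double's 53 bits. */
    for (int i = 0; i < 4; i++)
        x = x * (2.0 - m * x);

    return sign * n * x * scale;   /* n * (1/d) */
}
```

A better first guess (such as frcpa, fres or rcpps provide) reduces the number of iterations, which is why the per-architecture algorithms differ.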

They also seem to use a similar algorithm to implement their sqrt
function...

My question is, are there any other architectures in GCC that don't have
a reciprocal estimate instruction, but have a FMAC?

I'd like to implement something similar for MaverickCrunch, using the
integer 32-bit MAC functions, but there is no reciprocal estimate
function on the MaverickCrunch.  I guess a lookup table could be
implemented, but how many entries will need to be generated, and how
accurate will it have to be to be IEEE754 compliant (in the swdiv routine)?
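
One way to reason about the table size is sketched below, under stated assumptions: a table of 2^7 entries indexed by the top 7 fraction bits of a mantissa in [1, 2), with each entry holding the reciprocal of the interval midpoint. All names are hypothetical. This only counts how many bits the estimate and each Newton-Raphson step provide. Getting the last bit correctly rounded for IEEE754 needs an extra correction step that is not shown.

```c
#define SKETCH_TABLE_BITS 7
#define SKETCH_TABLE_SIZE (1 << SKETCH_TABLE_BITS)

double sketch_recip_table[SKETCH_TABLE_SIZE];

/* Fill entry i with the reciprocal of the midpoint of
   [1 + i/N, 1 + (i+1)/N).  Done at start-up here for brevity; a real
   routine would ship the values as constants, since building them needs
   the very division we are trying to replace. */
void sketch_init_recip_table(void)
{
    for (int i = 0; i < SKETCH_TABLE_SIZE; i++)
        sketch_recip_table[i] = 1.0 / (1.0 + (i + 0.5) / SKETCH_TABLE_SIZE);
}

/* First guess for 1/m, valid for m in [1, 2): index with the top
   SKETCH_TABLE_BITS bits of the fraction.  The relative error is
   below 1 / (2 * SKETCH_TABLE_SIZE). */
double sketch_recip_estimate(double m)
{
    int i = (int)((m - 1.0) * SKETCH_TABLE_SIZE);
    return sketch_recip_table[i];
}

/* Newton-Raphson steps needed to go from table_bits + 1 good bits to at
   least target_bits, given that each step doubles the good bits. */
int sketch_nr_steps(int table_bits, int target_bits)
{
    int good = table_bits + 1, steps = 0;
    while (good < target_bits) { good *= 2; steps++; }
    return steps;
}
```

With these assumptions a 128-entry table gives about 8 good bits, so sketch_nr_steps(7, 24) is 2 for a float mantissa and sketch_nr_steps(7, 53) is 3 for a double.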

Also, where should I be sticking such an instruction / table?  Should I
put it in the kernel, and trap an invalid instruction?  Alternatively,
should I put it in libgcc or in glibc/uclibc?


Re: Division using FMAC, reciprocal estimates and Newton-Raphson - eg ia64, rs6000, SSE, ARM MaverickCrunch?

2008-05-11 Thread Hasjim Williams

On Sat, 10 May 2008 11:59:16 +0100, "Martin Guy" <[EMAIL PROTECTED]>
said:
> On 5/9/08, Paolo Bonzini <[EMAIL PROTECTED]> wrote:
> >  The idea is to use integer arithmetic to compute the right exponent, and
> > the lookup table to estimate the mantissa.

> The current soft-fp routine in libm seems to use a variant of this,
> but of course it may be faster if implemented using the Maverick's
> 64-bit add/sub/cmp.

ARM doesn't use the soft-fp routines in libm for division (in
glibc/soft-fp).  For gcc, there are gcc/config/arm/ieee754-sf.S and
ieee754-df.S, which implement divsf3 and divdf3.

For SQRT, it should use these functions, on soft-float and any
coprocessor that doesn't have a sqrt function, i.e. MaverickCrunch.

http://www.gnu.org/software/libc/manual/html_node/Misc-FP-Arithmetic.html#Misc-FP-Arithmetic

mentions the fma / fmaf functions which do a fused multiply-add, at a
higher precision.  MaverickCrunch has a fma function for 32-bit
integers, which could be extended for 64-bit ints, floats and doubles...





[ARM] Cirrus EP93xx Maverick Crunch Support - "bge" pattern

2007-06-26 Thread Hasjim Williams
G'day all,

As I wrote previously on gcc-patches (
http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00244.html ), I'm working
on code to get the MaverickCrunch Floating-Point Co-processor supported
on ARM.  I mentioned previously that you can't use the same opcodes for
testing GE on the MaverickCrunch, as you use on ARM.  See the below
table for NZCV values from MaverickCrunch.

MaverickCrunch - (cfcmp*):
        N  Z  C  V
A == B  0  1  0  0
A <  B  1  0  0  0
A >  B  1  0  0  1
unord   0  0  0  0

ARM/FPA/VFP - (cmp*):
        N  Z  C  V
A == B  0  1  1  0
A <  B  1  0  0  0
A >  B  0  0  1  0
unord   0  0  1  1
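
To see why GE needs special handling, the two tables above can be modelled in a few lines of C. This is only a sketch of the flag logic, with made-up names. It shows that the ordinary ARM "ge" condition (N == V) is wrongly true for an unordered MaverickCrunch compare, while testing "Z set or V set" gives the intended answer in all four rows.

```c
/* Condition flags as left by a floating-point compare. */
struct nzcv { int n, z, c, v; };

enum fp_rel { FP_EQ, FP_LT, FP_GT, FP_UNORD };

/* Values copied from the MaverickCrunch (cfcmp*) table. */
const struct nzcv crunch_flags[4] = {
    [FP_EQ]    = {0, 1, 0, 0},
    [FP_LT]    = {1, 0, 0, 0},
    [FP_GT]    = {1, 0, 0, 1},
    [FP_UNORD] = {0, 0, 0, 0},
};

/* Values copied from the ARM/FPA/VFP (cmp*) table. */
const struct nzcv arm_flags[4] = {
    [FP_EQ]    = {0, 1, 1, 0},
    [FP_LT]    = {1, 0, 0, 0},
    [FP_GT]    = {0, 0, 1, 0},
    [FP_UNORD] = {0, 0, 1, 1},
};

/* The ARM "ge" condition code is true when N == V. */
int cond_ge(struct nzcv f) { return f.n == f.v; }

/* What a "beq" followed by a "bvs" to the same label tests: Z or V set. */
int cond_eq_or_vs(struct nzcv f) { return f.z || f.v; }
```

With the ARM/FPA/VFP flags, cond_ge is true for equal and greater and false for less and unordered. With the MaverickCrunch flags, cond_ge is also true for unordered, which is the case that a separate "beq" plus "bvs" avoids.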

I've added a new "maverick_comparison_operator" similar to
"arm_comparison_operator" to predicates.md, and added the right bits to
arm.md, as shown below:

;; Special predication pattern for Maverick Crunch floating-point

(define_cond_exec
  [(match_operator 0 "maverick_comparison_operator"
[(match_operand:CCFP 1 "cc_register" "")
 (const_int 0)])]
  "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK"
  ""
)

All the other predicates are fine, since they are only used in floating
point comparisons.  But "ge" is also used for integer comparisons.  Now,
my problem is with the following code: 

; Special pattern to match GE for MAVERICK.
(define_insn "*arm_bge"
  [(set (pc)
(if_then_else (ge (match_operand 1 "cc_register" "") (const_int 0))
  (label_ref (match_operand 0 "" ""))
  (pc)))]
  "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK"
  "*
  gcc_assert (!arm_ccfsm_state);
  if (get_attr_cirrus (prev_active_insn(insn)) == CIRRUS_COMPARE)
    return \"beq\\t%l0\;bvs\\t%l0\";
  else
    return \"bge\\t%l0\;nop\";
  "
  [(set_attr "conds" "jump_clob")
   (set_attr "length" "8")]
)

As you can see, I need to replace all bge with a maverick crunch
equivalent.  However, "bge" is still also used with integer comparisons,
e.g:

double a, b;
if (a>=b) {

produces:

cfcmpd  r15, mvd1, mvd0
beq .L4
bvs .L4
b   .L2
.L4:

I haven't got a good example for the integer ge, but unsidf makes use of
ge (NB, I disabled MaverickCrunch 64-bit support for clarity, and
bugfixing):

unsigned int e = 9;
double e_d = e;

produces:

mov r3, #9
str r3, [fp, #-32]
ldr r0, [fp, #-32]
bl  __aeabi_i2d
str r0, [fp, #-44]
str r1, [fp, #-40]
ldr r3, [fp, #-32]
cmp r3, #0
bge .L2
nop
mov r0, r0  @ nop
mov r0, r0  @ nop
cfldrd  mvd0, .L4
mov r0, r0  @ nop
cfldrd  mvd1, [fp, #-44]
mov r0, r0  @ nop
cfaddd  mvd1, mvd1, mvd0
cfstrd  mvd1, [fp, #-44]
.L2:
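
In C terms, the listing above appears to do the following. This is a sketch: it assumes the constant loaded from .L4 is 2^32, which the listing does not show, and it relies on GCC's wrap-around behaviour when casting an out-of-range unsigned value to int32_t. The function name is made up.

```c
#include <stdint.h>

/* Mirrors the listing: convert as signed (the __aeabi_i2d call), then
   add 2^32 if the sign bit was set (the path that "bge .L2" skips). */
double sketch_u32_to_double(uint32_t u)
{
    /* Signed conversion; the cast wraps values above INT32_MAX on GCC. */
    double d = (double)(int32_t)u;

    if ((int32_t)u >= 0)        /* cmp r3, #0 ; bge .L2 */
        return d;

    /* Assumed value of the constant at .L4: 2^32 (cfaddd). */
    return d + 4294967296.0;
}
```

This is the integer "ge" in question: the branch follows an ordinary ARM cmp, so it must stay a plain bge and not be turned into the beq/bvs pair.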

This seems to work fine for C code, but when I generate C++ code, I get
errors.  Can anyone suggest a better way of writing the above code?  I'd
rather the jump_clob didn't actually happen unless it was definitely a
cirrus compare, and not an ARM compare, but I can't seem to write the
code correctly.

Also, is "(get_attr_cirrus (prev_active_insn(insn)) == CIRRUS_COMPARE)"
the best way to find out if the last compare instruction was a cirrus
compare, or an ARM compare?  Should I add a new routine to find the last
compare insn, e.g. prev_compare_insn???  I know that a branch doesn't
have to be directly after a comparison.

This and the invalid coprocessor offset issue are the only two outstanding
bugs for MaverickCrunch support on the Cirrus EP93xx.


Re: [ARM] Cirrus EP93xx Maverick Crunch Support - "bge" pattern

2007-06-27 Thread Hasjim Williams

On Wed, 27 Jun 2007 08:17:47 +0200, "Paolo Bonzini" <[EMAIL PROTECTED]>
said:
> 
> >   if (get_attr_cirrus (prev_active_insn(insn)) == CIRRUS_COMPARE)
> >   return \"beq\\t%l0\;bvs\\t%l0\"; else return \"bge\\t%l0\;nop\";
> >   "
> >   [(set_attr "conds" "jump_clob")
> >(set_attr "length" "8")]
> > )
> > 
> > As you can see, I need to replace all bge with a maverick crunch
> > equivalent.  However, "bge" is still also used with integer comparisons,
> > e.g:
> 
> I think you should generate the compare using a different mode for the 
> CC register (like cc:CCMAV) and then use two patterns:
> 
> ; Special pattern to match GE for MAVERICK.  Most restrictive
> ; pattern goes first.
> (define_insn "*arm_cirrus_bge"
>   [(set (pc)
>         (if_then_else (ge (match_operand:CCMAV 1 "cc_register" "") (const_int 0))
>                       (label_ref (match_operand 0 "" ""))
>                       (pc)))]
>   "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK"
>   "beq\\t%l0\;bvs\\t%l0"
>   [(set_attr "conds" "jump_clob")
>    (set_attr "length" "8")]
> )
> 
> ; Special pattern to match GE for ARM.
> (define_insn "*arm_bge"
>   [(set (pc)
>         (if_then_else (ge (match_operand 1 "cc_register" "") (const_int 0))
>                       (label_ref (match_operand 0 "" ""))
>                       (pc)))]
>   "TARGET_ARM && TARGET_HARD_FLOAT"
>   "bge\\t%l0"
>   [(set_attr "conds" "jump_clob")
>    (set_attr "length" "4")]
> )

Yep, this will work.  Floating point comparisons are already done in
CCFP mode, so I have used that.  NB, I already tried this earlier, but I
think most of my problem comes from conditional execution ...

I tried changing:

(define_cond_exec
  [(match_operator 0 "arm_comparison_operator"
[(match_operand 1 "cc_register" "")
 (const_int 0)])]
  "TARGET_ARM"
  ""
)

to:

(define_cond_exec
  [(match_operator 0 "maverick_comparison_operator"
[(match_operand:CCFP 1 "cc_register" "")
 (const_int 0)])]
  "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK"
  ""
)

(define_cond_exec
  [(match_operator 0 "arm_comparison_operator"
[(match_operand 1 "cc_register" "")
 (const_int 0)])]
  "TARGET_ARM"
  ""
)

But I think I also need to modify or add to all the other scc and / ior
etc lines, since I think combining scc's / condexecs doesn't work
correctly.  I think that if the above define_cond_exec is still there,
then gcc thinks it can optimise all ge execution, and so optimises the
above output from arm_bge and deletes the label.  I rebuilt gcc with
all conditional execution disabled to see if it would work.  I did this
by commenting out any line referencing arm_comparison_operator or
define_cond_exec.

However, when I compile a c++ program, the compiler still can't generate
the label again, and it fails with:

internal compiler error: output_operand: '%l' operand isn't a label

then of course the assembler fails with:

undefined local label

NB, I shouldn't need the second arm_bge as it should be handled by the
code in arm_condition_code, for non MAVERICK and Maverick non-floating
point.  I've also disabled DImode on Maverick, since it is only signed
or unsigned, and not both at the same time.  I think it will also cause
similar comparison-based problems too.  Incidentally, is it possible to
do something like:

(if_then_else (ge (match_operand:CCFP,CCDI 1 "cc_register" "")
(const_int 0))

And can someone explain what is the difference between these two lines:

if_then_else (ge (match_operand:CCFP 1 "cc_register" "") (const_int 0))
if_then_else (ge:CCFP (match_operand 1 "cc_register" "") (const_int 0))

Is the second line still valid syntax?


Re: [ARM] Cirrus EP93xx Maverick Crunch Support - "bge" pattern

2007-06-27 Thread Hasjim Williams

On Wed, 27 Jun 2007 18:15:12 +1000, "Hasjim Williams" <[EMAIL PROTECTED]>
said:
> 
> if_then_else (ge (match_operand:CCFP 1 "cc_register" "") (const_int 0))
> if_then_else (ge:CCFP (match_operand 1 "cc_register" "") (const_int 0))
> 
> Is the second line still valid syntax?

The second line doesn't work.  The first one does.  It also fixes up the
"internal compiler error: output_operand: '%l' operand isn't a label"
error...

Incidentally, does anyone know if you can do something like:
if_then_else (ge (match_operand:!CCFP 1 "cc_register" "") (const_int 0))


Re: [ARM] Cirrus EP93xx Maverick Crunch Support - condexec / bugfixing / "co-processor offset out of range"

2007-06-27 Thread Hasjim Williams

On Wed, 27 Jun 2007 12:31:42 +0200, "Rask Ingemann Lambertsen"
<[EMAIL PROTECTED]> said:
> On Wed, Jun 27, 2007 at 06:45:26PM +1000, Hasjim Williams wrote:
> > 
> > It also fixes up the
> > "internal compiler error: output_operand: '%l' operand isn't a label"
> > error...
> > 
> > Incidentally, does anyone know if you can do something like:
> > if_then_else (ge (match_operand:!CCFP 1 "cc_register" "") (const_int 0))
> 
>    You can't (but mode macros help). As Paolo says, you will have to define
> one or more new comparison modes and you will have to define branch insns
> which use the new mode(s), comparison insns which set the cc register in the
> new mode(s), new sCC style insns, and so on. Additionally, look at
> SELECT_CC_MODE and TARGET_CC_MODE_COMPATIBLE. If you have some sort of
> arm_output_compare_insn() function, modify that as well.
> 
>    The significance of defining a CCmode is that it says that comparisons
> done in that mode set the flags in a specific way.

Thanks.  This really clears things up for me.  For the moment, I will
leave conditional execution disabled for EVERYTHING when compiling for
MaverickCrunch.  I think the arm.md code only conditionally executes
operands if the compare was in SImode, anyway - (see "scc insns" in
gcc/config/arm/arm.md) .  Can anyone confirm this?

This just leaves me with one other major bug for MaverickCrunch.

It is related to the bugs in the Cirrus silicon.  Mainstream gcc and
older versions of gcc have a parameter -mcirrus-fix-invalid-insns.  The
patch from Nucleus Systems
(http://www.nucleusys.com/projects/crunch.php), removes this parameter,
and replaces it with two -mfix-crunch-d0 and -mfix-crunch-d1.

I've modified it and attached it to this post.  At the moment I hard-code
both to 0 to disable the bug fixes, since I think enabling them is the
cause of a "co-processor offset out of range" error in the assembler.
Essentially, the attached code works around two major bugs.  First, two
nops are needed after a branch.  Second, a register written to in one
instruction cannot be read from in the next instruction without a
non-MaverickCrunch operation, i.e. a nop, in between.
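
As a toy model of those two rules (not the attached patch, and not the real arm_reorg data structures), the padding could be counted like this. The struct, its fields and the function name are all hypothetical. The first rule is read here as "a MaverickCrunch instruction must be at least two slots away from a preceding branch", which is my interpretation of the listing earlier in the thread.

```c
#include <stddef.h>

/* Hypothetical, much-simplified instruction record. */
struct sketch_insn {
    int is_crunch;   /* non-zero for a MaverickCrunch instruction */
    int is_branch;   /* non-zero for a branch */
    int dest;        /* Crunch register written, or -1 */
    int src1, src2;  /* Crunch registers read, or -1 */
};

/* Count the NOPs the two workarounds would insert into code[0..n). */
int sketch_count_nops(const struct sketch_insn *code, size_t n)
{
    int nops = 0;

    for (size_t i = 0; i < n; i++) {
        /* Rule 1: pad so that no Crunch instruction sits in the two
           slots after a branch. */
        if (code[i].is_branch) {
            size_t gap = 0;
            while (gap < 2 && i + 1 + gap < n && !code[i + 1 + gap].is_crunch)
                gap++;
            if (gap < 2 && i + 1 + gap < n)   /* a Crunch insn is too close */
                nops += (int)(2 - gap);
        }

        /* Rule 2: a Crunch instruction must not read a register that the
           immediately preceding Crunch instruction wrote. */
        if (i > 0 && code[i].is_crunch && code[i - 1].is_crunch &&
            code[i - 1].dest >= 0 &&
            (code[i].src1 == code[i - 1].dest ||
             code[i].src2 == code[i - 1].dest))
            nops++;
    }
    return nops;
}
```

Every NOP counted here moves later instructions by 4 bytes, which is how padding inserted late can push a pc-relative load out of range.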

Essentially this extra code is run in arm_reorg, which is always run on
ARM, since an address can only be loaded a limited distance around the
pc.  Likewise for the MaverickCrunch coprocessor, we only have an 8-bit
word offset, which means a maximum offset of just under 1024 bytes (255
words, i.e. 1020 bytes), minus the 8 byte minimum jump, etc.
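
The range check implied by that encoding can be sketched as follows. It assumes the MaverickCrunch loads and stores use the generic ARM coprocessor load/store (LDC/STC) form, with an 8-bit count of 32-bit words and a separate add/subtract bit. The function name is made up.

```c
/* Returns non-zero if byte_offset fits an 8-bit word offset:
   a multiple of 4 whose magnitude is at most 255 words (1020 bytes). */
int sketch_crunch_offset_ok(long byte_offset)
{
    long mag = byte_offset < 0 ? -byte_offset : byte_offset;
    return (mag % 4 == 0) && (mag / 4 <= 255);
}
```

Under this assumption an offset of 1020 is still encodable but 1024 is not, so a literal pool entry that was just in range can fall out of range once a single extra instruction is inserted.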

Now, it seems whether this patch is applied (and turned on) or not
applied I get "co-processor offset out of range" errors, because of the
extra NOPs inserted between the jump and original label point.  I think
this in turn shifts the offset.  I can't see any way to easily
recalculate or fix the coprocessor offset instructions, since this
happens AFTER the instruction has been generated.

I tried to hack around this by putting 2 NOPs before all cirrus
instructions, and modifying the length of each instruction.  I think
this means that the coprocessor offset will be correct, since the NOP has
been generated BEFORE the instruction was generated.  However, it will
mean that all cirrus instructions are slower, since some will have
additional unnecessary NOPs inserted before them.  I don't think that
alone will work, though...

I think the "co-processor offset out of range" error is generated
because of the cfldrs and cfldrd instructions.  These are used to Load
Floating Point Values from Memory into MaverickCrunch registers
directly.  I commented out the cirrus_movsf / cirrus_movdf insn patterns
(in asm generation), and it removed the error.  Does this mean that
someone calculated the pool_range & neg_pool_range attrs incorrectly, or
are the constraints I talk about below missing?

Is there something else in arm.c/h that I should be looking at?
arm_legitimate_index_p ??? arm_coproc_mem_operand ???
EXTRA_CONSTRAINT_STR_ARM ?

'Uv' is an address valid for VFP load/store insns. - i.e. doesn't
support writeback
'Uy' is an address valid for iwmmxt load/store insns. - i.e. supports
writeback

Is there supposed to be something similar for FPA / Maverick load/store
insns?  Or should it use the Uy mode?  Only the VFP supports the
writeback modes?  Autoincrement / decrement modes?  Only VFP does this,
I think...  I think m mode is only used for r->mE and m->r.  w->UvE & Uv
-> w for VFP.  y -> yrUy & yrUy -> y for iwMMXt.  However, this isn't
done for FPA.

Is this a bug for FPA?  Or hasn't it been picked up since no-one really
uses FPA?

Also, once I get the code doing what it's supposed to do, and generate a
patch against svn HEAD, do I need to do anything else special besides
posting it to gcc-patches, and le

Re: [ARM] Cirrus EP93xx Maverick Crunch Support - CC modes / condexec / CC_REGNUM

2007-06-28 Thread Hasjim Williams

On Thu, 28 Jun 2007 14:55:17 +0200, "Rask Ingemann Lambertsen"
<[EMAIL PROTECTED]> said:
> On Wed, Jun 27, 2007 at 11:26:41AM +1000, Hasjim Williams wrote:
> > G'day all,
> > 
> > As I wrote previously on gcc-patches (
> > http://gcc.gnu.org/ml/gcc-patches/2007-06/msg00244.html ), I'm working
> > on code to get the MaverickCrunch Floating-Point Co-processor supported
> > on ARM.  I mentioned previously that you can't use the same opcodes for
> > testing GE on the MaverickCrunch, as you use on ARM.  See the below
> > table for NZCV values from MaverickCrunch.
> > 
> > MaverickCrunch - (cfcmp*):
> >          N  Z  C  V
> > A == B   0  1  0  0
> > A <  B   1  0  0  0
> > A >  B   1  0  0  1
> > unord    0  0  0  0
> > 
> > ARM/FPA/VFP - (cmp*):
> >          N  Z  C  V
> > A == B   0  1  1  0
> > A <  B   1  0  0  0
> > A >  B   0  0  1  0
> > unord    0  0  1  1
> 
> The key to getting this right is to use a special comparison mode for
> MaverickCrunch comparisons. You have to look at arm_gen_compare_reg(),
> which calls arm_select_cc_mode() to do all the work. I think there's
> probably a mess in there: CCFPmode is used for some non-MaverickCrunch
> floating point compares as well as some MaverickCrunch ones. You'll have
> to sort that out.
> The safe way would be to define a new mode, say CCMAVmode, and use that.

I think a new CCMAV mode is needed if I re-enable the cfcmp64 code.
Otherwise, the other CCFP modes aren't used, since an ARM processor can
only have one floating-point co-processor, either FPA, VFP or MAVERICK,
selected by the -mfpu arg.  The current patches replace the
get_arm_condition_code for CCFPmode if TARGET_MAVERICK is set.
Conditional execution is tricky, since you don't check for unordered
when doing an integer comparison, and you might want either a signed
or an unsigned comparison.  For this reason, I think two
"maverick-specific" CC modes are needed for the comparison stuff if
64-bit comparisons are enabled.  It is probably better to leave 64-bit
comparison disabled and let the soft 64-bit int comparison handle it,
and hence not worry about conditional execution for 64-bit int at all.
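
To make the GE problem concrete, here is a small sketch that evaluates a
few standard ARM condition tests against the flag values from the two
tables quoted above.  The struct, enum and function names are made up for
illustration; only the flag values and the condition definitions come from
the tables and the ARM architecture:

```c
#include <stdbool.h>

/* NZCV flags as left behind by a compare instruction.  */
struct nzcv { bool n, z, c, v; };

/* Possible outcomes of a floating-point compare.  */
enum fp_result { FP_EQ, FP_LT, FP_GT, FP_UNORD };

/* Flag values copied from the MaverickCrunch (cfcmp*) table.  */
const struct nzcv maverick_flags[] = {
  [FP_EQ]    = { 0, 1, 0, 0 },
  [FP_LT]    = { 1, 0, 0, 0 },
  [FP_GT]    = { 1, 0, 0, 1 },
  [FP_UNORD] = { 0, 0, 0, 0 },
};

/* Flag values copied from the ARM/FPA/VFP (cmp*) table.  */
const struct nzcv arm_fp_flags[] = {
  [FP_EQ]    = { 0, 1, 1, 0 },
  [FP_LT]    = { 1, 0, 0, 0 },
  [FP_GT]    = { 0, 0, 1, 0 },
  [FP_UNORD] = { 0, 0, 1, 1 },
};

/* Standard ARM condition tests on the flags.  */
bool cond_ge (struct nzcv f) { return f.n == f.v; }         /* GE */
bool cond_lt (struct nzcv f) { return f.n != f.v; }         /* LT */
bool cond_le (struct nzcv f) { return f.z || f.n != f.v; }  /* LE */
bool cond_vs (struct nzcv f) { return f.v; }                /* VS */
```

With the ARM/FPA/VFP flags, cond_ge() is false for the unordered case.
With the MaverickCrunch flags it is true (N and V are both 0), so a plain
bge after cfcmp* would treat unordered operands as "greater or equal".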

> (define_insn "*cirrus_cmpdi"
>   [(set (reg:CC CC_REGNUM)
>         (compare:CC (match_operand:DI 0 "cirrus_fp_register" "v")
>                     (match_operand:DI 1 "cirrus_fp_register" "v")))]
>   "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK"
>   "cfcmp64%?\\tr15, %V0, %V1"
>   [(set_attr "type"   "mav_farith")
>    (set_attr "cirrus" "compare")]
> )
> 
> Does this insn also set the flags according to the MaverickCrunch NZCV
> table above? If so, it needs to use a MaverickCrunch CCmode too, in which
> case, with the CANONICALIZE_COMPARISON macro, you can change

Yes, this instruction compares two 64-bit values in the maverick
coprocessor & stores the result in the NZCV flags (the destination is
written as r15).  At the moment, I have disabled 64-bit support in my
Maverick Crunch patches, simply because the 64-bit compare is either
signed only or unsigned only, depending on the value of UI in DSPSC in
the Maverick co-processor.  Most code doesn't make use of 64-bit ints,
so the performance hit is minimal.  I think there are some EABI
functions to wrap unsigned 64-bit functions to signed 64-bit functions,
but I haven't really checked this in detail...
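
As an illustration of why one fixed signedness is not enough, here is a
short sketch showing the same pair of 64-bit patterns ordering differently
as signed and as unsigned values.  It is plain C and does not model the
DSPSC register; the function names are made up for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare two 64-bit patterns as signed values.  The conversion to
   int64_t assumes a two's complement target, as on ARM with GCC.  */
bool
less_signed (uint64_t a, uint64_t b)
{
  return (int64_t) a < (int64_t) b;
}

/* Compare the same two patterns as unsigned values.  */
bool
less_unsigned (uint64_t a, uint64_t b)
{
  return a < b;
}
```

For a = 0xFFFFFFFFFFFFFFFF and b = 1, less_signed() is true (a reads as
-1) while less_unsigned() is false, so the compiler has to know which
kind of compare the hardware is currently set up to do.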
 
> Once you know that all MaverickCrunch comparisons (and only those) have
> (reg:CCMAV CC_REGNUM) in them, then it is easy to write all the
> corresponding comparison and branch instructions.

Agreed.  I have had conditional execution working correctly for a while,
and have run the ieee754 tests with dejagnu; they all passed.  It was
only a couple of days ago that I realised that I was replacing all bge
instructions.  The other comparisons that can't be implemented (on the
maverick crunch) shouldn't matter, since they aren't used in the other
comparison modes.  But just to be safe, I have put the CCFP arg on them,
since this is the only mode that they should be used in.  The other
floating point ARM co-processors make use of a CCFPE mode, but the
MaverickCrunch doesn't need it.

In any case, I now have a working gcc compiler for the MaverickCrunch
that lets me compile everything and seems to execute everything
correctly.  However, I have disabled conditional execution and 64-bit int
mode for the time being...  I think I will have to add a
maverick_cc_register though, like you suggest, if I want to enable
conditional execution, otherwise the cc_register will have maverick and
arm comparisons in there, and th