RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.

Claudiu Zissulescu Thu, 28 Apr 2016 07:12:48 -0700

Hi,

> Where exactly does the test go wrong?


The failing test is this one, from the test file included in the patch:
        TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1);

> Can you show a trace of __eqdf2 with register values?

Sure thing. This is a run for ARC700, using the original implementation with
the guarded FPX-handling code enabled:

[0x000002a2] 0xc000                 K Z    ld_s           r0,[sp,0x0] : lw [0x5000c0c0] => 0xffffffff : (w1) r0 <= 0xffffffff *
[0x000002a4] 0xc101                 K Z    ld_s           r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fefffff : (w1) r1 <= 0x7fefffff *
[0x000002a6] 0xc202                 K Z    ld_s           r2,[sp,0x8] : lw [0x5000c0c8] => 0xffffffff : (w1) r2 <= 0xffffffff *
[0x000002a8] 0xc303                 K Z    ld_s           r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fefffff : (w1) r3 <= 0x7fefffff *
[0x000002aa] 0x0aea0000             K Z    bl             0x2e8 : (w0) r31 <= 0x000002ae *
[0x00000590] 0x091d00e1             K Z    brne.d         r1,r3,0x1c
[0x00000594] 0x2153050c             K Z    bmsk           r12,r1,0x14 : (w0) r12 <= 0x000fffff *
[0x00000598] 0x200580be             K Z    or.f           0,r0,r2 *
[0x0000059c] 0x24cf1562             K  N   bset.ne        r12,r12,0x15 : (w0) r12 <= 0x002fffff *
[0x000005a0] 0x2414904c             K  N   add1.f         r12,r12,r1 : (w0) r12 <= 0x000ffffd *
[0x000005a4] 0x7fe0                 K   C  j_s.d          [blink] *
[0x000005a6] 0x20cc8086             KD  C  cmp.cc         r0,r2
 
For reference, the routine:

        .global __eqdf2
        .balign 4
        HIDDEN_FUNC(__eqdf2)
        /* Good performance as long as the difference in high word is
           well predictable (as seen from the branch predictor).  */
__eqdf2:
        brne.d DBL0H,DBL1H,.Lhighdiff
        bmsk    r12,DBL0H,20
#ifndef __HS__
        /* The next two instructions are required to recognize the FPX
        NaN, which has a pattern like this: 0x7ff0_0000_8000_0000, as
        opposed to 0x7ff8_0000_0000_0000.  */
        or.f    0,DBL0L,DBL1L
        bset.ne r12,r12,21
#endif /* __HS__ */
        add1.f  r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
        j_s.d   [blink]
        cmp.cc  DBL0L,DBL1L
        .balign 4
.Lhighdiff:
        or      r12,DBL0H,DBL1H
        or.f    0,DBL0L,DBL1L
        j_s.d   [blink]
        bmsk.eq.f r12,r12,30
        ENDFUNC(__eqdf2)
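
A rough C model of the equal-high-word path may make the trace easier to
follow. It is only a sketch: the name eqdf2_carry and the fpx_guard
parameter are made up for illustration, and it assumes that add1.f computes
b + (c << 1) in 32 bits and sets carry on unsigned overflow of that addition.

```c
#include <stdint.h>

/* Hypothetical model of __eqdf2 when both high words are equal.
   Returns the carry flag after add1.f (1 means "treated as NaN") and
   stores the 32-bit result of add1.f in *r12_out. */
static int eqdf2_carry(uint32_t hi, uint32_t lo0, uint32_t lo1,
                       int fpx_guard, uint32_t *r12_out)
{
    /* bmsk r12,DBL0H,20: keep bits 0..20 of the high word. */
    uint32_t r12 = hi & 0x1fffffu;

    /* or.f 0,DBL0L,DBL1L ; bset.ne r12,r12,21 (the guarded FPX code):
       set bit 21 whenever either low word is non-zero. */
    if (fpx_guard && (lo0 | lo1) != 0)
        r12 |= 1u << 21;

    /* add1.f r12,r12,DBL0H: r12 + (DBL0H << 1), assumed 32-bit add
       with carry out of bit 31. */
    uint32_t shifted = hi << 1;
    uint64_t sum = (uint64_t)r12 + (uint64_t)shifted;

    *r12_out = (uint32_t)sum;
    return (int)((sum >> 32) & 1u);
}
```

With the values from the trace (high word 0x7fefffff, low words 0xffffffff)
and the guard enabled, this model gives r12 = 0x000ffffd with carry set, as in
the trace, so cmp.cc is skipped and the operands are reported as unequal. With
the guard disabled the carry stays clear for the same inputs.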

All of these results were collected using nsimfree.

Please let me know if you need more info,
Claudiu
