On 10/23/2017 11:19 AM, Lennart Sorensen wrote:
On Fri, Oct 20, 2017 at 10:55:34PM -0400, Dennis Clarke wrote:
WARNING : long and winding but full of examples.
<snip>
So the comment mentions pi = 3.14159265358979323846264338327950280e+00L
while you use pi_ld = 3.1415926535897932384626433832795028841971L;
so that's different and might explain the slight difference in hex value.
As a follow-up, my current build of gcc 7.2.0 outputs the correct ppc64
assembly, with a perfect representation of the constant in float128 format :
.file "not_pi.c"
.machine power4
.section ".text"
.Ltext0:
.cfi_sections .debug_frame
.globl _q_qtod
.globl _q_qtos
.section .rodata
.align 3
.LC1:
.string "C"
.align 3
.
..
. etc
In the above we see the correct references to the conversion subroutines
_q_qto{d|s} for the double and "single" float datatypes.
The correct constant conversion for long double pi is in LC0 thus :
.LC0:
.long 1073779231
.long 3041149649
.long 2221509004
.long 3306619320
.section ".text"
There we see the 16-byte value represented as four 32-bit integers :
1073779231 == 4000921F
3041149649 == B54442D1
2221509004 == 8469898C
3306619320 == C51701B8
This is similar to what we see on sparc64 assembly from Oracle c99 :
! 44 long double pi_ld = 3.1415926535897932384626433832795028841971L;
sethi %h44(.L_cseg0),%l0
or %l0,%m44(.L_cseg0),%l0
sllx %l0,12,%l0
or %l0,%l44(.L_cseg0),%l0
ldd [%l0+0],%f8
ldd [%l0+8],%f10
std %f8,[%fp+719]
std %f10,[%fp+727]
The constant conversion is handled in a similar way but as two 8-byte hex
integers in sequence :
.L_cseg0:
.xword 0x4000921fb54442d1LL,0x8469898cc51701b8LL
.type .L_cseg0,#object
.size .L_cseg0,16
The conversions ( casts ) are handled in nearly identical fashion with
calls to _Qp_qto{d|s} and I think we have the sources on those :
! 47 double pi_d_c = (double) pi_ld;
ldd [%fp+719],%f8
ldd [%fp+727],%f10
add %fp,687,%o0
std %f8,[%fp+687]
std %f10,[%fp+695]
call _Qp_qtod
nop
std %f0,[%fp+711]
! 48 float pi_f_c = (float) pi_ld;
ldd [%fp+719],%f8
ldd [%fp+727],%f10
add %fp,687,%o0
std %f8,[%fp+687]
std %f10,[%fp+695]
call _Qp_qtos
nop
st %f0,[%fp+707]
Again the conversion is perfect. The sparc64 assembly from gcc 7.2.0 is
nearly identical to what we saw above on ppc64 :
.file "not_pi.c"
.section ".text"
.global _Qp_qtod
.global _Qp_qtos
.section ".rodata"
.align 8
.
.
.
.LLC0:
.long 1073779231
.long 3041149649
.long 2221509004
.long 3306619320
.section ".text"
.align 4
.global main
.type main, #function
.proc 04
So that is perfect and from gcc 7.2.0. Final link stage on ppc64 is
still an issue and I am looking into that. Works fine on sparc64.
And given you have to pass the -m argument, makes you wonder if
the math library functions are compiled with the same option or some
other option with different results. After all I believe the default
128-bit floating point used to be the IBM extended format, not IEEE.
The real issue seems to be -mfloat128, methinks. In any case the
assembly looks fine and I will keep digging. Most likely a gcc bug
to be filed regarding a hard requirement for vector scalar hardware.
Dennis