BALATON Zoltan <bala...@eik.bme.hu> writes:
> The low level extract and deposit functions provided by bitops.h are
> used in performance critical places. It crept into target/ppc via
> FIELD_EX64 and also used by softfloat so PPC code using a lot of FPU
> where hardfloat is also disabled is doubly affected.

Most of these asserts compile out to nothing if the compiler is able to
verify that the constants are in range. For example, examining the start
of float64_add:

  Dump of assembler code for function float64_add:
  ../../fpu/softfloat.c:
  1979    {
     0x00000000007ac9b0 <+0>:     movabs $0xfffffffffffff,%r9
     0x00000000007ac9ba <+10>:    push   %rbx

  /home/alex/lsrc/qemu.git/include/qemu/bitops.h:
  396         return (value >> start) & (~0ULL >> (64 - length));
     0x00000000007ac9bb <+11>:    mov    %rdi,%rcx
     0x00000000007ac9be <+14>:    shr    $0x34,%rcx
     0x00000000007ac9c2 <+18>:    and    $0x7ff,%ecx

  ../../fpu/softfloat.c:
  1979    {
     0x00000000007ac9c8 <+24>:    sub    $0x30,%rsp

  /home/alex/lsrc/qemu.git/include/qemu/bitops.h:
  396         return (value >> start) & (~0ULL >> (64 - length));
     0x00000000007ac9cc <+28>:    mov    %fs:0x28,%rax
     0x00000000007ac9d5 <+37>:    mov    %rax,0x28(%rsp)
     0x00000000007ac9da <+42>:    mov    %rdi,%rax
     0x00000000007ac9dd <+45>:    and    %r9,%rdi

  ../../fpu/softfloat.c:
  588         *r = (FloatParts64) {
     0x00000000007ac9e0 <+48>:    mov    %ecx,0x4(%rsp)
     0x00000000007ac9e4 <+52>:    mov    %rdi,0x8(%rsp)

  /home/alex/lsrc/qemu.git/include/qemu/bitops.h:
  396         return (value >> start) & (~0ULL >> (64 - length));
     0x00000000007ac9e9 <+57>:    shr    $0x3f,%rax

  ../../fpu/softfloat.c:
  588         *r = (FloatParts64) {
     0x00000000007ac9ed <+61>:    mov    %al,0x1(%rsp)

  589             .cls = float_class_unclassified,
  590             .sign = extract64(raw, f_size + e_size, 1),
     0x00000000007ac9f1 <+65>:    mov    %rax,%r8

I don't see any check and abort steps because all the shift and mask
values are known at compile time. The softfloat compilation certainly
does have some assert points though:

  readelf -s ./libqemu-ppc64-softmmu.fa.p/fpu_softfloat.c.o |grep assert
     136: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND g_assertion_mess[...]
     138: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND __assert_fail

but the references are for the ISRA segments, so it's tricky to know if
they get used or are just there for LTO purposes.

If there are hot paths where the extract/deposit functions show up, I
suspect a better approach would be to implement _nocheck variants (or
maybe _noassert?) and use them where required rather than turning off
the assert checking for these utility functions.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
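
P.S. A rough sketch of what a _nocheck variant could look like next to
the checked helper. This is illustrative only: extract64_nocheck is a
hypothetical name that does not exist in bitops.h, and the checked
version here is merely modelled on the return expression quoted in the
disassembly above, with an assumed range assert in front of it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Checked form. The return expression matches the bitops.h line shown in
 * the disassembly; the exact assert condition is an assumption made for
 * this sketch.
 */
static inline uint64_t extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (~0ULL >> (64 - length));
}

/*
 * Hypothetical unchecked variant for hot paths. The caller must guarantee
 * 0 <= start, 0 < length and start + length <= 64; anything else is
 * undefined behaviour because of the shift counts.
 */
static inline uint64_t extract64_nocheck(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}
```

With constant arguments, as in the float64 unpacking above (exponent at
bit 52 with length 11, sign at bit 63 with length 1), both forms should
compile to the same shift and mask, so the unchecked one would only
matter where start or length are not known at compile time.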