On 2023-05-26 23:40, Stefan Kanthak wrote:
Feel free to propose this alternative here (better elsewhere, where you'll earn less laughter). But don't forget that this 23-bit mantissa will be all zeroes for quite some 64-bit (and even 32-bit) integers which are no power of 2, for example 0x8000003fffffffff, and that both FILD and CVTSI2SS only work on SIGNED integers.
The precision loss can be detected by examining the PE (precision exception) flag of the x87 status word, which is bit 5, i.e. mask `0x20`. It doesn't matter whether the number is interpreted as signed or unsigned: `-0x80000000'00000000` still only has one bit in its mantissa. Another option is to store the number in the 80-bit extended precision format, with a 64-bit mantissa which includes the otherwise hidden bit (so if the number is a power of two, the mantissa will be `0x80000000'00000000`).
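To make the point about rounding concrete, here is a minimal C sketch, not a proposed implementation. It assumes `float` is IEEE 754 binary32, and the helper names are made up for illustration. Instead of reading the x87 precision flag it uses a swapped-in, portable check: the conversion is treated as exact only if converting the `float` back yields the original integer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the float nearest to x has an
   all-zero 23-bit mantissa field (assumes IEEE 754 binary32). */
static int float_mantissa_is_zero(int64_t x)
{
    float f = (float)x;
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);      /* reinterpret without aliasing UB */
    return (bits & 0x007FFFFFu) == 0;
}

/* Hypothetical helper: returns 1 if x converts to float without rounding.
   This is a portable stand-in for checking the precision flag. */
static int converts_exactly(int64_t x)
{
    volatile float f = (float)x;         /* volatile: drop any excess precision */

    if (f >= 0x1p63f)                    /* rounded up to 2^63: not representable */
        return 0;                        /* in int64_t, so x cannot be exact */
    return (int64_t)f == x;
}

/* Returns 1 if the magnitude of x is a power of two. */
static int is_power_of_two_magnitude(int64_t x)
{
    return x != 0 && float_mantissa_is_zero(x) && converts_exactly(x);
}
```

With these helpers, `0x8000003fffffffff` (reinterpreted as `int64_t`) has an all-zero mantissa after conversion but fails the exactness check, while `INT64_MIN` passes both.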
But anyway, traditional x86 has very few GPRs and GCC doesn't optimize multi-word arithmetic very well. Performance may or may not vary depending on cache locality and number of μops; not to mention `movq` and `movd`, which have relatively high latencies. I would like to see some benchmarking results first.
-- Best regards, LIU Hao