Kean Johnston wrote:
Ok I am no compiler expert, so this may be totally impossible, and if so I'd appreciate an education, but this is what I instinctively thought of when first thinking about this problem.
Note that, never mind SSE1, even conventional 8-byte fpt, while not requiring fpt for correctness, sure does for efficiency.