[Bug middle-end/29756] SSE intrinsics hard to use without redundant temporaries appearing

timday at bottlenose dot demon dot co dot uk Wed, 08 Nov 2006 02:01:26 -0800


------- Comment #3 from timday at bottlenose dot demon dot co dot uk  
2006-11-08 10:01 -------
I've just tried an alternative version (will upload later) that replaces the
union with a single
  __v4sf _rep
member and implements the [] operators using e.g.
  (reinterpret_cast<const float*>(&_rep))[i];
However, the code generated for the two transform implementations remains the
same (20 and 32 instructions, anyway; I haven't checked the details yet).
Maybe not surprising, as it's just moving the problem around.
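
For readers without the attachment, here is a minimal sketch of the two
representations being compared. The struct names (Vec4Union, Vec4Cast) and the
make_* helpers are hypothetical and not taken from the attached test case;
only the __v4sf _rep member and the reinterpret_cast accessor come from the
description above. Both accessors rely on GCC's handling of vector types
rather than on anything guaranteed by the C++ standard.

```cpp
#include <xmmintrin.h>

// Hypothetical wrapper A: the vector shares storage with a float array,
// and operator[] reads through the array member of the union.
struct Vec4Union {
    union {
        __m128 v;
        float  f[4];
    } rep;
    float operator[](int i) const { return rep.f[i]; }
};

// Hypothetical wrapper B: a single __v4sf member, with operator[]
// reading the elements through a pointer cast, as described above.
struct Vec4Cast {
    __v4sf _rep;
    float operator[](int i) const {
        return (reinterpret_cast<const float*>(&_rep))[i];
    }
};

// Illustrative helpers; _mm_setr_ps places its first argument in element 0.
inline Vec4Union make_union(float a, float b, float c, float d) {
    Vec4Union r;
    r.rep.v = _mm_setr_ps(a, b, c, d);
    return r;
}

inline Vec4Cast make_cast(float a, float b, float c, float d) {
    Vec4Cast r;
    r._rep = (__v4sf)_mm_setr_ps(a, b, c, d);
    return r;
}
```

In both cases a scalar read via operator[] goes through memory that overlays
the vector, which is presumably why swapping one for the other does not change
the generated code much.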


The main difference between the two methods is perhaps that the bad one
involves a __v4sf -> float -> __v4sf conversion, while the good one stays in
__v4sf throughout by using the mul_compN methods.  I'll try to prepare a more
concise test case based on the premise that bad handling of __v4sf <-> float
conversions is the real issue.
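
As a rough illustration of the two paths (not the attached code), the sketch
below multiplies a vector by one component of another vector in both ways.
The function names scale_via_float and mul_comp<N> are hypothetical; in
particular, mul_comp<N> is only a guess at what a mul_compN-style method might
look like, assuming it broadcasts component N with a shuffle.

```cpp
#include <xmmintrin.h>

// Path 1 (the "bad" shape): pull one component out as a scalar float,
// then broadcast it back into a vector before multiplying.
// This is the __v4sf -> float -> __v4sf round trip.
inline __m128 scale_via_float(__m128 a, __m128 b, int i) {
    float tmp[4];
    _mm_storeu_ps(tmp, b);              // vector -> floats in memory
    return _mm_mul_ps(a, _mm_set1_ps(tmp[i]));  // float -> vector
}

// Path 2 (the "good" shape): broadcast component N with a shuffle,
// so the value never leaves the vector type.
template <int N>
inline __m128 mul_comp(__m128 a, __m128 b) {
    return _mm_mul_ps(a, _mm_shuffle_ps(b, b, _MM_SHUFFLE(N, N, N, N)));
}
```

Both functions compute the same result for a given component index; the
question for the bug is only whether the first form leaves extra temporaries
in the generated code.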


-- 

timday at bottlenose dot demon dot co dot uk changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |timday at bottlenose dot
                   |                            |demon dot co dot uk


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29756
