https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #10 from hubicka at kam dot mff.cuni.cz ---
>        |     b = 2.0 * ray.dir.x * (ray.orig.x - sph->pos.x) +
>        |       movupd   (%rdi),%xmm5
>        |     2.0 * ray.dir.y * (ray.orig.y - sph->pos.y) +
>        |     2.0 * ray.dir.z * (ray.orig.z - sph->pos.z);
>   0.02 |       movsd    0x10(%rdi),%xmm9
>   0.01 |       movupd   0xb8(%rsp),%xmm13
>  37.67 |       movupd   0xa0(%rsp),%xmm15
> 
> so we pass struct ray on the stack(?) and perform SSE loads from it but
> the argument passing does
> 
>   0.88 |       movups %xmm2,(%rsp)
>   0.22 |       movups %xmm3,0x10(%rsp)
>  43.81 |       movups %xmm4,0x20(%rsp)
>   0.66 |       call   ray_sphere

Adding Martin to CC.  I think we could teach ipa-sra, with -flto, to
either split the structure into scalar arguments or pass it by
reference, which would allow us to hoist its initialization out of the
loop body.
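
To make the two calling forms concrete, here is a hand-written C sketch.
It is not ipa-sra output and not the c-ray source: the type layouts and
the names b_by_value, b_by_ref, total_by_value and total_by_ref are
assumptions made up for illustration.  It assumes a ray of two vectors
of three doubles (48 bytes), which would match the three 16-byte movups
stores in the quoted listing and which the x86-64 SysV ABI passes in
memory when taken by value.

```c
#include <stddef.h>

/* Hypothetical types modelled on the quoted snippet, not copied from c-ray. */
struct vec3 { double x, y, z; };
struct ray { struct vec3 orig, dir; };            /* 48 bytes */
struct sphere { struct vec3 pos; double rad; };

/* By value: each call gets its own copy of the 48-byte ray, so the
   caller has to store it into the outgoing argument area every time. */
double b_by_value(struct ray ray, const struct sphere *sph)
{
    return 2.0 * ray.dir.x * (ray.orig.x - sph->pos.x) +
           2.0 * ray.dir.y * (ray.orig.y - sph->pos.y) +
           2.0 * ray.dir.z * (ray.orig.z - sph->pos.z);
}

/* By reference: the callee reads the caller's single copy, so nothing
   has to be re-stored per call. */
double b_by_ref(const struct ray *ray, const struct sphere *sph)
{
    return 2.0 * ray->dir.x * (ray->orig.x - sph->pos.x) +
           2.0 * ray->dir.y * (ray->orig.y - sph->pos.y) +
           2.0 * ray->dir.z * (ray->orig.z - sph->pos.z);
}

/* Loop calling the by-value form: the argument copy sits inside the loop. */
double total_by_value(struct ray ray, const struct sphere *sph, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += b_by_value(ray, &sph[i]);
    return total;
}

/* Loop calling the by-reference form: the ray is set up once, before
   the loop, and only a pointer is passed on each iteration. */
double total_by_ref(const struct ray *ray, const struct sphere *sph, size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += b_by_ref(ray, &sph[i]);
    return total;
}
```

Both loops compute the same sum; the only difference is whether the
ray has to be copied into the argument area on every iteration.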

Honza
