On 2/24/20 11:08 AM, Jakub Jelinek wrote:
> On Mon, Feb 24, 2020 at 11:04:55AM -0600, Bill Schmidt wrote:
>>> +  if (clonei->simdlen
>>> +      && (clonei->simdlen < 2
>>> +          || clonei->simdlen > 1024
>> Assuming that clonei->simdlen matches "vector length" in the ABI, 1024 is
>> too large a number. We can have at most 8 vector registers containing
>> a homogeneous aggregate, each having up to 16 elements, so the correct
>> limit would be 128.
> Well, further arguments can be passed on the stack...
Well, ELFv2 doesn't define such a thing as a qualified homogeneous aggregate.
See rs6000_discover_homogeneous_aggregate and "Parameter Passing in
Registers" in ELFv2. So the entire aggregate would be passed in memory,
not just the excess after 128 bytes. I don't think this is necessarily
something we want to encourage in an interface intended to improve
performance. Is there any reason we need to permit a larger value? Do we
need to add this constraint to rs6000_simd_clone_usable?
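
To make the arithmetic concrete, here is a minimal sketch of the tighter
bound. The names are hypothetical and not taken from rs6000.c, and it covers
only the range part of the quoted test. Treating 0 as "simdlen not specified"
is an assumption drawn from the `clonei->simdlen &&` guard in the patch.

```c
#include <stdbool.h>

/* Hypothetical constants for illustration only; not from rs6000.c.  */
enum {
  MAX_HA_VECTOR_REGS = 8,    /* vector registers usable for one homogeneous aggregate */
  MAX_ELTS_PER_VECTOR = 16,  /* e.g. 16 one-byte elements in a 128-bit vector */
  MAX_SIMDLEN = MAX_HA_VECTOR_REGS * MAX_ELTS_PER_VECTOR  /* 8 * 16 = 128 */
};

/* Return true if SIMDLEN passes the proposed range check.
   0 is assumed to mean "not specified" and is accepted.  */
static bool
simdlen_in_range (unsigned int simdlen)
{
  return simdlen == 0 || (simdlen >= 2 && simdlen <= MAX_SIMDLEN);
}
```

Under these assumptions a simdlen of 128 is accepted, while 256 and 1024
are rejected.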
Thanks,
Bill
> Jakub