rjmccall added a comment.

In D108643#2999776 <https://reviews.llvm.org/D108643#2999776>, @erichkeane 
wrote:

>> In D108643#2965852 <https://reviews.llvm.org/D108643#2965852>, @rjmccall 
>> wrote:
>>
>> The choice that high bits are unspecified rather than extended is an 
>> interesting one.  Can you speak to that?  That's good for +, -, *, &, |, ^, 
>> <<, and narrowing conversions, but bad for ==, <, /, >>, and widening 
>> conversions.
>
> So we chose this for a few reasons:
>
> 1- Consistency with struct padding bits.  It seemed strange to specify the 
> padding bits here, when the rest of the standards/ABI don't specify padding 
> bits.

I think it's a mistake to think of these as padding bits.  Implementations on 
general-purpose hardware will be doing operations in word-sized chunks; these 
are the high bits of the most significant word.

> 2- Flexibility of implementation: Requiring zeroing is a more constraining 
> decision, which limits implementation to having to set these bits.  By 
> leaving it unspecified, the implementation is free to zero them out if it 
> feels it is worth-while.

This is a trade-off.  Extending constrains the implementation of operations 
that produce noise in the high bits.  Not extending constrains the 
implementation of operations that are affected by noise in the high bits.  I'm 
willing to believe that the trade-off favors leaving the bits undefined, but 
that's why I'm asking: to see if you've actually evaluated this trade-off, 
because it kind of sounds like you've evaluated only one side of it.
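
To make the two sides concrete, here is a rough sketch that models an unsigned 
17-bit value held in a 32-bit word.  The names (kMask, addNoExtend, and so on) 
are made up for illustration and are not anything in the patch; the point is 
only to show where the masking cost lands under each rule.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: an unsigned 17-bit value carried in a 32-bit word.
constexpr unsigned kBits = 17;
constexpr std::uint32_t kMask = (std::uint32_t{1} << kBits) - 1;

// Rule A: high bits unspecified.
// Addition is free: the low 17 bits of the sum depend only on the low
// 17 bits of the operands, so noise in the high bits is harmless.
constexpr std::uint32_t addNoExtend(std::uint32_t a, std::uint32_t b) {
  return a + b;
}
// Equality is not free: the noise has to be masked off first.
constexpr bool equalNoExtend(std::uint32_t a, std::uint32_t b) {
  return ((a ^ b) & kMask) == 0;
}

// Rule B: high bits always zero-extended.
// Addition can carry into bit 17, so the result has to be re-masked.
constexpr std::uint32_t addExtended(std::uint32_t a, std::uint32_t b) {
  return (a + b) & kMask;
}
// Equality is a plain word compare.
constexpr bool equalExtended(std::uint32_t a, std::uint32_t b) {
  return a == b;
}

// Small usage check: the value 5 with garbage in bits 17..31.
inline void checkTradeOff() {
  const std::uint32_t noisy5 = 5u | 0xFFFE0000u;
  assert((addNoExtend(noisy5, 7u) & kMask) == 12u);
  assert(equalNoExtend(noisy5, 5u));
  assert(addExtended(kMask, 1u) == 0u);  // wraps modulo 2^17
  assert(equalExtended(5u, 5u));
}
```

Either way something pays for a mask; the question is which set of operations 
is more common in real code.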

> I'll note that our backends choose NOT to zero them out when not necessary, 
> since (so I'm told) 'masked' compares are trivial in most processors.

They're trivial to implement in custom hardware, of course, but what existing 
ISAs actually provide masked compare instructions?  Is this a common feature 
I'm simply totally unaware of?  In practice I think this will be 1-2 extra 
instructions in every comparison.
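
As a sketch of where that extra work comes from, consider a signed 17-bit 
less-than when the high bits are unspecified.  Without a masked compare, both 
operands have to be canonicalized before an ordinary word compare is 
meaningful.  The helper names below are hypothetical, and this is only a model 
in portable C++, not a claim about what any particular backend emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: a signed 17-bit value in the low bits of a 32-bit
// word, with bits 17..31 holding unspecified noise.
constexpr std::uint32_t kMask17 = (std::uint32_t{1} << 17) - 1;
constexpr std::uint32_t kSignBit17 = std::uint32_t{1} << 16;

// Canonicalize: drop the noise, then sign-extend from bit 16.
// (x ^ sign) - sign is the usual portable sign-extension idiom.
inline std::int32_t signExtend17(std::uint32_t raw) {
  const std::uint32_t low = raw & kMask17;
  return static_cast<std::int32_t>(low ^ kSignBit17) -
         static_cast<std::int32_t>(kSignBit17);
}

// The comparison itself is trivial; the cost is the canonicalization
// applied to each operand beforehand.
inline bool signedLess17(std::uint32_t a, std::uint32_t b) {
  return signExtend17(a) < signExtend17(b);
}

inline void checkSignedLess() {
  const std::uint32_t minusOne = 0x0001FFFFu;  // -1, high bits happen to be 0
  const std::uint32_t noisyOne = 0xFFFE0001u;  // +1, high bits all set
  assert(signExtend17(minusOne) == -1);
  assert(signExtend17(noisyOne) == 1);
  assert(signedLess17(minusOne, noisyOne));
  assert(!signedLess17(noisyOne, minusOne));
}
```

If the representation guaranteed sign-extension instead, signedLess17 would 
reduce to a single compare of the two words.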

> 3- Implement-ability on FPGAs: Since this was our motivating example, forcing 
> an FPGA to zero out these bits when dealing with an interaction with a 
> byte-aligned processor would have incredible performance overhead.

How on earth does making the store unit zero/sign-extend have "incredible 
performance overhead"?  This is totally trivial technologically.  It's not like 
you're otherwise sending 17-bit stores out on the bus.

I'm not sure it's appropriate to think of this as primarily an FPGA feature 
when in fact it's being added to standard targets.

> 4- Ease of implementation: Forcing LLVM to zero out these bits would either 
> mean we had to do quite a bit of work in our CodeGen to zero them out, or 
> modify most of the backends to not zero padding bits in these cases. Since 
> there isn't a particular performance benefit (see #2) we didn't think it 
> would be worth while.

The obvious lowering would be for clang to use i17 as the scalar type lowering 
but i32 as the "in-memory" lowering, then make sure that the backends are 
reasonably intelligent about folding extend/trunc operations around operations 
that aren't sensitive to, or don't produce, noise in the high bits.
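
A rough model of that split, written as plain C++ rather than IR: the value in 
a "register" may carry noise, the store extends into the 32-bit in-memory 
form, and anything reading memory can rely on the extension.  U17Slot, 
storeU17, and loadU17 are hypothetical names for illustration only, and the 
sketch assumes zero-extension of an unsigned 17-bit value.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kLow17 = (std::uint32_t{1} << 17) - 1;

// Hypothetical in-memory form: a full 32-bit word whose high bits are
// guaranteed to be zero.
struct U17Slot {
  std::uint32_t word;
};

// Store: the one place that pays for the extension (a single mask here).
inline void storeU17(U17Slot &slot, std::uint32_t reg) {
  slot.word = reg & kLow17;
}

// Load: the truncation back to 17 bits is a no-op on a canonical word.
inline std::uint32_t loadU17(const U17Slot &slot) { return slot.word; }

// Memory-to-memory equality needs no masking at all.
inline bool slotsEqual(const U17Slot &a, const U17Slot &b) {
  return a.word == b.word;
}

inline void checkSlots() {
  U17Slot a{};
  U17Slot b{};
  storeU17(a, 5u | 0xFFFE0000u);  // register value with noisy high bits
  storeU17(b, 5u);
  assert(loadU17(a) == 5u);
  assert(slotsEqual(a, b));
}
```

A chain of additions between a load and a store would need no masking in this 
model; only the final store canonicalizes.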


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108643/new/

https://reviews.llvm.org/D108643

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
