Hi Robin:

I just got a few more clarifications from Andrew about the behavior of
the valid* LMUL settings for ELEN=32.

* "Valid" may not be the precise word; what I mean is that the spec
guarantees these settings must be implemented.

The spec [1] says:

---

When LMUL < SEWMIN/ELEN, there is no guarantee an implementation would
have enough bits in the fractional vector register to store at least
one element, as VLEN=ELEN is a valid implementation choice. For
example, with VLEN=ELEN=32, and SEWMIN=8, an LMUL of 1/8 would only
provide four bits of storage in a vector register.

For a given supported fractional LMUL setting, implementations must
support SEW settings between SEWMIN and LMUL * ELEN, inclusive.

---

So the valid SEW ranges (out of SEW = 8, 16, 32) for each fractional
LMUL with ELEN=32 are:

mf8 = [8, (1/8)*32] = [8, 4] = [], no SEW is valid with mf8 for ELEN = 32
mf4 = [8, (1/4)*32] = [8, 8] = only SEW 8 with mf4 is valid
mf2 = [8, (1/2)*32] = [8, 16] = SEW 8 and 16 with mf2 are valid
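
As a quick sanity check of the ranges above, here is a minimal sketch
that enumerates the SEW values in [SEWMIN, LMUL * ELEN]. The helper name
valid_sews and its default arguments are my own, for illustration only;
it assumes SEW takes power-of-two values starting at SEWMIN=8.

```python
from fractions import Fraction


def valid_sews(lmul, elen=32, sewmin=8):
    """List the SEW values in [sewmin, lmul * elen], inclusive.

    Hypothetical helper: assumes SEW doubles from sewmin upward.
    """
    upper = lmul * elen  # upper bound from the spec text: LMUL * ELEN
    sews = []
    sew = sewmin
    while sew <= upper:
        sews.append(sew)
        sew *= 2
    return sews


for name, lmul in [("mf8", Fraction(1, 8)),
                   ("mf4", Fraction(1, 4)),
                   ("mf2", Fraction(1, 2))]:
    print(name, valid_sews(lmul))
```

With ELEN=32 this gives an empty list for mf8, only SEW 8 for mf4, and
SEW 8 and 16 for mf2, matching the ranges listed above.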

[1] https://github.com/riscvarchive/riscv-v-spec/blob/master/v-spec.adoc#342-vector-register-grouping-vlmul20


>> >> In particular, how would the same LMUL for AVL=2 and AVL=4 and the same
>> >> data type be correct?
>> >
>> > That's right. The case just allocates more space, but storing 2 and 4
>> > elements remains the same.
>>
>> Even if a V2SI with LMUL=1 on VLEN=128 doesn't lead to a SIGILL right away
>> it would surely modify the overlap constraints and such.  To me that doesn't
>> look right.

I am not sure I got the point. We are using early clobber to avoid the
overlap constraint, which is a pretty conservative approach and should not
lead to a problem here when we use LMUL=1/SEW=32 for V2SI. Or are you
worried that we may put two V2SI values within a single vector register?
