Hi @tqchen,
I will try to comment sporadically, since this is a project I prototyped (and 
enjoyed :) ) when I was at Arm. 

If I understand your comment correctly, what @MeeraN7 is doing is closer to 
what you are proposing. Instead of transforming a loop into a Ramp and passing 
the Ramp "as is" to LLVM (which is what is done for fixed vector lengths, but 
is *not doable* for SVE), @MeeraN7 is legalising the loop in TIR and passing the 
legal loop down to the LLVM code generator. In other words, the following loop:
```
for i = 1:1:10
A[i] = B[i]+C[i];
end
```
Once legalised, it becomes:
```
for i = 1:VL:10
A[i:i+VL-1] = B[i:i+VL-1]+C[i:i+VL-1]
end
```

The LLVM code generator, knowing that this is a variable-length loop, then 
translates it using LLVM intrinsics for:
* Predicated load/store
* Loop increment
* Predication mask calculation
Please note that only loads and stores need to be predicated. Other 
register-to-register operations (e.g., add/sub/mul) won't need predication. 
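
To make the three steps above concrete, here is a minimal Python sketch that simulates what the generated loop does for `A = B + C`. It is only an illustration under my own assumptions: `whilelt`, `predicated_load`, `predicated_store` and `vla_add` are hypothetical helper names (the first is named after the SVE instruction), not TVM or LLVM APIs, and `vl` stands in for the hardware vector length that is only known at run time.

```python
def whilelt(start, n, vl):
    """Predication mask: lane k is active only while start + k < n."""
    return [start + k < n for k in range(vl)]


def predicated_load(src, start, mask):
    """Load one vector; inactive lanes read nothing and are filled with 0."""
    return [src[start + k] if active else 0 for k, active in enumerate(mask)]


def predicated_store(dst, start, values, mask):
    """Store one vector; inactive lanes leave memory untouched."""
    for k, active in enumerate(mask):
        if active:
            dst[start + k] = values[k]


def vla_add(b, c, vl):
    """Compute a[i] = b[i] + c[i] in steps of vl lanes (hypothetical sketch)."""
    n = len(b)
    a = [0] * n
    i = 0
    while i < n:
        mask = whilelt(i, n, vl)              # predication mask calculation
        vb = predicated_load(b, i, mask)      # predicated loads
        vc = predicated_load(c, i, mask)
        va = [x + y for x, y in zip(vb, vc)]  # plain add, no predicate needed
        predicated_store(a, i, va, mask)      # predicated store
        i += vl                               # loop increment by vector length
    return a


b = list(range(10))
c = [10 * x for x in b]
result = vla_add(b, c, vl=4)  # 10 is not a multiple of 4: the last mask is partial
```

The same code gives the same result for any `vl`, which is the point of the vector-length-agnostic approach: the tail iteration is handled by the mask rather than by a separate scalar epilogue. The add can run on all lanes because the values in inactive lanes are never stored.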

Please also note that while predication is not a TIR concept, we use it to 
support VLA (cc @tkonolige) in the LLVM codegen. In the future it should be 
quite straightforward to expose predication in TIR as well (if required). 

@MeeraN7 feel free to jump in if something I said is not correct (very likely 
:) )

