Hi @tqchen, I will try to comment sporadically, since this is a project I prototyped (and enjoyed :) ) when I was at Arm.

If I understand your comment correctly, what @MeeraN7 is doing is closer to what you are proposing. Instead of transforming a loop into a Ramp and passing the Ramp "as is" to LLVM (which is what is done for fixed vector lengths, but is *not doable* for SVE), @MeeraN7 is legalising the loop in TIR and passing the legal loop down to the LLVM code generator. In other words, the following loop:

```
for i = 1:1:10
  A[i] = B[i]+C[i];
end
```

once legalised becomes

```
for i = 1:VL:10
  A[VL] = B[VL]+C[VL]
end
```

The LLVM code generator, knowing that this is a variable-length loop, then translates it using LLVM intrinsics for:

* Predicated load/store
* Loop increment
* Predication mask calculation

Please note that only load/store needs to be predicated. Other register-to-register operations (e.g., add/sub/mul) won't need predication. Please also note that while predication is not a TIR concept, we use it to support VLA (cc @tkonolige) in the LLVM codegen. In the future it should be quite straightforward to expose predication in TIR as well (if required).

@MeeraN7 feel free to jump in if something I said is not correct (very likely :) )
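
To make the legalised loop a bit more concrete, here is a rough Python simulation of what the three pieces (mask calculation, predicated load/store, loop increment) do. This is only a sketch of the idea, not the TVM or LLVM implementation: the function name `vla_add` and the parameter `vl` are made up for illustration, and plain lists stand in for vector registers.

```python
def vla_add(b, c, vl):
    """Simulate A[i] = B[i] + C[i] with a vector-length-agnostic loop.

    `vl` is a hypothetical stand-in for the hardware vector length,
    which the real codegen would only know at run time.
    """
    n = len(b)
    a = [0] * n
    i = 0
    while i < n:
        # Predication mask calculation: lane k is active only if i + k < n.
        mask = [i + k < n for k in range(vl)]

        # Predicated loads: inactive lanes do not touch memory and yield 0.
        vb = [b[i + k] if m else 0 for k, m in enumerate(mask)]
        vc = [c[i + k] if m else 0 for k, m in enumerate(mask)]

        # Register-to-register add: runs on all lanes, no predication needed.
        va = [x + y for x, y in zip(vb, vc)]

        # Predicated store: only active lanes are written back.
        for k, m in enumerate(mask):
            if m:
                a[i + k] = va[k]

        # Loop increment: advance by the vector length.
        i += vl
    return a
```

Calling `vla_add(B, C, vl)` gives the same result as the scalar loop for any `vl`, including values that do not divide the trip count, since the mask handles the tail iteration.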