Thanks @MeeraN7 @giuseros. To make the discussion more concrete, right now the IR after legalization looks like:
```c++
for (i: int32, 0, 17;i+=VL) {
  C_2[ramp(i, 1, VL)] = ((int8xVL*)A_2[ramp(i, 1, VL)] + (int8xVL*)B_2[ramp(i, 1, VL)])
}
```

This would require changes to, for example, the ramp data structure and the data types in order to support the VL vector types. I wonder whether those changes are necessary. Here is a possible alternative:

```c++
for (i: int32, 0, 17;i, annotation={"VLA"}) {
  C_2[i] = A_2[i] + B_2[i];
}
```

We would then defer vectorized instruction generation to the codegen phase, by specially handling the patterns in a for loop that is annotated as a VLA loop. Of course, we can only support a limited set of patterns (such as reads/writes to the same vector index, or limited reduction support). That is why legalization is needed: it makes sure the body of the VLA for loop satisfies one of the supported patterns.

This way we can likely get a similar result without having to hack in a ramp with VL size.
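To illustrate the kind of lowering I have in mind, here is a rough scalar emulation in plain C++ of what codegen could produce for the annotated loop. This is only a sketch under assumptions: `vla_add` and `vl` are hypothetical names and not part of TVM, `vl` stands in for the hardware vector length that is only known at run time, and the inner loop stands in for a single predicated vector instruction.

```c++
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a TVM API): emulates a possible lowering of
//   for (i, 0, n, annotation={"VLA"}) { C[i] = A[i] + B[i]; }
// `vl` models the runtime vector length in lanes.
std::vector<int8_t> vla_add(const std::vector<int8_t>& a,
                            const std::vector<int8_t>& b,
                            std::size_t vl) {
  assert(vl > 0);
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  std::vector<int8_t> c(n, 0);
  // Strip-mine the scalar loop by the runtime vector length.
  for (std::size_t i = 0; i < n; i += vl) {
    // Number of active lanes; on the last iteration this plays the role
    // of the predicate that masks off lanes past the end of the buffer.
    const std::size_t active = std::min(vl, n - i);
    // Each pass over `lane` models one predicated vector load/add/store.
    for (std::size_t lane = 0; lane < active; ++lane) {
      c[i + lane] = static_cast<int8_t>(a[i + lane] + b[i + lane]);
    }
  }
  return c;
}
```

The point is that the loop body stays scalar in the IR, and the result does not depend on the value of `vl`, so no VL-sized ramp or `int8xVL` type has to appear before codegen.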