TVMScript also provides `doc.NodeVisitor` and `doc.NodeTransformer` for AST-level
mutations. In our private fork we collect all left-hand-side variables with multiple
`Assign` or `AugAssign` bindings. They are implicitly transformed into scalar
buffer loads and stores during parsing, to the same structure as @spectr
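
To make the collection step concrete, here is a minimal sketch of the idea. It uses Python's standard `ast.NodeVisitor` as a stand-in, on the assumption that TVMScript's `doc` classes mirror the `ast` node names; it is not the code from the fork. The names `MultiBindingCollector` and `collect_rebound_names` are hypothetical, and scoping is ignored for brevity (names in nested functions are counted together).

```python
import ast
from collections import Counter


class MultiBindingCollector(ast.NodeVisitor):
    """Count how many times each plain name is bound by Assign or AugAssign."""

    def __init__(self):
        self.bind_counts = Counter()

    def _record(self, target):
        # Only plain names are candidates; subscripts and attributes are skipped.
        if isinstance(target, ast.Name):
            self.bind_counts[target.id] += 1
        elif isinstance(target, (ast.Tuple, ast.List)):
            for elt in target.elts:
                self._record(elt)

    def visit_Assign(self, node):
        for target in node.targets:
            self._record(target)
        self.generic_visit(node)

    def visit_AugAssign(self, node):
        self._record(node.target)
        self.generic_visit(node)


def collect_rebound_names(source):
    """Return the sorted names that are bound more than once in `source`."""
    collector = MultiBindingCollector()
    collector.visit(ast.parse(source))
    return sorted(name for name, n in collector.bind_counts.items() if n > 1)


example = """
def f(a, b):
    x = a
    y = b
    x = x + 1
    y += 2
    z = x + y
    return z
"""
print(collect_rebound_names(example))  # x and y are rebound; z is bound once
```

A parser-level rewrite would then treat only the returned names as scalar buffers and leave single-binding names such as `z` as ordinary variables.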
I have an implementation in my fork, which introduces sugar at the
parser/printer level. In the IR, local variables remain buffers.
```py
A = T.alloc_cell("int32")
A = A + 1
T.cp_async(A.buffer.data, ...)
```
The parser creates a buffer with dtype `int32` and shape `[1]`, but in the
va
>From the merged PR https://github.com/apache/tvm/pull/17278/files, we can do code
>generation for `LetStmt`. For example, if we disable let-inlining in the
>Simplify pass:
```python
iter_var: T.int32() = T.ceildiv(K, block_K)
for ko in T.serial(iter_var):
    ...
```
The generated CUDA code bec