Hi @slyubomirsky @tqchen, can we enable multiple outputs for `call_tir_inplace`?

We have a use case in MLC-LLM that fuses rotary embedding and FlashAttention into a single kernel. The programming interface looks like this:
```
@T.prim_func
def fused_rotary_flashattention(
    k: T.Buffer(...), q: T.Buffer(...), v: T.Buffer(...), output: T.Buffer(...)
):
    ...

updated_k, output = T.call_tir_inplace(
    fused_rotary_flashattention(k, q, v), (update_k_shape, output_shape)
)
```

The update to `k` happens in place (it is returned as `updated_k`), and `output` is the second result.
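To make the intended semantics concrete, here is a rough toy model in plain Python, with lists standing in for buffers. It is only a sketch of the behavior being asked for: `call_inplace`, `inplace_indices`, and `out_sizes` are hypothetical names made up for this illustration, not TVM APIs, and the kernel body is a placeholder rather than real rotary embedding or attention math.

```python
# Toy model of the requested semantics; plain Python lists stand in for buffers.
# `call_inplace`, `inplace_indices` and `out_sizes` are hypothetical names.

def fused_kernel(k, q, v, output):
    """Stand-in for the fused kernel: mutates k in place and fills output."""
    for i in range(len(k)):
        k[i] = -k[i]                    # placeholder for the rotary update on k
        output[i] = k[i] * q[i] + v[i]  # placeholder for the attention result


def call_inplace(func, args, inplace_indices, out_sizes):
    """Run func(*args, *fresh_outputs) and return one result per output.

    inplace_indices[i] >= 0: result i aliases args[inplace_indices[i]].
    inplace_indices[i] == -1: result i is a freshly allocated buffer
    holding out_sizes[i] elements.
    """
    fresh = [[0.0] * n for idx, n in zip(inplace_indices, out_sizes) if idx == -1]
    func(*args, *fresh)
    fresh_iter = iter(fresh)
    return tuple(
        args[idx] if idx >= 0 else next(fresh_iter) for idx in inplace_indices
    )


k, q, v = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
# First result aliases argument 0 (k); second result is newly allocated.
updated_k, output = call_inplace(fused_kernel, (k, q, v), [0, -1], [2, 2])
```

In this model `updated_k` is the very same object as `k`, so no copy of `k` is made, while `output` is a new buffer. That mix of aliased and freshly allocated results in one call is what the multiple-output request amounts to.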





---
[Visit 
Topic](https://discuss.tvm.apache.org/t/discuss-inplace-update-in-dataflow-block/14669/9)
 to respond.
