Hi @slyubomirsky @tqchen, can we enable multiple outputs for `call_tir_inplace`?
We have a use case in MLC-LLM: fusing rotary embedding and FlashAttention into a single kernel. The programming interface would look like this:

```
@T.prim_func
def fused_rotary_flashattention(k: T.Buffer(...), q: T.Buffer(...), v: T.Buffer(...), output: T.Buffer(...)):
    ...

updated_k, output = T.call_tir_inplace(fused_rotary_flashattention(k, q, v), (update_k_shape, output_shape))
```

Here the update to `k` happens in place, while `output` is a separate result.
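To make the intended semantics concrete, here is a rough NumPy sketch. It is not TVM code: the helper `rotary_inplace`, the simplified rotary formula, and the 2-D `(seq_len, dim)` layout are all assumptions for illustration only. The point is just that one call yields two results, where the first aliases the input `k` and the second is a separate output buffer.

```python
import numpy as np


def rotary_inplace(x: np.ndarray, theta: float = 10000.0) -> None:
    """Hypothetical helper: apply a simplified rotary embedding to x in place.

    x has shape (seq_len, dim) with an even dim.
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = theta ** (-np.arange(half) / half)           # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    # Copy the halves first so the writes below do not read overwritten data.
    x1, x2 = x[:, :half].copy(), x[:, half:].copy()
    x[:, :half] = x1 * cos - x2 * sin
    x[:, half:] = x1 * sin + x2 * cos


def fused_rotary_attention(k, q, v, output):
    """Stand-in for the fused kernel: k is overwritten, output is filled."""
    rotary_inplace(k)            # in-place update of k
    q_rot = q.copy()             # q itself is left untouched in this sketch
    rotary_inplace(q_rot)
    scores = q_rot @ k.T / np.sqrt(k.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    output[...] = weights @ v    # write into the caller-provided buffer
    return k, output             # two results; the first aliases the input k


rng = np.random.default_rng(0)
k = rng.standard_normal((4, 8))
q = rng.standard_normal((4, 8))
v = rng.standard_normal((4, 8))
out = np.empty((4, 8))
k_before = k.copy()

updated_k, result = fused_rotary_attention(k, q, v, out)
```

In this sketch `updated_k` is the same array object as `k`, which is the behavior we would like from a multi-output `call_tir_inplace`: some outputs reuse an argument's storage while others are separate buffers.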