Indeed, that could be the case. But if, for example, I want to tensorize 
this TIR to leverage specific instructions on a particular HW, being able 
to merge the bias into the update block could be highly beneficial. 

It is very common, for example, for vector ISAs to provide some kind of 
multiply-accumulate (MACC) instruction. I could preload the bias into a 
register and then accumulate the products directly onto that register.
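
To make the idea concrete, here is a minimal sketch in plain Python (not TVM's API; the function names `matmul_then_bias` and `matmul_bias_in_init` are made up for illustration). It assumes a 2-D matmul with a per-column bias and only compares the two loop structures: zero-initialising the accumulator and adding the bias in a separate pass, versus preloading the bias and accumulating onto it.

```python
def matmul_then_bias(A, B, bias):
    """Zero-init accumulator, then add the bias in a separate epilogue pass."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = 0  # init block: accumulator starts at zero
            for r in range(k):
                acc += A[i][r] * B[r][j]  # update block: multiply-accumulate
            C[i][j] = acc
    # Separate elementwise pass over the whole output to add the bias.
    return [[C[i][j] + bias[j] for j in range(m)] for i in range(n)]


def matmul_bias_in_init(A, B, bias):
    """Preload the bias into the accumulator, then multiply-accumulate onto it."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            acc = bias[j]  # accumulator is preloaded with the bias
            for r in range(k):
                acc += A[i][r] * B[r][j]  # same multiply-accumulate as above
            C[i][j] = acc
    return C


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
bias = [10, 20]
print(matmul_then_bias(A, B, bias))
print(matmul_bias_in_init(A, B, bias))
```

With integer inputs both variants give identical results. With floating-point data the summation order differs, so the results may differ by rounding.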





