Indeed, that could be the case. However, if I want to tensorize this TIR to use specific instructions on a particular hardware target, being able to merge the bias into the update block could be highly beneficial.
Vector ISAs very commonly provide some kind of multiply-accumulate (macc) instruction. I could, for example, preload the bias into a register and then accumulate the products directly into that preloaded register.
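To make the idea concrete, here is a minimal plain-Python sketch, not TIR and not tied to any TVM API. The function names `matmul_then_bias` and `matmul_bias_in_init` are hypothetical and exist only for illustration. The first version zero-initialises the accumulator and adds the bias in a separate pass. The second preloads the bias into the accumulator, so the inner loop is a pure multiply-accumulate chain, which is the shape a macc instruction would want.

```python
def matmul_then_bias(A, B, bias):
    """Two-stage form: zero-initialised reduction, then a separate bias add."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for kk in range(k):
                C[i][j] += A[i][kk] * B[kk][j]
    # Separate elementwise pass over the whole output.
    return [[C[i][j] + bias[j] for j in range(m)] for i in range(n)]


def matmul_bias_in_init(A, B, bias):
    """Fused form: the accumulator starts from the bias instead of zero."""
    n, k, m = len(A), len(B), len(B[0])
    out = []
    for i in range(n):
        row = []
        for j in range(m):
            acc = bias[j]  # "preload the bias into a register"
            for kk in range(k):
                acc += A[i][kk] * B[kk][j]  # one macc per step
            row.append(acc)
        out.append(row)
    return out


# Small integer example so both forms can be compared exactly.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
bias = [10, 20]

two_stage = matmul_then_bias(A, B, bias)
fused = matmul_bias_in_init(A, B, bias)
print(two_stage)  # [[29, 42], [53, 70]]
print(fused)      # [[29, 42], [53, 70]]
```

With integers the two forms agree exactly. With floating-point data they may differ in the last bits, because the bias is added at a different point in the summation order.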