I've been exploring quantization in TVM, and one thing I found is that there is a 
special compute/schedule for running int8 conv2d on the CPU ([see 
here](https://github.com/apache/tvm/blob/main/python/tvm/topi/x86/conv2d_int8.py#L132)).
  From what I can tell, it seems to be pretty much the same as the standard CPU 
spatial pack convolution.

To explore this, I tried disabling the special compute/schedule and letting the 
quantized model use the standard spatial pack algorithm (just running with int8 
data).  When I do this, I see the expected slowdown compared to the specialised 
version; however, I also see an unexpected slowdown compared to the `float32` 
version of the same algorithm.

For [a simple 
example](https://gist.github.com/Wheest/42df546cedf084eaf8a4206c19a273b4) I get 
the following results:

```
default int8: 7.529054908081889
modified int8: 23.42591354623437
normal float32: 11.465726513415575
```

(Disabling the algorithm is very simple: just comment out the if block that 
checks for int8 
[here](https://github.com/apache/tvm/blob/70884e957aa5c8de9c02c25a14d30563d7300cb9/python/tvm/relay/op/strategy/x86.py#L117)).
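For context, the dispatch being disabled looks roughly like this (paraphrased 
and simplified, not the exact source; the real check also verifies that the 
target CPU actually supports fast int8 instructions, not just the dtypes):

```python
# Simplified sketch of the conv2d strategy dispatch in
# python/tvm/relay/op/strategy/x86.py (paraphrased, not verbatim).
if layout == "NCHW":
    if data.dtype in ("int8", "uint8") and kernel.dtype in ("int8", "uint8"):
        # This is the block I comment out, so int8 models fall through
        # to the standard implementation below.
        strategy.add_implementation(
            wrap_compute_conv2d(topi.x86.conv2d_nchw_int8),
            wrap_topi_schedule(topi.x86.schedule_conv2d_nchw_int8),
            name="conv2d_nchw_int8.x86",
        )
    else:
        strategy.add_implementation(
            wrap_compute_conv2d(topi.x86.conv2d_nchw),
            wrap_topi_schedule(topi.x86.schedule_conv2d_nchw),
            name="conv2d_nchw.x86",
        )
```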

My main question is: why am I seeing a slowdown when using the standard 
convolution approach with int8 data?

Surely the operations would be the same, just using integers?  And on most 
CPUs those would take fewer clock cycles, not more.  Where is that overhead 
coming from?

I would assume that the specialised compute/schedule better exploits the 
quantization (e.g. the fact that you can pack more values into a SIMD 
register).  However, that still doesn't explain why `modified` is slower than 
`normal`.
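For concreteness, my mental model of the inner loop in both cases is something 
like the sketch below (a numpy-level illustration of the multiply-accumulate, 
not the actual TOPI compute).  The int8 version has to accumulate into int32 
(that's what `out_dtype="int32"` means for the quantized conv2d), so is that 
extra widening where the cost creeps in, or is it something else?

```python
import numpy as np

# Placeholder reduction length for illustration.
K = 256
a_f32 = np.random.uniform(-1, 1, K).astype("float32")
w_f32 = np.random.uniform(-1, 1, K).astype("float32")
a_i8 = np.random.randint(-128, 128, K).astype("int8")
w_i8 = np.random.randint(-128, 128, K).astype("int8")

# float32: each step can map onto a single fused multiply-add on most CPUs.
acc_f32 = np.float32(0)
for k in range(K):
    acc_f32 += a_f32[k] * w_f32[k]

# int8: each product must be widened to int32 before accumulating, so each
# step is widen + multiply + add, unless the schedule targets a fused int8
# dot-product instruction that does the widening for free.
acc_i32 = np.int32(0)
for k in range(K):
    acc_i32 += np.int32(a_i8[k]) * np.int32(w_i8[k])
```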




