Hi all,

I would like to contribute to this project by implementing 8-bit quantization 
for 3D convolution. My implementation currently works fine without auto-tuning, 
and it closely mirrors the 2D case:

1. Reshape the input data and the kernel so that the convolution computation 
can be vectorized.

2. Perform the convolution computation in a vectorized fashion via dp4a.

3. Reshape the output
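The pack/compute/unpack steps above can be sketched in plain NumPy. This is only an illustration with hypothetical shapes: the channel axis of an NCDHW tensor is packed into blocks of 4 int8 values (the NCDHWc layout), and a single dp4a instruction is emulated as a 4-element int8 dot product accumulated in int32.

```python
import numpy as np

# Hypothetical sizes: pack the channel axis of an NCDHW tensor into
# blocks of 4 int8 values (NCDHWc layout), the granularity dp4a consumes.
N, C, D, H, W = 1, 8, 2, 4, 4
x = np.random.randint(-128, 128, size=(N, C, D, H, W), dtype=np.int8)

# Step 1: reshape so the innermost axis holds 4 consecutive channels.
x_packed = x.reshape(N, C // 4, 4, D, H, W).transpose(0, 1, 3, 4, 5, 2)
assert x_packed.shape == (N, C // 4, D, H, W, 4)

# Step 2: dp4a computes dot(int8[4], int8[4]) accumulated into int32.
# Emulate one such instruction on the first 4-channel block:
a = x_packed[0, 0, 0, 0, 0].astype(np.int32)
b = np.array([1, -2, 3, -4], dtype=np.int32)  # hypothetical kernel block
acc = np.int32(0)
acc += np.dot(a, b)  # one dp4a: 4 multiplies + one int32 accumulate

# Step 3: reshape the packed layout back to NCDHW (shown on the input).
x_back = x_packed.transpose(0, 1, 5, 2, 3, 4).reshape(N, C, D, H, W)
assert np.array_equal(x_back, x)
```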

The 8-bit convolution outputs are relatively close to those of the standard 
convolution.

The auto-tuning step runs smoothly (though it takes longer) and outputs a 
log file with the optimal configuration for the 3D convolution 
(conv3d_ncdhw_int8).

However, during the compilation phase, I sometimes encounter the following 
error:
```
[12:11:45] /usr/tvm/src/tir/transforms/loop_partition.cc:548: Cannot prove: ((((((floordiv(((threadIdx.z*2) + 1), 4) + 1) - floordiv(threadIdx.z, 2)) - 1) - (29 - (blockIdx.z*4))) + 1) >= 0), when generating the post doubt loop
Traceback (most recent call last):
  File "tune_relay_cuda_int8.py", line 508, in <module>
    tune_and_evaluate(tuning_option)
  File "tune_relay_cuda_int8.py", line 409, in tune_and_evaluate
    graph, lib, params = relay.build_module.build(mod, target=target, params=params)
  File "/usr/tvm/python/tvm/relay/build_module.py", line 260, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)
  File "/usr/tvm/python/tvm/relay/build_module.py", line 127, in build
    self._build(mod, target, target_host)
  File "tvm/_ffi/_cython/./packed_func.pxi", line 322, in tvm._ffi._cy3.core.PackedFuncBase.__call__
  File "tvm/_ffi/_cython/./packed_func.pxi", line 257, in tvm._ffi._cy3.core.FuncCall
  File "tvm/_ffi/_cython/./packed_func.pxi", line 246, in tvm._ffi._cy3.core.FuncCall3
  File "tvm/_ffi/_cython/./base.pxi", line 160, in tvm._ffi._cy3.core.CALL
tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /usr/tvm/build/libtvm.so(tvm::build(tvm::Map<tvm::runtime::String, tvm::IRModule, void, void> const&, tvm::Target const&)+0x83c) [0x7fd6f772267c]
  [bt] (7) /usr/tvm/build/libtvm.so(tvm::build(tvm::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)+0x2c7) [0x7fd6f7721397]
  [bt] (6) /usr/tvm/build/libtvm.so(tvm::SplitDevHostFuncs(tvm::IRModule, tvm::Target const&, tvm::Target const&, tvm::transform::PassContext const&)+0x488) [0x7fd6f771fff8]
  [bt] (5) /usr/tvm/build/libtvm.so(tvm::transform::Pass::operator()(tvm::IRModule) const+0x6a) [0x7fd6f71d8e7a]
  [bt] (4) /usr/tvm/build/libtvm.so(tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const+0x40e) [0x7fd6f7241d1e]
  [bt] (3) /usr/tvm/build/libtvm.so(tvm::transform::ModulePassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const+0x1e2) [0x7fd6f723fe52]
  [bt] (2) /usr/tvm/build/libtvm.so(+0x8d347c) [0x7fd6f74d147c]
  [bt] (1) /usr/tvm/build/libtvm.so(tvm::tir::MakePackedAPI(tvm::tir::PrimFunc&&, int)+0x2d19) [0x7fd6f74ce7a9]
  [bt] (0) /usr/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x61) [0x7fd6f7138f91]
  File "/usr/tvm/src/tir/transforms/make_packed_api.cc", line 210
TVMError: Not all Vars are passed in api_args:  'threadIdx.z'  is not bound to any variables
```

Depending on the optimization found by the auto-tuner, this error may or may 
not occur. For instance, by modifying the log produced during auto-tuning, I 
am able to make an invalid configuration work.

Invalid configuration:
```
{"input": ["cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32", "conv3d_NCDHWc_int8.cuda", [["TENSOR", [1, 128, 18, 56, 56], "int8"], ["TENSOR", [128, 128, 3, 3, 3], "int8"], [1, 1, 1], [1, 1, 1, 1, 1, 1], [1, 1, 1], "NCDHW", "int32"], {}], "config": {"index": 77070610321, "code_hash": null, "entity": [["tile_f", "sp", [-1, 1, 8, 2]], ["tile_d", "sp", [-1, 1, 1, 2]], ["tile_y", "sp", [-1, 1, 7, 2]], ["tile_x", "sp", [-1, 2, 1, 1]], ["fuse_yx", "ot", 0], ["tile_rc", "sp", [-1, 1]], ["tile_rd", "sp", [-1, 1]], ["tile_ry", "sp", [-1, 1]], ["tile_rx", "sp", [-1, 1]], ["reorder_inner", "re", [1, 2, 0, 3]], ["auto_unroll_max_step", "ot", 1500]]}, "result": [[0.0027175069], 0, 11.701743602752686, 1603898087.1376908], "version": 0.2, "tvm_version": "0.8.dev0"}
```

Valid configuration:
```
{"input": ["cuda -keys=cuda,gpu -max_num_threads=1024 -model=unknown -thread_warp_size=32", "conv3d_NCDHWc_int8.cuda", [["TENSOR", [1, 128, 18, 56, 56], "int8"], ["TENSOR", [128, 128, 3, 3, 3], "int8"], [1, 1, 1], [1, 1, 1, 1, 1, 1], [1, 1, 1], "NCDHW", "int32"], {}], "config": {"index": 77070610321, "code_hash": null, "entity": [["tile_f", "sp", [-1, 1, 8, 1]], ["tile_d", "sp", [-1, 1, 1, 2]], ["tile_y", "sp", [-1, 1, 7, 2]], ["tile_x", "sp", [-1, 2, 1, 1]], ["fuse_yx", "ot", 0], ["tile_rc", "sp", [-1, 1]], ["tile_rd", "sp", [-1, 1]], ["tile_ry", "sp", [-1, 1]], ["tile_rx", "sp", [-1, 1]], ["reorder_inner", "re", [1, 2, 0, 3]], ["auto_unroll_max_step", "ot", 1500]]}, "result": [[0.0027175069], 0, 11.701743602752686, 1603898087.1376908], "version": 0.2, "tvm_version": "0.8.dev0"}
```
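The two records above differ only in the last `tile_f` split factor (2 in the invalid one, 1 in the valid one). Since each log line is plain JSON, the edit can be done with the standard library; this is just a sketch of patching one abridged record in memory, not a recommended fix:

```python
import json

# One auto-tuning log record, abridged to the relevant knobs.
record = json.loads(
    '{"config": {"entity": [["tile_f", "sp", [-1, 1, 8, 2]],'
    ' ["tile_d", "sp", [-1, 1, 1, 2]]]}}'
)

# Patch the last tile_f split factor (2 -> 1), the only difference
# between the invalid and the valid configuration shown above.
for knob in record["config"]["entity"]:
    if knob[0] == "tile_f":
        knob[2][-1] = 1

# The patched record can then be written back as one line of the log file.
print(json.dumps(record))
```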

I am not sure how to solve this problem. What would you advise?





---
[Visit Topic](https://discuss.tvm.apache.org/t/quantization-and-3d-convolution/8338/1) to respond.
