Sorry for the delayed reply in this discussion. Here are a few thoughts.

Let us pick a concise namespace for the quantization dialect. Two possible candidates:
- ```relay.op.qnn```, e.g. ```relay.op.qnn.conv2d```
  - The qnn name is consistent with QNNPack
- ```relay.op.tflite```
  - The op name is a dialect.

In both cases, they are a dialect of relay, which means that by default we do not want to introduce a special implementation, but will instead translate them into existing core ops. We need to have a special op_level for these core ops.

I still think we should minimize the number of operators, and directly translate to lower ops if possible. This includes things like ```quantize/dequantize``` and ```qnn.concat```. Please discuss this alternative and list pros and cons.
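
To make the "translate to lower ops" idea concrete, here is a minimal sketch of how quantize/dequantize could be expressed purely through primitive elementwise steps (divide, round, add, clip, cast, subtract, multiply). This is plain Python rather than TVM or relay code; the function names, the affine scale/zero-point scheme, and the default uint8 range are assumptions made for illustration, not something settled in this thread.

```python
# Hypothetical sketch: quantize/dequantize built only from primitive
# elementwise steps. Names, the affine scheme, and the uint8 range
# are illustrative assumptions, not an existing relay API.

def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Compute clip(round(x / scale) + zero_point, qmin, qmax) per element."""
    out = []
    for x in values:
        q = round(x / scale) + zero_point  # divide, round, add
        q = max(qmin, min(qmax, q))        # clip to the integer range
        out.append(int(q))                 # cast to an integer type
    return out


def dequantize(qvalues, scale, zero_point):
    """Compute (q - zero_point) * scale per element."""
    return [(q - zero_point) * scale for q in qvalues]
```

If a decomposition along these lines is acceptable, the dialect op would need no dedicated kernel and could reuse whatever schedules the core ops already have. The trade-off to weigh is that the quantization intent is no longer visible as a single op after translation.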