Hey Chris,
The two extensible bits will be done through user-defined callable functions.

For the green list/gray list/red list situation, we have the user define a function which, given a Call node, returns the color of the operation. For the initial implementation we will just do a naive solution like placing all conv2d's in the green list, all elementwise ops in the gray list, etc.

For the accumulation datatype, I imagine a user-defined function which, given a Call node, returns the accumulation datatype and the output datatype of the operation. The accumulation datatype is self-explanatory and, confusingly, maps to the existing "out_dtype" field in existing Relay ops like conv and dense. Our "new" output datatype, meanwhile, tells us at what precision other operations will ingest the results of the operation. For example:

weight (fp16 or fp32), data (fp16 or fp32) -> conv2d (accumulation_dtype) -> cast(output_dtype)

If accumulation_dtype == output_dtype then we don't need the cast.

Finally, to answer your question: in the scenario given we would simply express conv2d as an operator with an accumulation_dtype of fp32 and an output_dtype of fp16. This should give the final graph listed (I don't know about the operator fusion part though, to be honest. I'm not sure all the knobs on the fused operator are there; if not, I guess I have to do something about that too). So in a sense we do have a separate "accumulation_dtype" and "output_dtype" that the user can define on a per-operation basis. I hope that answers your question and I hope it is sufficient for most applications!

For the default I am going to do something simple: all operators which support an accumulation datatype separate from their input datatype accumulate into fp32 but output fp16. Otherwise, we assume the operator accumulates in fp16 and outputs fp16 (e.g. for elementwise operators). There are downsides with this simplistic method.
One downside is that the only sort of analysis that is easy with this framework is looking at the current operator. That is to say, it's kind of cumbersome to look ahead or backward to make decisions. There are some other theoretical limitations on the graphs it can easily generate, but I think it covers most reasonable scenarios!
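
To make the two callbacks and the default policy a bit more concrete, here is a rough sketch in plain Python. It is not the actual TVM implementation: `Call` is a hypothetical stand-in for a Relay Call node, and `Color`, `color_of`, `dtypes_of`, `describe`, the op sets, and the choice to put unlisted ops in the red list are all names and assumptions made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Color(Enum):
    GREEN = "green"
    GRAY = "gray"
    RED = "red"


@dataclass
class Call:
    """Hypothetical stand-in for a Relay Call node; only the op name is kept."""
    op_name: str


# Illustrative groupings only; in the proposal these are user-defined.
GREEN_OPS = {"nn.conv2d", "nn.dense"}
GRAY_OPS = {"add", "multiply", "nn.relu"}
# Ops assumed to support an accumulation dtype separate from the input dtype.
ACCUMULATING_OPS = {"nn.conv2d", "nn.dense"}


def color_of(call: Call) -> Color:
    """User-defined callback: map a call to its list color."""
    if call.op_name in GREEN_OPS:
        return Color.GREEN
    if call.op_name in GRAY_OPS:
        return Color.GRAY
    # Assumption of this sketch: anything unlisted goes in the red list.
    return Color.RED


def dtypes_of(call: Call) -> Tuple[str, str]:
    """User-defined callback: return (accumulation_dtype, output_dtype)."""
    if call.op_name in ACCUMULATING_OPS:
        # Default policy from the post: accumulate in fp32, output fp16.
        return "float32", "float16"
    # Otherwise assume fp16 accumulation and fp16 output.
    return "float16", "float16"


def describe(call: Call) -> str:
    """Render the op, appending a cast only when the two dtypes differ."""
    acc, out = dtypes_of(call)
    text = f"{call.op_name}(accumulate={acc})"
    if acc != out:
        text += f" -> cast({out})"
    return text


if __name__ == "__main__":
    for name in ("nn.conv2d", "add", "nn.softmax"):
        call = Call(name)
        print(color_of(call).value, "|", describe(call))
```

Note that both callbacks only ever see the current call, which is exactly the "hard to look ahead or backward" limitation mentioned above.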