Your solution makes sense to me. This mechanism exists for the case where a 
BYOC backend wants to manage constant values itself and apply its own 
processing, such as a layout transform. It works well for other codegens (e.g., 
JSON), but as you pointed out, we never really solved this problem for the C 
codegen.

IMHO, we could have a specialized mechanism for the C codegen to manage 
constants. For example, we could let the C codegen serialize the constants to a 
separate artifact file, package it along with the generated/compiled engines, 
and load the constants into memory at the first execution.
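
To make the idea concrete, here is a minimal sketch in plain Python (not TVM code) of a constants artifact that is written once at build time and loaded lazily on first use. The file layout, `save_constants`, and `LazyConstants` are all hypothetical names invented for this illustration; a real implementation would live in the C codegen and its runtime module.

```python
import json
import struct


def save_constants(path, constants):
    """Write a dict of name -> list of float32 values to one artifact file.

    Hypothetical layout: 4-byte header length, JSON header mapping each
    name to its element count, then the raw float32 data in header order.
    """
    header = json.dumps({name: len(vals) for name, vals in constants.items()})
    header_bytes = header.encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<I", len(header_bytes)))
        f.write(header_bytes)
        for vals in constants.values():
            f.write(struct.pack("<%df" % len(vals), *vals))


class LazyConstants:
    """Loads the artifact on first access, then serves it from memory."""

    def __init__(self, path):
        self.path = path
        self._cache = None
        self.load_count = 0  # only here to show the file is read once

    def get(self, name):
        if self._cache is None:
            self._cache = self._load()
        return self._cache[name]

    def _load(self):
        self.load_count += 1
        with open(self.path, "rb") as f:
            (header_len,) = struct.unpack("<I", f.read(4))
            sizes = json.loads(f.read(header_len).decode("utf-8"))
            result = {}
            # Read each constant back in the order recorded in the header.
            for name, count in sizes.items():
                data = f.read(4 * count)
                result[name] = list(struct.unpack("<%df" % count, data))
            return result
```

The generated engine would then only need to carry the path (or an embedded copy) of the artifact, and the first call pays the loading cost.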

On the other hand, BYOC backends may need to manage constants by themselves 
because the processed constants may no longer match the original typing (e.g., 
layout or data type). So another approach is to let the C codegen 
register/update constants in the metadata module. This should be done via a 
constant updater: 

https://github.com/apache/tvm/blob/70f2297191d0d2b9efb6b1b6257e6f9755f516ca/tests/python/relay/test_external_codegen.py#L183

For the second question about using the JSON runtime, the flow may look like 
the following, which is similar to TensorRT:
1. In codegen, simply output the JSON graph and constants.
2. In runtime, at the first iteration, run the C codegen according to the JSON 
graph and input data, and profile/compile the generated C code into executable 
kernels. As you can imagine, the first iteration may be very slow in this case.
3. Cache and execute the kernels.
4. In the remaining iterations, simply reuse the compiled kernels as they are.
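
The steps above can be sketched roughly as follows. This is plain Python rather than the actual TVM JSON runtime, and `JitJsonRuntime`, `compile_fn`, and the choice to key the cache on input lengths are assumptions made only for illustration; `fake_compile` stands in for "run the C codegen, then profile and compile".

```python
class JitJsonRuntime:
    """Hypothetical runtime that compiles kernels lazily and caches them."""

    def __init__(self, graph_json, constants, compile_fn):
        self.graph_json = graph_json  # step 1: output of codegen
        self.constants = constants    # step 1: constants kept alongside
        self.compile_fn = compile_fn
        self._kernels = {}            # cache: input shapes -> kernel

    def run(self, inputs):
        # Assumption: kernels are specialized per input shape.
        key = tuple(len(x) for x in inputs)
        kernel = self._kernels.get(key)
        if kernel is None:
            # Step 2: slow path, taken only at the first iteration.
            kernel = self.compile_fn(self.graph_json, self.constants, key)
            # Step 3: cache the compiled kernel.
            self._kernels[key] = kernel
        # Steps 3 and 4: execute the (possibly cached) kernel.
        return kernel(inputs)


compile_calls = []


def fake_compile(graph_json, constants, shapes):
    """Stand-in for C codegen + compilation; records each invocation."""
    compile_calls.append(shapes)
    bias = constants["bias"]

    def kernel(inputs):
        return [x + bias for x in inputs[0]]

    return kernel
```

With this structure, repeated calls with the same input shape hit the cache, so only the first call pays the compilation cost.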




