The recently merged [CUTLASS BYOC](https://github.com/apache/tvm/pull/9261) relies on the C-codegen based BYOC infra to JIT-generate and compile C++ template classes. Currently it doesn't support constants embedded in an external function; instead, it requires all weight and bias parameters to be passed in at runtime.

This caused a problem when I applied CUTLASS BYOC to a real model: I need to run constant folding to turn fp32 bias parameters into fp16, both for pattern matching and so that fp16 tensors are sent to CUTLASS. For that, I need to bind parameters to the module with `bind_params_by_name`, which embeds constants into the external functions as in the dump below. CUTLASS BYOC doesn't support this right now:

```
def @tvmgen_default_cutlass_main_267(%cutlass_267_i0: Tensor[(1024, 1024), float16], %cutlass_267_i1: Tensor[(4096, 1024), float16], Inline=1, Compiler="cutlass", global_symbol="tvmgen_default_cutlass_main_267", Primitive=1) -> Tensor[(1024, 4096), float16] {
  %9 = fn (%FunctionVar_8_0: Tensor[(1024, 1024), float16], %FunctionVar_8_1: Tensor[(4096, 1024), float16], %FunctionVar_8_2: Tensor[(4096), float16], PartitionedFromPattern="nn.dense_add_multiply_cast_erf_cast_multiply_add_multiply_", Composite="cutlass.dense_bias_gelu_fp16") -> Tensor[(1024, 4096), float16] {
    %1 = nn.dense(%FunctionVar_8_0, %FunctionVar_8_1, units=None, out_dtype="float16") /* ty=Tensor[(1024, 4096), float16] */;
    %2 = add(%1, %FunctionVar_8_2) /* ty=Tensor[(1024, 4096), float16] */;
    %3 = multiply(%2, meta[relay.Constant][0] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    %4 = cast(%3, dtype="float32") /* ty=Tensor[(1024, 4096), float32] */;
    %5 = erf(%4) /* ty=Tensor[(1024, 4096), float32] */;
    %6 = cast(%5, dtype="float16") /* ty=Tensor[(1024, 4096), float16] */;
    %7 = multiply(%6, meta[relay.Constant][1] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    %8 = add(%7, meta[relay.Constant][2] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    multiply(%8, %2) /* ty=Tensor[(1024, 4096), float16] */
  };
  // meta[relay.Constant][3] is the bias constant, not supported by CUTLASS BYOC for now
  %9(%cutlass_267_i0, %cutlass_267_i1, meta[relay.Constant][3] /* ty=Tensor[(4096), float16] */) /* ty=Tensor[(1024, 4096), float16] */
}
```
So I now need to deal with constants. I think embedding all constants into the C source is infeasible for models like `BERT-large`, which I'm working with. The alternative I can think of is to somehow "unbind" constants after constant folding. But this requires modifying the signatures of external functions and passing additional parameters inside the `main` module, and I don't see an easy way to achieve that.

My questions:

* Is there a good way to deal with constants in C-source codegen based BYOC? Has there been any improvement since last year's discussions, such as https://discuss.tvm.apache.org/t/external-codegen-constant-tensors-in-c-codegen/5890 and https://github.com/apache/tvm/pull/5310? (also cc @lhutton1 @manupa-arm @matt-arm)
* Should the CUTLASS codegen switch to the JSON runtime, which I believe has no issues with constants? How can we compile generated C source with JSON based BYOC?

cc @Laurawly @comaniac @zhiics
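
To make the "unbind" idea a bit more concrete, here is a minimal sketch on a toy IR. It is not TVM code: `Var`, `Const`, `ExternCall`, `Main`, and `unbind_constants` are all hypothetical names made up for illustration. It also flattens the nesting seen in the dump and only models calls from `main` into external functions. A real Relay pass would additionally have to rewrite the external function's own signature, which is the part I don't see an easy way to do.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Var:
    """A named runtime input (toy stand-in for a Relay variable)."""
    name: str


@dataclass
class Const:
    """A constant embedded at a call site (toy stand-in for meta[relay.Constant][i])."""
    value: object


@dataclass
class ExternCall:
    """A call from main into an external (BYOC) function."""
    symbol: str
    args: List[Union[Var, Const]]


@dataclass
class Main:
    """Toy stand-in for the main function: its parameter names and extern calls."""
    params: List[str]
    calls: List[ExternCall] = field(default_factory=list)


def unbind_constants(main: Main) -> Tuple[Main, Dict[str, object]]:
    """Replace every Const call argument with a fresh Var added to main's params.

    Returns the rewritten Main plus a dict mapping each new parameter name
    to the constant value that must now be supplied at runtime.
    The input Main is left unmodified.
    """
    new_params = list(main.params)
    lifted: Dict[str, object] = {}
    new_calls = []
    for call in main.calls:
        new_args = []
        for idx, arg in enumerate(call.args):
            if isinstance(arg, Const):
                # Hypothetical naming scheme: <extern symbol>_const_<arg index>.
                name = f"{call.symbol}_const_{idx}"
                lifted[name] = arg.value
                new_params.append(name)
                new_args.append(Var(name))
            else:
                new_args.append(arg)
        new_calls.append(ExternCall(call.symbol, new_args))
    return Main(new_params, new_calls), lifted


# Toy example: the third argument plays the role of the folded fp16 bias.
main = Main(
    params=["data", "weight"],
    calls=[ExternCall("cutlass_main_267", [Var("data"), Var("weight"), Const([0.1, 0.2])])],
)
new_main, lifted = unbind_constants(main)
print(new_main.params)
print(lifted)
```

After the rewrite, the constant is no longer embedded at the call site: it shows up as an extra parameter of `main`, and `lifted` holds the values that would have to be passed in at runtime alongside the other weights.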