The recently merged [CUTLASS BYOC](https://github.com/apache/tvm/pull/9261) 
relies on C-codegen based BYOC infra to JIT generate and compile C++ template 
classes.
 
Currently it doesn't support constants embedded in an external function, and 
instead requires all weight and bias parameters to be passed in at runtime. 
This became a problem when I applied CUTLASS BYOC to a real model: I need to 
run constant folding to turn fp32 bias parameters into fp16, both for pattern 
matching and so that fp16 tensors are sent to CUTLASS. For that, I need to bind 
parameters to the module with `bind_params_by_name`, which embeds constants 
into the external functions as shown below. CUTLASS BYOC does not support this 
right now:
```
def @tvmgen_default_cutlass_main_267(%cutlass_267_i0: Tensor[(1024, 1024), float16], %cutlass_267_i1: Tensor[(4096, 1024), float16], Inline=1, Compiler="cutlass", global_symbol="tvmgen_default_cutlass_main_267", Primitive=1) -> Tensor[(1024, 4096), float16] {
  %9 = fn (%FunctionVar_8_0: Tensor[(1024, 1024), float16], %FunctionVar_8_1: Tensor[(4096, 1024), float16], %FunctionVar_8_2: Tensor[(4096), float16], PartitionedFromPattern="nn.dense_add_multiply_cast_erf_cast_multiply_add_multiply_", Composite="cutlass.dense_bias_gelu_fp16") -> Tensor[(1024, 4096), float16] {
    %1 = nn.dense(%FunctionVar_8_0, %FunctionVar_8_1, units=None, out_dtype="float16") /* ty=Tensor[(1024, 4096), float16] */;
    %2 = add(%1, %FunctionVar_8_2) /* ty=Tensor[(1024, 4096), float16] */;
    %3 = multiply(%2, meta[relay.Constant][0] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    %4 = cast(%3, dtype="float32") /* ty=Tensor[(1024, 4096), float32] */;
    %5 = erf(%4) /* ty=Tensor[(1024, 4096), float32] */;
    %6 = cast(%5, dtype="float16") /* ty=Tensor[(1024, 4096), float16] */;
    %7 = multiply(%6, meta[relay.Constant][1] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    %8 = add(%7, meta[relay.Constant][2] /* ty=float16 */) /* ty=Tensor[(1024, 4096), float16] */;
    multiply(%8, %2) /* ty=Tensor[(1024, 4096), float16] */
  };
  // meta[relay.Constant][3] is the bias constant, not supported by CUTLASS BYOC for now
  %9(%cutlass_267_i0, %cutlass_267_i1, meta[relay.Constant][3] /* ty=Tensor[(4096), float16] */) /* ty=Tensor[(1024, 4096), float16] */
}
```

So I now need to deal with constants. I think embedding all constants into the 
C source is infeasible for models like `BERT-large`, which I'm working with. 
One alternative I can think of is to somehow "unbind" the constants after 
constant folding. But this requires modifying the signatures of the external 
functions and passing additional parameters to them from inside `main`, and I 
don't see an easy way to do that.
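
To make the "unbind" idea concrete, here is a toy sketch in plain Python. It 
does not use the TVM API: `Var`, `Const`, `ExternFunc`, `Call`, and 
`unbind_constants` are hypothetical names made up for illustration, and a real 
implementation would have to be a Relay pass over the partitioned module. It 
only shows the shape of the rewrite: each embedded constant becomes a fresh 
parameter of the external function, and the call site passes the constant 
instead.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    label: str  # stands in for a tensor such as meta[relay.Constant][3]


Expr = Union[Var, Const]


@dataclass(frozen=True)
class ExternFunc:
    name: str
    params: Tuple[Var, ...]
    inner_args: Tuple[Expr, ...]  # arguments passed to the composite function


@dataclass(frozen=True)
class Call:
    callee: str
    args: Tuple[Expr, ...]


def unbind_constants(func: ExternFunc, call: Call) -> Tuple[ExternFunc, Call]:
    """Replace each embedded constant with a fresh parameter and pass the
    constant from the call site instead (toy model, not a TVM pass)."""
    params: List[Var] = list(func.params)
    inner_args: List[Expr] = []
    call_args: List[Expr] = list(call.args)
    for arg in func.inner_args:
        if isinstance(arg, Const):
            fresh = Var(f"{func.name}_i{len(params)}")
            params.append(fresh)
            inner_args.append(fresh)
            call_args.append(arg)  # the constant now lives in the caller
        else:
            inner_args.append(arg)
    return (
        ExternFunc(func.name, tuple(params), tuple(inner_args)),
        Call(call.callee, tuple(call_args)),
    )


# Mirrors the function above: two runtime inputs plus an embedded bias constant.
func = ExternFunc(
    name="cutlass_267",
    params=(Var("cutlass_267_i0"), Var("cutlass_267_i1")),
    inner_args=(Var("cutlass_267_i0"), Var("cutlass_267_i1"), Const("bias")),
)
call = Call("cutlass_267", (Var("x"), Var("weight")))

new_func, new_call = unbind_constants(func, call)
print([p.name for p in new_func.params])
print(new_call.args)
```

After the rewrite the external function has no embedded constants left, so it 
matches what the C-source codegen already handles, and the bias constant sits 
in the caller where it can be treated like any other parameter. The hard part 
in practice, which this sketch skips entirely, is doing the same rewrite 
consistently on the real Relay functions and their call sites in `main`.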

My questions:
* Is there a good way to deal with constants in C-source codegen based BYOC? 
Has there been any improvement since last year's discussions, such as 
https://discuss.tvm.apache.org/t/external-codegen-constant-tensors-in-c-codegen/5890
 and https://github.com/apache/tvm/pull/5310? (also cc @lhutton1 @manupa-arm 
@matt-arm)
* Should the CUTLASS codegen switch to the JSON runtime, which I believe has no 
issues with constants? If so, how can we compile the generated C source with 
JSON based BYOC? cc @Laurawly @comaniac @zhiics




