##
This RFC is a case study on unifying the lower API, which had implementations in both Python and C++. I'll call the duplication of APIs in Python and C++ "bifurcation" or "bifurcation of the API". I'll conclude by analyzing how this bifurcation happened in the first place, and then present options for changing open source policy to avoid future bifurcation. You can skip ahead to "causes of the bifurcation" if you don't want to read the details of this case. ## Unifying the Lower API The PR is [here](https://github.com/apache/tvm/pull/8110) if you would like to take a look. In the case of Lower, there were two Lower functions, one in C++ and one in Python that have different signatures. The Python was introduced first, and is much more broad than the C++ API. Notably, it accepts Union types for `inputs` and `args`, which is difficult to replicate in C++. Here are the two function signatures from **before the refactor**, for reference: ```python def lower( inputs: Union[schedule.Schedule, PrimFunc, IRModule], args: Optional[List[Union[Buffer, tensor.Tensor, Var]]] = None, name: str = "main", binds: Optional[Mapping[tensor.Tensor, Buffer]] = None, simple_mode: bool = False, ) -> IRModule ``` ```cpp IRModule lower(te::Schedule sch, const Array<te::Tensor>& args, const std::string& name, const std::unordered_map<te::Tensor, tir::Buffer>& binds); ``` To preserve the behavior of the Python API, I split the C++ backend into three functions: LowerModule, LowerPrimFunc and LowerSchedule and registered these through the FFI. Then, from the python lower function, I dispatch into each of these C++ backends depending on the type passed in. This approach is OK for a small number of types, but as you can see, if we had more types in the Union or multiple arguments with Union types, we'd have to make even more C++ functions. The other really tricky part was that `args` is an Array[Union[Buffer, Var, Tensor]]. Unfortunately, Buffer, Var and Tensor's only common supertype is ObjectRef. So, to replicate the Python behavior in C++, we needed to let `args` be an Array<ObjectRef>. But, we also need to preserve the original signature for backwards compatibility. Thus, we end up with four function signatures in C++ to duplicate the behavior of one Python function, which is not ideal. ```cpp IRModule LowerModule(IRModule mod, bool simple_mode = false); IRModule LowerPrimFunc(tvm::tir::PrimFunc func, const std::string& name, bool simple_mode = false); IRModule LowerSchedule(te::Schedule sch, const Array<te::Tensor>& args, const std::string& name, const std::unordered_map<te::Tensor, tir::Buffer>& binds, bool simple_mode = false); IRModule LowerSchedule(te::Schedule sch, const Array<ObjectRef>& args, const std::string& name, const std::unordered_map<te::Tensor, tir::Buffer>& binds, bool simple_mode = false); ``` Writing the 2nd version of `LowerSchedule` that takes in an `Array<ObjectRef>` for `args` also caused some problems because the helper functions it called also needed their signatures changed. For this case study, we'll just look at `get_binds`. In C++, there is a parallel helper function `GetBinds` , but again, its signature doesn't match the signature of the Python `get_binds`. The Python version allows `args` to be `Array[Union[Buffer, Var, Tensor]]`, while the C++ version requires that `args` is an `Array<Tensor>`. ```python def get_binds(args, compact=False, binds=None): ``` The type of `args` in the Python version is `Array[Union[Buffer, Var, Tensor]]`. ```cpp void GetBinds(const Array<te::Tensor>& args, bool compact, const std::unordered_map<te::Tensor, tir::Buffer>& binds, Map<te::Tensor, tir::Buffer>* out_binds, Array<ObjectRef>* out_arg_list); ``` To replicate the behavior of the Python `get_binds`, I had to write a second version of `GetBinds` in C++ that takes in `args` as an `Array<ObjectRef>`. Additionally, there are a few places that `get_binds` is used in the codebase, so I had to register and expose `GetBinds` through the FFI to ensure backwards compatibility. The final problem I encountered was related to dead code. The C++ version is only called in one place. It calls `SchedulePostProcRewriteForTensorCore`, which causes failures in some of the calls that originate from Python but not in the calls that originate from C++. Fortunately, this function is dead code, so I just removed it. However, it could have caused further problems if it were actually required in some cases but not in others. ## Cause of the bifurcation 1. Python `lower` is introduced, and it relies on python duck typing 2. Because code is in Python, all code is sort of like a public API, so people start using `lower`, `get_binds` and `form_irmodule` in ways that they shouldn't 3. Someone needs to introduce the C++ version, but it's hard to duplicate the Python API in C++ because of duck typing. So the new C++ version doesn't match all Python cases, and Python is not removed to preserve backwards compatibility. 3. Because Python code is the primary code used, it is updated and maintained more frequently than the C++ version, which resulted in the C++ version having a call to `SchedulePostProcRewriteForTensorCore` that should have been removed from the codebase. Additionally, that was the only call to it in the entire codebase, so we were able to remove the body of `SchedulePostProcRewriteForTensorCore`, which was over 1000 LOC. ## Recommendations on OSS Policy Change First, the current TVM "policy" is that code can be implemented in Python for the prototyping and ease of implementation, and eventually, that code will be replaced with C++ and the Python will be deleted. I am not sure if this is written down anywhere, however I have heard it from community members when I ask about why there are both Python and C++ versions of functions. In my view, the main problem with this policy is that usually, the Python code is never removed from the code base, or is removed only after it has caused problems (like in the case of Lower). I see two potential solutions to prevent this from happening, which have different overheads in terms of enforcement: **Option 1**: Turn on static type checking in Python to make it easier to convert Python to C++, and require that Python code is removed from the code base when corresponding C++ is added. Currently we assume that the Python will be removed later, which I do not think is a good assumption to make. **Option 2**: Require that all new compiler APIs (as well as most logic in general) be written in C++, and only allow Python to serve as a frontend and soft wrapper of the C++ APIs. It would be great if people could weigh in on which solution they prefer, or provide alternate solutions. I personally prefer Option 2. Although it is a bit strict, I think it will be easier to enforce than Option 1, since enforcement of Option 1 requires reviewers to check if there is a Python API the C++ is duplicating. Additionally, turning on static type checking in Python will probably require lots of changes to existing code and take a long time. --- [Visit Topic](https://discuss.tvm.apache.org/t/rfc-bifurcation-of-apis-across-python-and-c-a-case-study-and-recommendations-for-prevention/10203/1) to respond. You are receiving this because you enabled mailing list mode. To unsubscribe from these emails, [click here](https://discuss.tvm.apache.org/email/unsubscribe/b9195975f92fdb6213b9c599474d455cc1943a48ca2404649170014eed054aab).