| Issue |
172169
|
| Summary |
[Clang] `clang::CodeGenerator` is completely unusable
|
| Labels |
clang:as-a-library
|
| Assignees |
|
| Reporter |
Sirraide
|
Currently, `CodeGenerator` (defined in `clang/CodeGen/ModuleBuilder.h`) is fundamentally broken in a number of ways:
- The actual implementation is in a derived class (`CodeGeneratorImpl`), meaning that creating an instance of `CodeGenerator` itself is a terrible idea (its members literally do a `static_cast<CodeGeneratorImpl *>(this)`). Despite this, you can just... create an instance of it. The default constructor is just public and available despite the fact that this is clearly supposed to be an abstract class.
- The *intended* way to create one, that being `clang::CreateLLVMCodeGen()`, just calls the `CodeGeneratorImpl` constructor, which only creates an `llvm::Module`; it, for some reason, does *not* call `CodeGeneratorImpl::Initialize()`, which performs some very important initialisation steps, such as creating the `CodeGenModule` instance that is used to perform the actual codegenning. Without calling `Initialize()`, attempting to emit anything will basically just crash.
- The class has an `ASTContext* Ctx` member, which is quite literally impossible to initialise: the constructor sets it to `nullptr`; then, `Initialize()` accepts an `ASTContext&` reference that it saves in `Ctx`, which would be fine, if it weren’t for the fact that this function is only ever called with `*Ctx` (i.e. `nullptr`) to begin with, which, on that note, is also just a guaranteed null pointer dereference.
- Calling `Initialize()` directly is impossible as it is not part of the `CodeGenerator` API, but rather an implementation detail of `CodeGeneratorImpl`. This means that there is no way to actually pass a valid `ASTContext` to `Initialize()`. The only way to call it is via `StartModule()`, which creates a new llvm module to perform codegen into.
- And on that note, `StartModule()` asserts that the current module is null, but as stated earlier, `clang::CreateLLVMCodeGen()` creates a module, which means that calling this directly after that function in an attempt to call `Initialize()` also just crashes, so you’d have to call `ReleaseModule()` first (which returns a raw pointer, so if you forget to delete it yourself, the module is just leaked), and *then* you can try calling `StartModule()` to have it call `Initialize()` internally. This still doesn’t change the fact that there is no way to actually set the `ASTContext*`.
- As an aside, `HandleTranslationUnit()` *does* take an `ASTContext*` parameter... but `CodeGeneratorImpl` just ignores it. And even if setting the `ASTContext*` via this function was the intent, `HandleTranslationUnit()` is (supposed to be) called *after* processing the TU. Crucially, it sets the `IRGenFinished` flag of `CodeGeneratorImpl` to `true`, which prevents it from emitting any more code. Effectively, you’d then have to create a new module to reenable codegen (but don’t forget to call `ReleaseModule()` first and delete the pointer it returns else you get a crash), so clearly, this is also not the right place to initialise the `ASTContext*`.
Assuming we even want to keep this around (it seems like a rather ancient API, and considering how broken it is, I don’t think it’s possible that anyone is actually using this unless they did some major refactoring of it), there’s a few things that need to be fixed here:
- Actually just add an `ASTContext*` parameter to `clang::CreateLLVMCodeGen()`; I don’t think there’s a good reason to delay initialising this member.
- I don’t think `Initialize()` needs to exist; it should just be part of `StartModule()`.
- The constructor of `CodeGeneratorImpl` should actually just call `StartModule()` so that the entire generator is in a usable state after calling `clang::CreateLLVMCodeGen()`.
- `ReleaseModule()` should return a `std::unique_ptr<llvm::Module>` instead of an `llvm::Module*`.
Also, these are only the issues I ran into when I tried using this earlier. There might be more that I haven’t been able to discover because it is currently impossible for this not to crash as soon as you try and emit any code with it.
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs