Issue 132337
Summary [bindings] OCaml bindings will cause use after free, in consequence of mixed manual and automatic memory management
Labels new issue
Assignees
Reporter lynzrand
## Background

In the LLVM OCaml bindings, most types (e.g. `LLVMContext`, `LLVMModule`) have manually managed lifetimes. The user must explicitly call the corresponding dispose function:

https://github.com/llvm/llvm-project/blob/7d742f97b035f8dd9adaeccb98a28d1b7586f343/llvm/bindings/ocaml/llvm/llvm_ocaml.c#L229-L240

However, the two builder types, `LLVMBuilder` and `LLVMDIBuilder`, have their lifetimes managed by the OCaml GC: they are disposed when, and only when, the OCaml value is collected. Note the finalizer callback passed into the allocator:

https://github.com/llvm/llvm-project/blob/7d742f97b035f8dd9adaeccb98a28d1b7586f343/llvm/bindings/ocaml/llvm/llvm_ocaml.c#L1972-L1991

## The problem

**As a result, a builder may unexpectedly outlive its context.** The user cannot dispose of a builder manually to prevent this, because its lifetime is managed by the GC.

If the OCaml GC collects a builder **after** its context has been manually disposed, the builder's dispose function may access data owned by the already-freed context. This is a use-after-free.

The issue is especially troublesome for `LLVMDIBuilder`, whose dispose function also calls `LLVMDIBuilderFinalize`, which **will definitely** access data in the `LLVMModule`.

https://github.com/llvm/llvm-project/blob/c2692afc0a92cd5da140dfcdfff7818a5b8ce997/llvm/bindings/ocaml/debuginfo/debuginfo_ocaml.c#L183-L208


An example program that triggers this bug may look like this:


```ocaml
let cx = Llvm.create_context () in
let m = Llvm.create_module cx "example" in
let di_builder = Llvm_debuginfo.dibuilder m in
(*
  ... 
  user generates code using LLVM
  ...
*)
Llvm_debuginfo.dibuild_finalize di_builder;
(* di_builder is *not* freed here! *)

(* user exports code... *)
Llvm.dispose_module m;
Llvm.dispose_context cx;

let something = allocate () in ...
(* now the user allocates something and triggers a GC cycle.
   GC: sweeps di_builder;
       calls finalizer;
       finalizer accesses already-freed memory.
  
   BOOM!
 *)
```
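
The same ordering can be modelled without LLVM. The following is only a sketch using the OCaml standard library: `mock_context`, `mock_dispose_context`, `mock_create_builder`, `finalize_builder` and `use_builder` are hypothetical stand-ins, not part of the bindings, and `Gc.finalise` stands in for the finalizer attached to the builder in the C stubs. The explicit `Gc.full_major ()` replaces the unpredictable moment at which a real collection would happen.

```ocaml
(* Hypothetical stand-in for a manually managed LLVM context. *)
type mock_context = { mutable alive : bool }

(* Events are recorded in the order in which they happen. *)
let events : string list ref = ref []
let record msg = events := msg :: !events

(* Manual management: the caller decides when the context dies. *)
let mock_dispose_context cx =
  cx.alive <- false;
  record "context disposed"

(* Plays the role of the builder's dispose function. It only checks a flag;
   the real C code would read memory that has already been freed. *)
let finalize_builder cx (_ : Bytes.t) =
  if cx.alive then record "builder finalized: context alive"
  else record "builder finalized: USE AFTER FREE"

(* GC management: the builder is a heap block with a finalizer attached. *)
let mock_create_builder cx =
  let b = Bytes.create 16 in
  Gc.finalise (finalize_builder cx) b;
  b

(* The builder becomes unreachable as soon as this function returns,
   but nothing forces the GC to collect it at that point. *)
let use_builder cx =
  let b = mock_create_builder cx in
  ignore (Bytes.length b)

let () =
  let cx = { alive = true } in
  use_builder cx;
  mock_dispose_context cx;   (* manual free happens first *)
  Gc.full_major ();          (* the finalizer only runs during a collection *)
  List.iter print_endline (List.rev !events)
```

Because the collection is forced only after `mock_dispose_context`, the finalizer is expected to run with the flag already cleared. In this model nothing crashes, since the finalizer merely reads a flag; in the bindings the equivalent step is where the invalid read occurs.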


The following is part of a Valgrind trace of this issue, recorded from a real program, where it eventually caused a segmentation fault.

```
==1437619== Invalid read of size 8
==1437619==    at 0x5AF1F99: llvm::MetadataTracking::track(void*, llvm::Metadata&, llvm::PointerUnion<llvm::MetadataAsValue*, llvm::Metadata*, llvm::DebugValueUser*>) (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0x59D1971: llvm::DIBuilder::finalizeSubprogram(llvm::DISubprogram*) (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0x59D1D3E: llvm::DIBuilder::finalize() (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0xD5346C: llvm_finalize_dibuilder (debuginfo_ocaml.c:184)
==1437619==    by 0xD6982D: caml_empty_minor_heap (minor_gc.c:413)
==1437619==    by 0xD69C97: caml_gc_dispatch (minor_gc.c:492)
==1437619==    by 0xD69E14: caml_alloc_small_dispatch (minor_gc.c:539)
...
==1437619==  Address 0x11d61408 is 56 bytes inside a block of size 120 free'd
==1437619==    at 0x4849424: operator delete(void*) (vg_replace_malloc.c:1131)
==1437619==    by 0x5AC94B4: llvm::LLVMContextImpl::~LLVMContextImpl() [clone .part.0] (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0x5AB7DB4: llvm::LLVMContext::~LLVMContext() (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0x59AC971: LLVMContextDispose (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0xD55AE8: llvm_dispose_context (llvm_ocaml.c:237)
==1437619==    by 0x6DB022: camlLlvm__fun_3347 (in ...)
...
==1437619==  Block was alloc'd at
==1437619==    at 0x4845F93: operator new(unsigned long) (vg_replace_malloc.c:487)
==1437619==    by 0x5AF0AE3: llvm::MDNode::operator new(unsigned long, unsigned long, llvm::Metadata::StorageType) (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0x5A17A48: llvm::DISubprogram::getImpl(llvm::LLVMContext&, llvm::Metadata*, llvm::MDString*, llvm::MDString*, llvm::Metadata*, unsigned int, llvm::Metadata*, unsigned int, llvm::Metadata*, unsigned int, int, llvm::DINode::DIFlags, llvm::DISubprogram::DISPFlags, llvm::Metadata*, llvm::Metadata*, llvm::Metadata*, llvm::Metadata*, llvm::Metadata*, llvm::Metadata*, llvm::MDString*, llvm::Metadata::StorageType, bool) (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0x59D59F2: llvm::DIBuilder::createFunction(llvm::DIScope*, llvm::StringRef, llvm::StringRef, llvm::DIFile*, unsigned int, llvm::DISubroutineType*, unsigned int, llvm::DINode::DIFlags, llvm::DISubprogram::DISPFlags, llvm::MDTupleTypedArrayWrapper<llvm::DITemplateParameter>, llvm::DISubprogram*, llvm::MDTupleTypedArrayWrapper<llvm::DIType>, llvm::MDTupleTypedArrayWrapper<llvm::DINode>, llvm::StringRef) (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0x59E52DF: LLVMDIBuilderCreateFunction (in /usr/lib/libLLVM.so.18.1)
==1437619==    by 0xD539AD: llvm_dibuild_create_function_native (debuginfo_ocaml.c:285)
==1437619==    by 0x6D754A: camlLlvm_debuginfo__fun_1435 (in ...)
...
```
