On 3/2/26 04:53, Florian Weimer wrote:
> * Demi Marie Obenour:
>
>> On 2/27/26 14:39, Florian Weimer wrote:
>>> * Alan Coopersmith:
>>>
>>>> https://sympa.inria.fr/sympa/arc/ocsf-ocaml-security-announcements/2026-02/msg00000.html
>>>> announces:
>>>>> From: Hannes Mehnert <[email protected]>
>>>>> To: [email protected]
>>>>> Subject: [ocsf-ocaml-security-announcements] OSEC-2026-01 in the OCaml
>>>>> runtime: Buffer Over-Read in OCaml Marshal Deserialization
>>>>> Date: Tue, 17 Feb 2026 15:16:54 +0100
>>>>> Dear everyone,
>>>>> it is my pleasure to announce the first security announcement of
>>>>> this year,
>>>>> and the first on this mailing list.
>>>>> It should any moment now also appear at
>>>>> https://osv.dev/list?q=OSEC-2026-01
>>>>> Human link:
>>>>> https://github.com/ocaml/security-advisories/tree/main/advisories/2026/OSEC-2026-01.md
>>>
>>> Surprised to read this. I think this comment from 2018 is still
>>> appropriate:
>>>
>>> | Marshal should not used in contexts where an attacker can control the
>>> | data. I don't believe it is, at least in any project I'm aware of, and
>>> | if it were, it's unlikely that those project perform enough check on
>>> | the result of Marshal to make the use safe anyway.
>>>
>>> <https://github.com/ocaml/ocaml/issues/7765#issuecomment-473076288>
>>>
>>> The demarshaller does not have access to type information from the
>>> program, so it has the ability to construct an arbitrary object graph.
>>
>> That is indeed true. However, unlike in many other languages, this
>> does not directly allow arbitrary code execution.
>
> Not really.
>
> This code
>
> type x = A of int | B of int | C of int | D of int | E of int
> let f x fA fB fC fD fE =
>   match x with
>   | A a -> fA a
>   | B b -> fB b
>   | C c -> fC c
>   | D d -> fD d
>   | E e -> fE e
>
> gets compiled to:
>
> 0000000000000000 <camlBlah.f_5>:
>    0:   55                      push   %rbp
>    1:   48 89 e5                mov    %rsp,%rbp
>    4:   49 89 c0                mov    %rax,%r8
>    7:   49 89 d1                mov    %rdx,%r9
>    a:   4d 3b 3e                cmp    (%r14),%r15
>    d:   76 51                   jbe    60 <camlBlah.f_5+0x60>
>    f:   49 0f b6 40 f8          movzbq -0x8(%r8),%rax
>   14:   48 8d 15 00 00 00 00    lea    0x0(%rip),%rdx   # 1b <camlBlah.f_5+0x1b>
>                         17: R_X86_64_PC32       .rodata-0x4
>   1b:   48 63 04 82             movslq (%rdx,%rax,4),%rax
>   1f:   48 01 c2                add    %rax,%rdx
>   22:   ff e2                   jmp    *%rdx
>   24:   49 8b 00                mov    (%r8),%rax
>   27:   48 8b 3b                mov    (%rbx),%rdi
>   2a:   5d                      pop    %rbp
>   2b:   ff e7                   jmp    *%rdi
>   2d:   0f 1f 00                nopl   (%rax)
>   30:   49 8b 00                mov    (%r8),%rax
>   33:   48 8b 37                mov    (%rdi),%rsi
>   36:   48 89 fb                mov    %rdi,%rbx
>   39:   5d                      pop    %rbp
>   3a:   ff e6                   jmp    *%rsi
>   3c:   49 8b 00                mov    (%r8),%rax
>   3f:   48 8b 3e                mov    (%rsi),%rdi
>   42:   48 89 f3                mov    %rsi,%rbx
>   45:   5d                      pop    %rbp
>   46:   ff e7                   jmp    *%rdi
>   48:   49 8b 00                mov    (%r8),%rax
>   4b:   49 8b 39                mov    (%r9),%rdi
>   4e:   4c 89 cb                mov    %r9,%rbx
>   51:   5d                      pop    %rbp
>   52:   ff e7                   jmp    *%rdi
>   54:   49 8b 00                mov    (%r8),%rax
>   57:   48 8b 39                mov    (%rcx),%rdi
>   5a:   48 89 cb                mov    %rcx,%rbx
>   5d:   5d                      pop    %rbp
>   5e:   ff e7                   jmp    *%rdi
>   60:   e8 00 00 00 00          call   65 <camlBlah.f_5+0x65>
>                         61: R_X86_64_PLT32      caml_call_gc-0x4
>   65:   eb a8                   jmp    f <camlBlah.f_5+0xf>
>   67:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
>   6e:   00 00
>
> Add offset 0x1b, there's the tag load, and this tag is used to index a
> jump table without a bounds check.
>
> Admittedly, This does not give full control over program execution
> directly. One would have to search for a suitable gadget. There are
> likely better ways to exploit unsafe demarshalling, this is just the
> first approach I could think of.
>
> Thanks,
> Florian
>
One can use C code or the Obj module to validate the value before
using it. In this case, one could use Obj.is_int to check that this
is a boxed value (that is, not an immediate integer), Obj.tag to check
that its tag is in the correct range, Obj.size to check that the length
is correct, and Obj.field and Obj.is_int to validate the constructor
payloads. Only after validation is complete would one call Obj.obj to
cast the validated value to the intended type.

It's also possible to unmarshal to an opaque type and provide safe
functions for traversing the object graph.

This is unlike Java, Ruby, or Python, where arbitrary code can be
executed during the unmarshalling process itself.

I do agree that whether existing code does this is questionable, but
it is definitely *possible* to do correctly.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
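
The validation steps described in the reply above can be sketched as
follows for the quoted type x. This is a minimal sketch, not code from
the thread: the names x_of_obj and decode are hypothetical, and it
assumes the usual native representation in which the five non-constant
constructors A..E are one-field blocks with tags 0..4. It uses Obj.size,
the standard library function that returns a block's length. It only
checks the shape of the value after unmarshalling; it does nothing about
bugs inside the demarshaller itself.

```ocaml
type x = A of int | B of int | C of int | D of int | E of int

(* Hypothetical helper: check that an untyped value has the runtime shape
   of a value of type [x] and only then cast it with [Obj.obj].
   Assumes constructors A..E are blocks with tags 0..4 and one field. *)
let x_of_obj (o : Obj.t) : x option =
  if Obj.is_block o                  (* boxed, i.e. not an immediate int *)
     && Obj.tag o >= 0
     && Obj.tag o <= 4               (* tag must name one of A..E *)
     && Obj.size o = 1               (* exactly one constructor argument *)
     && Obj.is_int (Obj.field o 0)   (* the payload must be an immediate int *)
  then Some (Obj.obj o)
  else None

(* Hypothetical wrapper: unmarshal to the opaque type [Obj.t] first, so no
   typed code touches the value before it has been validated. Any exception
   raised by the demarshaller on malformed input is mapped to [None]. *)
let decode (s : string) : x option =
  match (Marshal.from_string s 0 : Obj.t) with
  | o -> x_of_obj o
  | exception _ -> None
```

With this shape, a function such as the quoted f only ever receives
values whose tag is within the range of its jump table; everything else
is rejected before the cast.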
