On 5/31/23 20:29, Jürgen Hötzel wrote:
> 
> "Richard W.M. Jones" <rjo...@redhat.com> writes:
> 
>> On Sat, May 27, 2023 at 03:35:38PM +0200, Jürgen Hötzel wrote:
>>> Fixes deadlocks on OCaml5 when trying to get the lock that is already
>>> held:
>>>
>>> Fatal error during lock: Resource deadlock avoided
>>> ---
>>>  ocaml/guestfs-c.c | 8 ++++++--
>>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/ocaml/guestfs-c.c b/ocaml/guestfs-c.c
>>> index 3888c9456..bcf8e6ab3 100644
>>> --- a/ocaml/guestfs-c.c
>>> +++ b/ocaml/guestfs-c.c
>>> @@ -395,12 +395,16 @@ event_callback_wrapper (guestfs_h *g,
>>>    /* Ensure we are holding the GC lock before any GC operations are
>>>     * possible. (RHBZ#725824)
>>>     */
>>> -  caml_leave_blocking_section ();
>>> +  bool in_blocking_section = (caml_state == NULL);
>>> +
>>> +  if (in_blocking_section)
>>> +    caml_leave_blocking_section ();
>>>  
>>>    event_callback_wrapper_locked (g, data, event, event_handle, flags,
>>>                                   buf, buf_len, array, array_len);
>>>  
>>> -  caml_enter_blocking_section ();
>>> +  if (in_blocking_section)
>>> +    caml_enter_blocking_section ();
>>>  }
>>
>> I don't understand the reason why this patch is needed.
> 
> the event_callback_wrapper is ALSO called from libguestfs code that already
> holds the OCaml domain lock. (when blocking = false).

Where does that happen? I don't see it.

Either way, wherever it happens: that particular call site should be
modified to call event_callback_wrapper_locked() directly, IMO.
event_callback_wrapper() does nothing other than bracket
event_callback_wrapper_locked() with leave+enter, so if another call
site needs everything that event_callback_wrapper() does *except*
leave+enter (because it already holds the OCaml domain lock), then that
site should just directly call event_callback_wrapper_locked(). For this
purpose, event_callback_wrapper_locked() may have to be made a public
function (but again I don't know where that new call is supposed to come
from).

Laszlo

> 
> My Patch is just a (confusing) workaround to fix this issue at runtime.
> 
> An better solution might be using different callbacks at compile time:
> 
> event_callback_wrapper_locked (when called from a non-blocking function)
> event_callback_wrapper (when called from a blocking function).
> 
> This problem also occurs only with OCaml 5
> 
> Minimal invalid OCAML/C Code:
> 
> ============================== stubs.c ===============================
> #include <caml/alloc.h>
> #include <caml/custom.h>
> #include <caml/memory.h>
> #include <caml/mlvalues.h>
> 
> void dummy_leave_blocking() {
>   CAMLparam0();
>   /* we did not call  caml_enter_blocking_section_before in C */
>   caml_leave_blocking_section();
>   caml_enter_blocking_section();
>   CAMLreturn0;
> }
> ============================== main.ml ===============================
> external dummy_leave_blocking : unit -> unit = "dummy_leave_blocking"
> let () =
>   Printf.printf "Hello from OCaml %s\n%!" Sys.ocaml_version;
>   dummy_leave_blocking();
>   Printf.printf "Bye from OCaml %s\n%!" Sys.ocaml_version;
> ======================================================================
> 
> 
> Using Ocaml 5 results in crash:
> 
> # dune exec --  bin/main.exe
> Hello from OCaml 5.0.0
> Fatal error: Fatal error during lock: Resource deadlock avoided
> 
> Aborted (core dumped)
> 
> VS OCaml 4:
> # dune exec --  bin/main.exe
> Hello from OCaml 4.11.1
> Bye from OCaml 4.11.1
> 
> Jürgen
> 
> 
> 
> 
> _______________________________________________
> Libguestfs mailing list
> Libguestfs@redhat.com
> https://listman.redhat.com/mailman/listinfo/libguestfs

_______________________________________________
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs

Reply via email to