On 08/08/2017 09:49, Peter Xu wrote: > On Tue, Aug 08, 2017 at 09:26:43AM +0200, Paolo Bonzini wrote: >> On 08/08/2017 09:00, Peter Xu wrote: >>> We were calling rcu_init_complete() twice in the child processes when >>> fork happened. However the pthread library does not really suggest to do >>> it that way: >>> >>> http://pubs.opengroup.org/onlinepubs/7908799/xsh/pthread_mutex_init.html >>> >>> "Attempting to initialise an already initialised mutex results in >>> undefined behaviour." >>> >>> Actually, IMHO we can do it in a more natural way: Firstly, we only init >>> the RCU globals once in rcu_init(). Then, in rcu_init_child(), we unlock >>> all the locks held in rcu_init_lock() just like what we do in the parent >>> process, then do the rest of RCU re-init (e.g., create the RCU thread). >> >> This doesn't work for error-checking mutexes: rcu_init_child has a >> different PID than the parent, so the mutexes aren't unlocked. It's >> also true that right now we don't use error-checking mutexes (commit >> 24fa90499f, "qemu-thread: do not use PTHREAD_MUTEX_ERRORCHECK", >> 2015-03-10); however, that's also a bit sad. >> >> The reason for the undefined behavior is probably that some operating >> systems allocate memory in pthread_mutex_init, and initializing twice >> causes a memory leak. One such operating system is OpenBSD. :( > > Good to know. :) > > I thought pthread_atfork() was designed to solve such a locking > problem (in child hanlder, we unlock all the held locks). If > PTHREAD_MUTEX_ERRORCHECK cannot coop well with it, not sure whether > that means we should just avoid using PTHREAD_MUTEX_ERRORCHECK in such > a use case (but we should be able to use the error checks in other > mutexes that do not need extra fork handling)? > > Another idea is: can we just destroy the mutex first then re-init it > in subprocess? A quick glance in libpthread code shows that at least > pthread_mutex_destroy() won't check PTHREAD_MUTEX_ERRORCHECK.
Destroying a mutex that is locked is also undefined behavior, even if it has no waiters. So it's a nice catch-22: we cannot destroy before unlocking, and we cannot unlock an error-checking mutex, so we cannot destroy before init. Actually, unlocking in atfork's child callback is interesting even for non-error-checking mutexes. If there are waiters, waking up the first one in pthread_mutex_unlock fails. This is okay if (as on Linux) pthread_mutex_lock does none of the woken-up process's work. However, on OpenBSD pthread_mutex_lock sets mutex->owner... but in the child there's no other thread to wake up and thus no one will set mutex->owner to NULL. This should lead to deadlock scenarios with pthread_atfork. So pthread_atfork is impossible to use correctly with mutexes, and reinitializing is the best we can do (the memory leak being a lesser evil). The workaround would be to replace mutexes with semaphores, and to implement QemuEvent natively for OpenBSD using its __thrsleep and __thrwakeup system calls (no, I am not going to do that :)). Let's wait and hear what Eric has to say. Paolo > Thanks, > >> >> Eric, you chimed in on the patch that became commit 24fa90499f, what do >> you suggest? >> >> Paolo >