On Fri, 2021-03-26 at 18:05 -0400, Stephen Frost wrote:
> * Jacob Champion (pchamp...@vmware.com) wrote:
> > Yeah. I was hoping to avoid implementing our own locks and refcounts,
> > but it seems like it's going to be required.
>
> Yeah, afraid so.

I think it gets worse, after having debugged some confusing crashes.
There's already been a discussion on PR_Init upthread a bit:

> Once we settle on a version we can confirm if PR_Init is/isn't needed and
> remove all traces of it if not.

What the NSPR documentation omits is that implicit initialization is
not threadsafe. So NSS_InitContext() is technically "threadsafe"
because it's built on PR_CallOnce(), but if you haven't called
PR_Init() yet, multiple simultaneous PR_CallOnce() calls can crash into
each other.

So, fine. We just add our own locks around NSS_InitContext() (or around
a single call to PR_Init()). Well, the first thread to win and
successfully initialize NSPR gets marked as the "primordial" thread
using thread-local state. And it gets a pthread destructor that does...
something. So lazy initialization seems a bit dangerous regardless of
whether or not we add locks, but I can't really prove whether it's
dangerous or not in practice.

I do know that only the primordial thread is allowed to call
PR_Cleanup(), and of course we wouldn't be able to control which thread
does what for libpq clients. I don't know what other assumptions are
made about the primordial thread, or if there are any platform-specific
behaviors with older versions of NSPR that we'd need to worry about. It
used to be that the primordial thread was not allowed to exit before
any other threads, but that restriction was lifted at some point [1].

I think we're going to need some analogue to PQinitOpenSSL() to help
client applications cut through the mess, but I'm not sure what it
should look like, or how we would maintain any sort of API
compatibility between the two flavors. And does libpq already have some
notion of a "main thread" that I'm missing?

--Jacob

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=294955
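
For illustration only, here is a minimal sketch of the "add our own locks"
idea, assuming POSIX threads are available. Both names are hypothetical:
backend_init() is a stand-in for the single PR_Init() (or
NSS_InitContext()) call, and ensure_backend_initialized() is not an
existing libpq function. The sketch only serializes initialization; it
does nothing about the primordial-thread issue, since whichever thread
arrives first still ends up running the real initialization.

```c
#include <pthread.h>

/* Counts how often the one-time setup ran; only here to make the sketch observable. */
static int init_calls = 0;

/*
 * Hypothetical stand-in for the real one-time setup. In an actual NSS
 * build this is where the single PR_Init() call would go.
 */
static void
backend_init(void)
{
    init_calls++;
}

static pthread_once_t init_once = PTHREAD_ONCE_INIT;

/*
 * Hypothetical helper: safe to call from any thread, any number of
 * times. pthread_once() guarantees backend_init() runs exactly once and
 * that concurrent callers block until it has finished.
 * Returns 0 on success, or an error number from pthread_once().
 */
static int
ensure_backend_initialized(void)
{
    return pthread_once(&init_once, backend_init);
}
```

Every connection attempt could call ensure_backend_initialized() before
touching the library. Whether that is enough in practice depends on the
open questions above about what NSPR assumes of the initializing thread.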