> Date: Fri, 9 Aug 2024 16:55:30 +0000 > From: Emmanuel Dreyfus <m...@netbsd.org> > > I took more time on fss.c. I still have some trouble to grasp the > locking protocol in this file,
I don't know what the locking protocol is, but any changes to it absolutely must be accompanied by comments identifying the lock order and, if applicable, any rules like `forbidden in softint'. > but I have a proposal to address the > most offending issue: snapshot creation are serialized and the driver > locks fss_device_lock while they occur. On a big filesystem, > the operation can takes dozens of seconds while fssconfig(1) > commands get stuck in the kernel awaiting for fss_device_lock. I don't understand, how did you come to this conclusion? Snapshot creation currently works like so: mutex_enter(&fss_device_lock); while (fss_creating) { /* wait for signal on fss_device_cv */ } fss_creating = true; mutex_exit(&fss_device_lock); error = fss_create_snapshot(sc, fss, l); ... mutex_enter(&fss_device_lock); fss_creating = false; cv_broadcast(&fss_device_cv); mutex_exit(&fss_device_lock); As far as I can tell, nothing holds fss_device_lock across snapshot creation or disk I/O. fss_device_lock is held across fss_open/close to serialize config_attach_pseudo and config_detach, which is probably necessary because autoconf's pseudo-device attach protocol is still essentially broken from what I recall last time I looked into this while addressing open/close/detach races for non-pseudo (real?) devices. And it's held in fss_unmount_hook to iterate over fss(4) devices when forced by unmount. fss_device_lock is also held to serialize access to the fss_creating lock and wakeups with fss_device_cv. But I don't see any evidence that it's held across snapshot creation or disk I/O. Can you share ps and bt output in crash(8) for the hangs you've seen? You're sure they're waiting for fss_device_lock and not for snapwait? It would be nice if (a) snapshot creation for independent file systems could be done in parallel, and if (b) autoconf pseudo-device attach weren't bonkers. But I don't think running fss_snapshot_create in a separate thread will address either of those, so I'm not sure what this will address. > In the attached patch, I added a thread dedicated to snapshot > creation. This threads takes snapshot creation requests and > execute them one by one. The caller does not have to wait on > fss_device_lock. What is the advantage of this? How does a caller know when the snapshot creation has completed this way, so it's ready to be mounted? You'll need some way to notify the caller. But since snapshot creation is still serialized in your patch, I don't see why running it in a separate thread is an advantage over just doing it synchronously. > While there, I added a few mutex_enter/mutex_exit for sc->sc_slock > where it looked inconsistent with other usages. I tried to avoid > holding the lock on long operation. One question pending: it is > okay to hold the lock during a copyin? If it's a spin lock, or if it's ever taken in softint/callout context or taken while holding a lock transitively like that, absolutely not OK to hold during copyin. If it's an adaptive lock, better not to -- better to copy into a temporary stack buffer. I'm not 100% sure it will create deadlock possibilities but better not to have to prove it won't.