Christian Kratzer wrote:
> Hi Rick,
>
> On Mon, 12 Oct 2015, Rick Macklem wrote:
>
> > Christian Kratzer wrote:
> >> Hi Rick,
> >>
> >> there was also a second more recent crash in /var/crash
> >>
> >> Mon Oct 12 03:01:16 CEST 2015
> >>
> >> FreeBSD noc3.cksoft.de 10.2-STABLE FreeBSD 10.2-STABLE #2 r288980M: Sun
> >> Oct 11 08:37:40 CEST 2015
> >> c...@noc3.cksoft.de:/usr/obj/usr/src/sys/NOC amd64
> >>
> >> panic: Assertion mtx_unowned(m) failed at
> >> /usr/src/sys/kern/kern_mutex.c:955
> >>
> > Oops, I screwed up. I should have looked at this panic assertion when you
> > reported it before. Ok, so if I understand the assertion correctly, it
> > means that another thread has the mutex locked. If this is correct, I'll
> > have to take another look at the code and figure out how to wait for these
> > other threads to finish with the mutexes.
> >
> > I do think the patch fixes the race I saw, but there must be other races in
> > the code.
> >
> > I'll take another look, but if anyone else is conversant with netsmb, feel
> > free to jump in, because it is all new to me.
> >
> > Unfortunately, I won't have any way to do testing for the next month or so,
> > so any patches I do come up with will be "try this untested..".
>
> that's no problem.
>
> Just keep the patches coming when you have time and tell me when to reset
> back to stable, current or whatever so we don't lose sync of the status.
>
Well, you can try the attached one instead of the previous ones (i.e. against
stable). It just delays destroying the mutexes until the iod thread is exiting.
I can't quite see why the previous patches wouldn't fix it, but this one leaves
smb_iod_main() unchanged, so it is a simpler patch and doesn't affect semantics
except for a slight delay in destroying the mutexes.

> As it looks like the race happens on unmount I could try putting a sleep 60
> into the script that does the "mount && rsync && umount" magic just before
> the umount. That would allow anything that is slow to go away to perhaps
> release the mutexes before the umount.
>
If it still crashes with this patch, it might be worth a try.
Or, if this patch still crashes, you could just delete the 3 lines that the
patch moves, so the mutexes are never destroyed. This would result in a leak,
but it would tell us if destroying these mutexes is the problem.

Thanks for your willingness to test these, rick

> Not a real fix of course but might help to verify what's going on.
>
> Greetings
> Christian
>
>
> --
> Christian Kratzer                   CK Software GmbH
> Email:   c...@cksoft.de             Wildberger Weg 24/2
> Phone:   +49 7032 893 997 - 0       D-71126 Gaeufelden
> Fax:     +49 7032 893 997 - 9       HRB 245288, Amtsgericht Stuttgart
> Mobile:  +49 171 1947 843           Geschaeftsfuehrer: Christian Kratzer
> Web:     http://www.cksoft.de/
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
>
--- smb_iod.c.orig	2015-10-10 18:53:34.000000000 -0400
+++ smb_iod.c	2015-10-12 20:30:00.000000000 -0400
@@ -659,6 +659,11 @@ smb_iod_thread(void *arg)
 			break;
 		tsleep(&iod->iod_flags, PWAIT, "90idle", iod->iod_sleeptimo);
 	}
+
+	/* We can now safely destroy the mutexes and free the iod structure. */
+	smb_sl_destroy(&iod->iod_rqlock);
+	smb_sl_destroy(&iod->iod_evlock);
+	free(iod, M_SMBIOD);
 	mtx_unlock(&Giant);
 	kproc_exit(0);
 }
@@ -695,9 +700,6 @@ int
 smb_iod_destroy(struct smbiod *iod)
 {
 	smb_iod_request(iod, SMBIOD_EV_SHUTDOWN | SMBIOD_EV_SYNC, NULL);
-	smb_sl_destroy(&iod->iod_rqlock);
-	smb_sl_destroy(&iod->iod_evlock);
-	free(iod, M_SMBIOD);
 	return 0;
 }
 
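For anyone following along who doesn't know netsmb, the idea behind the patch can be sketched in userspace with POSIX threads. This is only an illustration, not the kernel code: struct worker, worker_main(), worker_request_shutdown() and run_shutdown_demo() are made-up names, and pthread mutexes stand in for the smb_sl locks. The point it tries to show is that the thread asking for shutdown never destroys the locks or frees the structure; the worker does that itself, once it is past its last use of them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct smbiod; every name here is illustrative. */
struct worker {
	pthread_mutex_t	lock;		/* plays the role of iod_rqlock/iod_evlock */
	pthread_cond_t	wakeup;		/* used to deliver the shutdown request */
	bool		shutdown;	/* protected by lock */
	int		*teardown_done;	/* set to 1 by the worker after cleanup */
};

static void *
worker_main(void *arg)
{
	struct worker *w = arg;
	int *done = w->teardown_done;	/* copy out before w is freed */

	pthread_mutex_lock(&w->lock);
	while (!w->shutdown)
		pthread_cond_wait(&w->wakeup, &w->lock);
	pthread_mutex_unlock(&w->lock);

	/*
	 * The worker is the last user of the locks, so it is the one
	 * that destroys them and frees the structure.
	 */
	pthread_cond_destroy(&w->wakeup);
	pthread_mutex_destroy(&w->lock);
	free(w);
	*done = 1;
	return (NULL);
}

/* Analogue of the patched smb_iod_destroy(): request shutdown, nothing else. */
static void
worker_request_shutdown(struct worker *w)
{
	pthread_mutex_lock(&w->lock);
	w->shutdown = true;
	pthread_cond_signal(&w->wakeup);
	pthread_mutex_unlock(&w->lock);
	/* No destroy or free here; w must not be touched after this point. */
}

/* Returns 1 if the worker tore itself down, -1 on setup failure. */
int
run_shutdown_demo(void)
{
	struct worker *w;
	pthread_t tid;
	int done = 0, rc;

	w = malloc(sizeof(*w));
	if (w == NULL)
		return (-1);
	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->wakeup, NULL);
	w->shutdown = false;
	w->teardown_done = &done;

	if (pthread_create(&tid, NULL, worker_main, w) != 0) {
		pthread_cond_destroy(&w->wakeup);
		pthread_mutex_destroy(&w->lock);
		free(w);
		return (-1);
	}

	worker_request_shutdown(w);

	/* Joining is only for the demo, so the result can be read safely. */
	rc = pthread_join(tid, NULL);
	assert(rc == 0);
	(void)rc;
	return (done);
}
```

If the three cleanup calls were moved from worker_main() back into worker_request_shutdown(), the requester could destroy a lock the worker still holds or is about to take, which is roughly the situation the mtx_unowned(m) assertion is complaining about.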