Re: amd64/168342: mbuf exhaustion hangs all daemons in keglimit state
The following reply was made to PR kern/168342; it has been noted by GNATS.

From: John Baldwin
To: freebsd-am...@freebsd.org, Ziyan Maraikar
Cc: freebsd-gnats-sub...@freebsd.org, Darshana Jayasinghe
Subject: Re: amd64/168342: mbuf exhaustion hangs all daemons in keglimit state
Date: Tue, 29 May 2012 08:12:40 -0400

On Friday, May 25, 2012 4:34:20 pm Ziyan Maraikar wrote:
>
> >Number: 168342
> >Category: amd64
> >Synopsis: mbuf exhaustion hangs all daemons in keglimit state
> >Confidential: no
> >Severity: serious
> >Priority: medium
> >Responsible: freebsd-amd64
> >State: open
> >Quarter:
> >Keywords:
> >Date-Required:
> >Class: sw-bug
> >Submitter-Id: current-users
> >Arrival-Date: Fri May 25 20:40:01 UTC 2012
> >Closed-Date:
> >Last-Modified:
> >Originator: Ziyan Maraikar
> >Release: FreeBSD 9.0-RELEASE amd64
> >Organization:
> Department of computer engineering, University of Peradeniya
> >Environment:
> System: FreeBSD nanuoya.pdn.ac.lk 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Tue Jan 3 07:46:30 UTC 2012 r...@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
> HP Proliant DL165 4-core, 8G RAM
> 4x igb NICs -- 1 interface assigned 6 IPv4 aliases.
> 3x 1TB SATA zfs RAID-Z pool (zfs boot)
>
> >Description:
> This machine has been running DHCP, BIND, NFS, and openldap, serving a lab
> of about 40 machines. The machine recently began to experience very frequent
> lockups in all network services, including ssh. The services all hang in
> state keglimit, even under very light load. I have tried disabling TSO and
> hardware checksum on igb as suggested in related mailing list posts, but it
> has no effect.
>
> >How-To-Repeat:
> Several ssh attempts after boot are enough to make all daemons hang in keglimit.
> # netstat -m
> 25034/1602/26636 mbufs in use (current/cache/total)
> 24892/708/25600/25600 mbuf clusters in use (current/cache/total/max)
> 24642/708 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/9/9/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
> 56053K/1852K/57905K bytes allocated to network (current/cache/total)
> 0/1697/1209 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/0/0 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 0 requests for I/O initiated by sendfile
> 0 calls to protocol drain routines

Have you tried increasing kern.ipc.nmbclusters? Alternatively, have you tried
restricting igb to only using 1 queue? It sounds like all your igb interfaces
are allocating all of your mbuf clusters for their receive rings.

--
John Baldwin
_______________________________________________
freebsd-bugs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"
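The quoted netstat -m output already shows the symptom John describes: on the "mbuf clusters in use" line the total (25600) equals the max (25600), and 1697 cluster requests were denied. As a minimal sketch of how that check could be automated, the C fragment below parses that one line and reports whether the zone has reached its limit. The struct and function names (cluster_stats, parse_cluster_line, clusters_at_limit) are hypothetical helpers invented for illustration, not part of netstat or the kernel, and the sketch assumes the line format shown in the PR.

```c
#include <assert.h>
#include <stdio.h>

/* Counters from the "mbuf clusters in use" line of netstat -m. */
struct cluster_stats {
	long current;	/* clusters currently handed out */
	long cache;	/* free clusters held in the zone cache */
	long total;	/* current + cache */
	long max;	/* zone limit, as printed by netstat -m */
};

/*
 * Parse a line such as
 *   "24892/708/25600/25600 mbuf clusters in use (current/cache/total/max)"
 * Returns 0 on success, -1 if the four counters cannot be read.
 */
static int
parse_cluster_line(const char *line, struct cluster_stats *cs)
{
	assert(line != NULL && cs != NULL);
	if (sscanf(line, "%ld/%ld/%ld/%ld mbuf clusters in use",
	    &cs->current, &cs->cache, &cs->total, &cs->max) != 4)
		return (-1);
	return (0);
}

/* Nonzero when the zone has grown to its limit. */
static int
clusters_at_limit(const struct cluster_stats *cs)
{
	return (cs->max > 0 && cs->total >= cs->max);
}
```

Fed the line from the report, parse_cluster_line() fills in 24892/708/25600/25600 and clusters_at_limit() returns nonzero, which matches the exhaustion described in the PR.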
Re: amd64/168342: mbuf exhaustion hangs all daemons in keglimit state
The following reply was made to PR kern/168342; it has been noted by GNATS.

From: Ziyan Maraikar
To: John Baldwin
Cc: freebsd-am...@freebsd.org, freebsd-gnats-sub...@freebsd.org, Darshana Jayasinghe
Subject: Re: amd64/168342: mbuf exhaustion hangs all daemons in keglimit state
Date: Tue, 29 May 2012 22:14:02 +0530

Hello John,
Thanks for the response.

>
> Have you tried increasing kern.ipc.nmbclusters? Alternatively, have you tried
> restricting igb to only using 1 queue? It sounds like all your igb interfaces
> are allocating all of your mbuf clusters for their receive rings.
>

I found this very suggestion in several mailing list discussions [1] and
set these values on Saturday.

kern.ipc.nmbclusters="131072"
hw.igb.num_queues="2"

So far everything seems to be back to normal, and netstat -m shows plenty
of headroom now.

The problem cropped up after running several months on 9.0-RELEASE when
I brought up another interface. Disabling the new interface didn't
restore normal operation, however. I also tried 8.3-RELEASE but the
problem was worse on it.

[1] http://osdir.com/ml/freebsd-stable/2012-02/msg00563.html
__
Regards
Ziyan.
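To make the effect of the two tunables concrete, here is a rough back-of-the-envelope sketch in C. It assumes that the driver keeps one mbuf cluster attached to every receive descriptor and that each ring has 1024 descriptors; both are assumptions made for illustration and are not stated in the thread. The function name rx_ring_clusters is hypothetical.

```c
#include <assert.h>

/*
 * Estimate how many mbuf clusters the receive rings pin down,
 * assuming one cluster per receive descriptor (an assumption,
 * not something stated in the thread).
 */
static long
rx_ring_clusters(long nics, long queues_per_nic, long rx_descriptors)
{
	assert(nics >= 0 && queues_per_nic >= 0 && rx_descriptors >= 0);
	return (nics * queues_per_nic * rx_descriptors);
}
```

Under those assumptions, rx_ring_clusters(4, 2, 1024) gives 8192 clusters for the four igb NICs with hw.igb.num_queues="2", a small fraction of the 131072 allowed by the new kern.ipc.nmbclusters value. With four queues per NIC the same estimate is 16384, which is already well over half of the old 25600 limit.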
Re: bin/165207: commit references a PR
The following reply was made to PR bin/165207; it has been noted by GNATS.

From: dfil...@freebsd.org (dfilter service)
To: bug-follo...@freebsd.org
Cc:
Subject: Re: bin/165207: commit references a PR
Date: Tue, 29 May 2012 19:47:16 +0000 (UTC)

Author: trasz
Date: Tue May 29 19:47:06 2012
New Revision: 236253
URL: http://svn.freebsd.org/changeset/base/236253

Log:
  MFC r232598:

  Make racct and rctl correctly handle jail renaming. Previously
  they would continue using old name, the one jail was created with.

  PR: bin/165207

  MFC r235795:

  Don't leak locks in prison_racct_modify().

  MFC r235803:

  Fix use-after-free in kern_jail_set() triggered e.g. by attempts
  to clear "persist" flag from empty persistent jail, like this:

  jail -c persist=1
  jail -n 1 -m persist=0

Modified:
  stable/9/sys/kern/kern_jail.c
  stable/9/sys/kern/kern_racct.c
  stable/9/sys/sys/racct.h
Directory Properties:
  stable/9/sys/ (props changed)

Modified: stable/9/sys/kern/kern_jail.c
==============================================================================
--- stable/9/sys/kern/kern_jail.c	Tue May 29 19:46:42 2012	(r236252)
+++ stable/9/sys/kern/kern_jail.c	Tue May 29 19:47:06 2012	(r236253)
@@ -130,6 +130,7 @@ static char *prison_path(struct prison *
 static void prison_remove_one(struct prison *pr);
 #ifdef RACCT
 static void prison_racct_attach(struct prison *pr);
+static void prison_racct_modify(struct prison *pr);
 static void prison_racct_detach(struct prison *pr);
 #endif
 #ifdef INET
@@ -1810,6 +1811,16 @@ kern_jail_set(struct thread *td, struct
 		}
 	}
 
+#ifdef RACCT
+	if (!created) {
+		sx_sunlock(&allprison_lock);
+		prison_racct_modify(pr);
+		sx_slock(&allprison_lock);
+	}
+#endif
+
+	td->td_retval[0] = pr->pr_id;
+
 	/*
 	 * Now that it is all there, drop the temporary reference from existing
 	 * prisons. Or add a reference to newly created persistent prisons
@@ -1830,7 +1841,7 @@ kern_jail_set(struct thread *td, struct
 		if (!(flags & JAIL_ATTACH))
 			sx_sunlock(&allprison_lock);
 	}
-	td->td_retval[0] = pr->pr_id;
+
 	goto done_errmsg;
 
  done_deref_locked:
@@ -4427,24 +4438,32 @@ prison_racct_hold(struct prison_racct *p
 	refcount_acquire(&prr->prr_refcount);
 }
 
+static void
+prison_racct_free_locked(struct prison_racct *prr)
+{
+
+	sx_assert(&allprison_lock, SA_XLOCKED);
+
+	if (refcount_release(&prr->prr_refcount)) {
+		racct_destroy(&prr->prr_racct);
+		LIST_REMOVE(prr, prr_next);
+		free(prr, M_PRISON_RACCT);
+	}
+}
+
 void
 prison_racct_free(struct prison_racct *prr)
 {
 	int old;
 
+	sx_assert(&allprison_lock, SA_UNLOCKED);
+
 	old = prr->prr_refcount;
 	if (old > 1 && atomic_cmpset_int(&prr->prr_refcount, old, old - 1))
 		return;
 
 	sx_xlock(&allprison_lock);
-	if (refcount_release(&prr->prr_refcount)) {
-		racct_destroy(&prr->prr_racct);
-		LIST_REMOVE(prr, prr_next);
-		sx_xunlock(&allprison_lock);
-		free(prr, M_PRISON_RACCT);
-
-		return;
-	}
+	prison_racct_free_locked(prr);
 	sx_xunlock(&allprison_lock);
 }
 
@@ -4454,15 +4473,66 @@ prison_racct_attach(struct prison *pr)
 {
 	struct prison_racct *prr;
 
+	sx_assert(&allprison_lock, SA_XLOCKED);
+
 	prr = prison_racct_find_locked(pr->pr_name);
 	KASSERT(prr != NULL, ("cannot find prison_racct"));
 
 	pr->pr_prison_racct = prr;
 }
 
+/*
+ * Handle jail renaming. From the racct point of view, renaming means
+ * moving from one prison_racct to another.
+ */
+static void
+prison_racct_modify(struct prison *pr)
+{
+	struct proc *p;
+	struct ucred *cred;
+	struct prison_racct *oldprr;
+
+	sx_slock(&allproc_lock);
+	sx_xlock(&allprison_lock);
+
+	if (strcmp(pr->pr_name, pr->pr_prison_racct->prr_name) == 0) {
+		sx_xunlock(&allprison_lock);
+		sx_sunlock(&allproc_lock);
+		return;
+	}
+
+	oldprr = pr->pr_prison_racct;
+	pr->pr_prison_racct = NULL;
+
+	prison_racct_attach(pr);
+
+	/*
+	 * Move resource utilisation records.
+	 */
+	racct_move(pr->pr_prison_racct->prr_racct, oldprr->prr_racct);
+
+	/*
+	 * Force rctl to reattach rules to processes.
+	 */
+	FOREACH_PROC_IN_SYSTEM(p) {
+		PROC_LOCK(p);
+		cred = crhold(p->p_ucred);
+		PROC_UNLOCK(p);
+		racct_proc_ucred_changed(p, cred, cred);
+		crfree(cred);
+	}
+
+	sx_sunlock(&allproc_lock);
+	prison_racct_free_locked(oldprr);
+	sx_xunloc