On Monday 11 May 2009 2:27:48 pm j...@0xabadba.be wrote: > John, > > Thank you for your input on this matter, I'm excited to write > some software for this project since its given me great code to learn > from as i've grown up (still a kid though :). My questions are a bit > more detailed below. > > On Mon, May 11, 2009 at 12:24 PM, John Baldwin <j...@freebsd.org> wrote: > > On Friday 08 May 2009 5:41:17 pm Ed Schouten wrote: > >> A solution would be to solve it as follows: > >> > >> - Use a semaphore, initialized to some insane high value to put an upper > >> limit on the amount of concurrent sysctl calls. I'm not sure whether > >> this is really needed. Maybe this issue is not as serious as we think > >> it is. > > > > Well, one compromise might be to allow concurrent userland requests if the > > buffer size is small (say < 1 page). This would be a quite simple change and > > would cover many common syscalls like fetching an int which don't wire memory > > anyway. > > Why is this a compromise? Isn't concurrent sysctl calls from userland > a good thing? What materials would be good to read other than the > code and the sysctl manual pages? You said it would be relatively > easy to implement this; what methods should I be considering to do > this in and what part of the code specifically should I be looking at?
Well, in theory a bunch of "small" requests to SYSCTL_PROC() nodes that used sysctl_wire_old() (or whatever it is called) could cause the amount of user memory wired for sysctls to grow unbounded. Thus, allowing this limited concurrency is a tradeoff as there is a minimal (perhaps only theoretical at the moment) risk of removing the safety net. The patch is quite small, btw, because the locking for the sysctl tree already exists, and by using read locks, one can already allow concurrent sysctl requests. There is no need to add any new locks or restructure the sysctl tree, just to adjust the locking that is already present. It might be clearer, in fact, to split the sysctl memory lock back out into a separate lock. This would allow "small" sysctl requests to run concurrently with a single "large" request whereas in my suggestion in the earlier e-mail, the "large" request will block all other user requests until it finishes. I've actually gone ahead and done this below. --- //depot/projects/smpng/sys/kern/kern_sysctl.c 2009/05/08 11:53:25 +++ //depot/user/jhb/lock/kern/kern_sysctl.c 2009/05/11 21:58:08 @@ -77,11 +77,12 @@ * API rather than using the dynamic API. Use of the dynamic API is * strongly encouraged for most code. * - * This lock is also used to serialize userland sysctl requests. Some - * sysctls wire user memory, and serializing the requests limits the - * amount of wired user memory in use. + * The sysctlmemlock is used to limit the amount of user memory wired for + * sysctl requests. This is implemented by serializing any userland + * sysctl requests larger than a single page via an exclusive lock. */ static struct sx sysctllock; +static struct sx sysctlmemlock; #define SYSCTL_SLOCK() sx_slock(&sysctllock) #define SYSCTL_SUNLOCK() sx_sunlock(&sysctllock) @@ -543,6 +544,7 @@ { struct sysctl_oid **oidp; + sx_init(&sysctlmemlock, "sysctl mem"); SYSCTL_INIT(); SYSCTL_XLOCK(); SET_FOREACH(oidp, sysctl_set) @@ -1563,7 +1565,7 @@ size_t *oldlenp, int inkernel, void *new, size_t newlen, size_t *retval, int flags) { - int error = 0; + int error = 0, memlocked; struct sysctl_req req; bzero(&req, sizeof req); @@ -1603,14 +1605,20 @@ if (KTRPOINT(curthread, KTR_SYSCTL)) ktrsysctl(name, namelen); #endif - - SYSCTL_XLOCK(); + + if (req.oldlen > PAGE_SIZE) { + memlocked = 1; + sx_xlock(&sysctlmemlock); + } else + memlocked = 0; CURVNET_SET(TD_TO_VNET(curthread)); for (;;) { req.oldidx = 0; req.newidx = 0; + SYSCTL_SLOCK(); error = sysctl_root(0, name, namelen, &req); + SYSCTL_SUNLOCK(); if (error != EAGAIN) break; uio_yield(); @@ -1620,7 +1628,8 @@ if (req.lock == REQ_WIRED && req.validlen > 0) vsunlock(req.oldptr, req.validlen); - SYSCTL_XUNLOCK(); + if (memlocked) + sx_xunlock(&sysctlmemlock); if (error && error != ENOMEM) return (error); -- John Baldwin _______________________________________________ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"