On 2/4/11 9:38 AM, Robert Watson wrote:
On Thu, 3 Feb 2011, John Baldwin wrote:
1) Move to mp_maxid, per John Baldwin.
2) Fix some signed/unsigned errors found by the Mac OS compiler (from
Michael).
3) A couple of copyright updates on the affected files.
Note that mp_maxid is the maximum valid ID, so you typically have to
do things like:
for (i = 0; i <= mp_maxid; i++) {
	if (CPU_ABSENT(i))
		continue;
	...
}
There is a CPU_FOREACH() macro that does the above (but assumes you
want to skip over non-existent CPUs).
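For reference, CPU_FOREACH() is roughly the loop above wrapped up in a
macro (a sketch; the real definition lives in sys/smp.h):

	u_int cpu;

	CPU_FOREACH(cpu) {
		/* cpu is a present FreeBSD CPU ID here. */
	}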
I'm finding the network stack requires quite a bit more along these
lines, btw. I'd also love to have:
PACKAGE_FOREACH()
CORE_FOREACH()
HWTHREAD_FOREACH()
I agree, which is why I usually support adding such iterators, though
some people scream about them.
(e.g. FOREACH_THREAD_IN_PROC; there is also one for iterating through
vnets.)
CURPACKAGE()
CURCORE()
CURTHREAD()
also current jail, vnet, etc. (these (kinda) exist)
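For illustration, the "current" accessors that do exist today; the
per-package and per-core variants are the missing pieces:

	struct thread *td = curthread;	/* current thread */
	u_int cpu = PCPU_GET(cpuid);	/* CPU we are running on right now;
					   only stable while pinned or bound */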
These would be useful when putting together thread worker pools,
distributing work, identifying where to channel work, making dispatch
decisions, and so on. It seems likely that in some scenarios, it will be
desirable to have worker thread topology linked to hardware topology
-- for example, a network stack worker per core, with distribution
of work targeting the closest worker (subject to ordering
constraints)...
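A rough sketch of that shape, using only what exists today (CPU_FOREACH
plus sched_bind); net_worker_main and its queueing are invented for the
example:

static void
net_worker_main(void *arg)
{
	u_int cpu = (uintptr_t)arg;

	/* Pin this worker to its CPU before doing any work. */
	thread_lock(curthread);
	sched_bind(curthread, cpu);
	thread_unlock(curthread);

	for (;;) {
		/* ... dequeue and process work queued for this CPU ... */
	}
}

static void
net_workers_start(void)
{
	u_int cpu;

	/* One worker kernel process per present CPU. */
	CPU_FOREACH(cpu)
		kproc_create(net_worker_main, (void *)(uintptr_t)cpu,
		    NULL, 0, 0, "net worker: cpu%u", cpu);
}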
Hmmm, this is more complicated. Can sctp_queue_to_mcore() handle
the fact that a cpu_to_use value might not be valid? If not, you
might want to maintain a separate "dense" virtual CPU ID table
numbered 0 .. mp_ncpus - 1 that maps to "present" FreeBSD CPU IDs.
I think Robert has done something similar to support RSS in TCP.
Does that make sense?
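A minimal sketch of that dense mapping (the names are made up for the
example):

static u_int vcpu_to_cpuid[MAXCPU];	/* dense 0..mp_ncpus-1 -> present CPU ID */

static void
vcpu_map_init(void)
{
	u_int cpu, n;

	n = 0;
	CPU_FOREACH(cpu)
		vcpu_to_cpuid[n++] = cpu;
	KASSERT(n == (u_int)mp_ncpus, ("present CPU count != mp_ncpus"));
}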
This proves somewhat complicated. I basically have two models,
depending on whether RSS is involved (which adds an external
factor). Without RSS, I build a contiguous workstream number space,
which is then mapped via a table to the CPU ID space, allowing
mappings and hashing to be done easily -- however, these refer to
ordered flow processing streams (i.e., "threads") rather than CPUs,
in the strict sense. In the future with dynamic configuration, this
becomes important because what I do is rebalance ordered processing
streams rather than work to CPUs. With RSS there has to be a link
between work distribution and the CPU identifiers shared by device
drivers, hardware, etc., in which case RSS identifies viable CPUs as
it starts (probably not quite correctly; I'm currently cleaning that
code up and will be looking for a review of it shortly).
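A minimal sketch of the non-RSS shape, with invented names for the
workstream count and the workstream-to-CPU table:

static u_int ws_count;			/* number of ordered workstreams */
static u_int ws_to_cpuid[MAXCPU];	/* workstream -> FreeBSD CPU ID */

static u_int
flow_to_cpuid(uint32_t flowid)
{
	u_int wsid;

	wsid = flowid % ws_count;	/* pick the ordered processing stream */
	return (ws_to_cpuid[wsid]);	/* the CPU currently serving it */
}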
This issue came up some at the BSDCan devsummit last year: as more
and more kernel subsystems need to exploit parallelism explicitly,
the thread programming model isn't bad, but lacks a strong tie to
hardware topology in order to help manage work distribution. One
idea idly bandied around was to do something along the lines of
KSE/GCD for the kernel: provide a layered "work" model with ordering
constraints, rather than exploit threads directly, for work-oriented
subsystems. This is effectively what netisr does, but in a network
stack-specific way. But with crypto code, IPSEC, storage stuff, etc.,
all looking to exploit parallelism, perhaps a more general model is
called for.
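Purely as a strawman for that kind of layered work model (nothing like
this exists; every name here is invented):

struct kwork {
	void		(*kw_func)(void *);	/* work body */
	void		*kw_arg;
	uint32_t	 kw_orderkey;		/* equal keys execute in order */
	STAILQ_ENTRY(kwork) kw_link;
};

/* The framework, not the submitter, picks the worker/CPU to run it on. */
void	kwork_submit(struct kwork *kw);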
Robert