On 2/4/11 9:38 AM, Robert Watson wrote:
On Thu, 3 Feb 2011, John Baldwin wrote:

  1) Move per John Baldwin to mp_maxid
  2) Some signed/unsigned errors found by Mac OS compiler (from Michael)
  3) a couple of copyright updates on the affected files.
Note that mp_maxid is the maximum valid ID, so you typically have to 
do things like:
    for (i = 0; i <= mp_maxid; i++) {
        if (CPU_ABSENT(i))
            continue;
        ...
    }

There is a CPU_FOREACH() macro that does the above (but assumes you want to skip over non-existent CPUs).
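
To make the inclusive bound and the skip concrete, here is a minimal userland sketch of the same iteration pattern. It is only a model: the MODEL_* names, the presence table, and the value chosen for model_mp_maxid are invented for illustration and stand in for the kernel's mp_maxid, CPU_ABSENT() and CPU_FOREACH().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userland stand-ins for the kernel's mp_maxid and CPU_ABSENT(). */
#define MODEL_MAXCPU 8
static const unsigned model_mp_maxid = 5;           /* highest valid CPU ID */
static const bool model_present[MODEL_MAXCPU] = {   /* IDs 2 and 4 are absent */
    true, true, false, true, false, true, false, false
};
#define MODEL_CPU_ABSENT(i) (!model_present[(i)])

/* Visit present CPUs only; note the inclusive upper bound (<=, not <). */
#define MODEL_CPU_FOREACH(i)                            \
    for ((i) = 0; (i) <= model_mp_maxid; (i)++)         \
        if (!MODEL_CPU_ABSENT(i))

/* Store the IDs of all present CPUs in out[] and return how many there are. */
static unsigned
model_collect_present(unsigned *out, size_t cap)
{
    unsigned i, n = 0;

    MODEL_CPU_FOREACH(i) {
        assert(n < cap);        /* caller must supply enough room */
        out[n++] = i;
    }
    return (n);
}
```

With the table above, the collected IDs are sparse (0, 1, 3, 5), which is why code that indexes by "CPU number" cannot assume a contiguous 0 .. mp_ncpus - 1 range.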
I'm finding the network stack requires quite a bit more along these 
lines, btw.  I'd love also to have:
  PACKAGE_FOREACH()
  CORE_FOREACH()
  HWTHREAD_FOREACH()

I agree, which is why I usually support adding such iterators though 
some people scream about them.
(e.g. FOREACH_THREAD_IN_PROC and there is one for iterating through 
vnets too.)
  CURPACKAGE()
  CURCORE()
  CURTHREAD()
also current jail, vnet, etc. (these (kinda) exist)
Available when putting together thread worker pools, distributing 
work, identifying where to channel work, making dispatch decisions 
and so on.  It seems likely that in some scenarios, it will be 
desirable to have worker thread topology linked to hardware topology 
-- for example, a network stack worker per core, with distribution 
of work targeting the closest worker (subject to ordering 
constraints)...
Hmmm, this is more complicated. Can sctp_queue_to_mcore() handle the fact that a cpu_to_use value might not be valid? If not, you might want to maintain a separate "dense" virtual CPU ID table, numbered 0 .. mp_ncpus - 1, that maps to "present" FreeBSD CPU IDs. I think Robert has done something similar to support RSS in TCP. Does that make sense?
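
One possible shape for such a dense table, sketched in userland C. Everything here is an assumption made for illustration: the struct, the function names, and MODEL_MAXCPU are hypothetical, and the presence array stands in for whatever CPU_ABSENT() reports in the kernel. It is not the SCTP or RSS code.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAXCPU 8          /* hypothetical upper bound on CPU IDs */

struct dense_cpu_map {
    unsigned ncpus;                     /* number of present CPUs */
    unsigned to_cpuid[MODEL_MAXCPU];    /* dense index -> sparse CPU ID */
};

/* Build the dense table by walking 0 .. maxid and keeping present CPUs. */
static void
dense_cpu_map_init(struct dense_cpu_map *m, const bool *present,
    unsigned maxid)
{
    unsigned i;

    assert(maxid < MODEL_MAXCPU);
    m->ncpus = 0;
    for (i = 0; i <= maxid; i++) {
        if (!present[i])
            continue;
        m->to_cpuid[m->ncpus++] = i;
    }
}

/* Map an arbitrary hash onto a CPU ID that is guaranteed to be present. */
static unsigned
dense_cpu_pick(const struct dense_cpu_map *m, unsigned hash)
{
    assert(m->ncpus > 0);
    return (m->to_cpuid[hash % m->ncpus]);
}
```

The modulo is taken over the dense count rather than over mp_maxid + 1, so the result can never land on a hole in the CPU ID space.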
This proves somewhat complicated.  I basically have two models, 
depending on whether RSS is involved (which adds an external 
factor).  Without RSS, I build a contiguous workstream number space, 
which is then mapped via a table to the CPU ID space, allowing 
mappings and hashing to be done easily -- however, these refer to 
ordered flow processing streams (i.e., "threads") rather than CPUs, 
in the strict sense.  In the future with dynamic configuration, this 
becomes important because what I do is rebalance ordered processing 
streams rather than work to CPUs.  With RSS there has to be a link 
between work distribution and the CPU identifiers shared by device 
drivers, hardware, etc, in which case RSS identifies viable CPUs as 
it starts (probably not quite correctly, I'll be looking for a 
review of that code shortly, cleaning it up currently).
This issue came up some at the BSDCan devsummit last year: as more 
and more kernel subsystems need to exploit parallelism explicitly, 
the thread programming model isn't bad, but lacks a strong tie to 
hardware topology in order to help manage work distribution.  One 
idea idly bandied around was to do something along the lines of 
KSE/GCD for the kernel: provide a layered "work" model with ordering 
constraints, rather than exploit threads directly, for work-oriented 
subsystems.  This is effectively what netisr does, but in a network 
stack-specific way.  But with crypto code, IPSEC, storage stuff, 
etc, all looking to exploit parallelism, perhaps a more general 
model is called for.
Robert
