Precisely what I was thinking. Except that third column is "swapped out
processes" and how does one pull them back from the brink of never-never
land? What ladle reaches into that bucket, and when?
I don't rightly know.
Let's run through a thought experiment/demonstration:
Start off with a couple of different kinds of processes, and make
the assumption that a kernel thread is used for each process that is
invoked.
For this discussion, consider the following three types of process:
    compute bound   eg: main() { for(;;); }
    resource bound  eg: main() { for(;;) sleep(1); }
    i/o bound       eg: main() { for(;;) write(/dev/null, read(/dev/zero)); }
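That last one is loose pseudocode; a minimal compilable sketch of the same
idea (my own wording, not part of the original example set) could look
something like this, assuming the usual /dev/zero and /dev/null devices:

    /* i/o bound: copy bytes from /dev/zero to /dev/null forever */
    #include <fcntl.h>
    #include <unistd.h>

    int main(void) {
        char buf[512];
        int in  = open("/dev/zero", O_RDONLY);
        int out = open("/dev/null", O_WRONLY);
        for (;;) {
            ssize_t n = read(in, buf, sizeof buf);   /* kernel fills the buffer */
            if (n > 0)
                write(out, buf, n);                  /* kernel discards it again */
        }
    }

Each pass spends nearly all of its time in the kernel doing the read and
the write, which is the behaviour we want from an "i/o bound" stand-in.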
If you fire off a bunch of compute bound processes, at some point there
will be
more processes ready to run than there are cpus/cores to run them on.
They all will be ready to run whenever a cpu is available. This is
    r    the number of kernel threads in run queue
(IIRC, this does not include processes actually running on cpus...)
The scheduler (at least in simplistic terms) does this:
event_handler["scheduling quantum expired"] => {
Take the current process and put it at the end of the run queue
Take the process off the head of the run queue and make it the
current process
}
This ensures that all processes that are "ready to run" get run at some
point.
(The details of the scheduler are the topic for some other discussion about
scheduling policies, multi-processor aware kernels, process groups and
the like...)
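To make that handler a little more concrete, here is a toy round-robin
rotation over a fixed run queue (my own sketch of the idea above, nothing
like the real Solaris dispatcher):

    /* Toy round-robin: rotate the run queue on each "quantum expired" event. */
    #include <stdio.h>

    #define NPROCS 4

    typedef struct { int pid; } proc_t;

    static proc_t run_queue[NPROCS] = { {101}, {102}, {103}, {104} };
    static int current = 0;                 /* index of the current process */

    static void quantum_expired(void) {
        /* In a circular queue, putting the current process at the tail and
         * taking the next one off the head is just advancing the index.   */
        current = (current + 1) % NPROCS;
    }

    int main(void) {
        for (int tick = 0; tick < 8; tick++) {
            printf("tick %d: running pid %d\n", tick, run_queue[current].pid);
            quantum_expired();
        }
        return 0;
    }

Every process gets a turn, which is the "ready to run" guarantee described
above; real schedulers add priorities, multiple cpus and much more.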
Now you fire off an additional set of the resource or I/O bound processes.
You still have more processes to run than you have cpus, so the "r"un queue
is still in use. Sometimes these new processes are ready to run and other
times they are waiting for something asynchronous to happen - a kernel
timer to fire, an I/O operation to complete, some mutex to become unstuck,
whatever. The key point is that the kernel "knows" that the process is
waiting for something and therefore it is not "runnable" until that
something
happens. While these processes are "stuck", they are
    b    the number of blocked kernel threads that
         are waiting for resources (I/O, paging, etc.)
Whenever that something happens, the process is put back onto the end
of the run queue by the kernel and this narrative goes back to the above
scheduling loop.
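In the same toy model (again my own sketch, not kernel code), the blocked
threads simply sit on a second list, and a wakeup unlinks one of them and
puts it on the tail of the run queue:

    /* Toy model: a wakeup moves a process from the blocked list to the
     * tail of the run queue, where the scheduler will eventually reach it. */
    #include <stdio.h>

    typedef struct proc { int pid; struct proc *next; } proc_t;

    static proc_t *run_head, *run_tail;     /* "r": ready to run          */
    static proc_t *blocked;                 /* "b": waiting for something */

    static void enqueue_runnable(proc_t *p) {
        p->next = NULL;
        if (run_tail) run_tail->next = p; else run_head = p;
        run_tail = p;
    }

    /* Called when the thing a process was sleeping on finally happens. */
    static void wakeup(proc_t *p) {
        proc_t **pp = &blocked;
        while (*pp && *pp != p)              /* unlink p from the blocked list */
            pp = &(*pp)->next;
        if (*pp) { *pp = p->next; enqueue_runnable(p); }
    }

    int main(void) {
        static proc_t a = {201}, b = {202};
        a.next = &b; blocked = &a;           /* both start out blocked        */
        wakeup(&b);                          /* say its I/O just completed    */
        for (proc_t *p = run_head; p; p = p->next)
            printf("runnable: pid %d\n", p->pid);
        return 0;
    }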
Now, kick off a large number of these processes such that together they
require more RAM than your system actually contains. (The above
snippets of pseudocode are not good examples for this because they don't
actually consume much unshared memory and they don't make a mess of
what little they *do* use...) At the point where all of the system's
RAM is allocated, the system has to do something to make room for
a new process. When I last played in this area (don't ask :-), the
algorithm was simple: First, look for pages that can be reloaded easily
from somewhere else: read-only pages like those found in executable files
from the file system are easy to toss out and read back in. If there
aren't any of these easy
choices, then the system needs to choose a dirty page (one that has
content modified by a user process, such as a stack frame, data structures,
etc) and write it out to disk to make room. Once a piece of a process's
address space is paged or swapped out, the page tables are changed so
that they will generate a page fault when the process tries to access it.
The process will continue to run as usual (bouncing between running,
being on the run queue and being blocked for resources) until it tries
to access one of its pages that is no longer in memory. At that point,
IIRC, the process ends up as:
    w    the number of swapped out lightweight
         processes waiting ...
The kernel makes arrangements for the proper disk blocks to be read back
in and mapped to the proper VM page tables, but those operations take
lots of time to complete, relative to the instruction cycle time of the
CPU. When the requested data is back in memory, the process is removed
from the "w"ait queue and put back on the "r"un queue.
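If you want a fourth flavour of process that actually drives this paging
behaviour (my own addition - none of the snippets above do), a memory hog
that keeps dirtying more pages than the machine can spare will do it;
pick a size larger than your free RAM:

    /* memory bound: allocate a big chunk and keep dirtying every page of it */
    #include <stdlib.h>
    #include <unistd.h>

    int main(void) {
        size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
        size_t bytes    = 512UL * 1024 * 1024;   /* adjust to exceed free RAM */
        char  *p        = malloc(bytes);
        if (p == NULL)
            return 1;
        for (;;) {
            /* Touch every page so each one is dirty and has to be written
             * out to swap, not just dropped, when memory gets tight.      */
            for (size_t off = 0; off < bytes; off += pagesize)
                p[off] = (char)off;
        }
    }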
At what point does Solaris push back and say "no more, I'm busy"? I am now
You answered that in your follow-up email:
* Description : max_nprocs
* Maximum number of processes that can be created on a system.
* Description : maxuprc
* Maximum number of processes that can be created on a system
* by any one user.
The pushback _is_ a bit harsh - it says "no more" instead of "please
wait"...
Up to that point, processes get created and added to the "r" queue to be
managed by the scheduler and the VM system.
The dynamics of your testbed come into play - fork/execing a shell to
fork/exec a prioctl which exec's unzip which does a bunch of file I/O....
Your examples are not nearly as simple as the pseudo code I gave above :-)
It sounds like a perfect place to play with DTrace, which should help paint
a very pretty picture of what exactly is happening under the hood.
-John