On Fri, 8 Jun 2007, John Baldwin wrote:

On Friday 08 June 2007 10:50:15 am Bruce Evans wrote:
On Thu, 7 Jun 2007, John Baldwin wrote:

Hmm, one problem with the non-preemption page zero is that it doesn't
yield the lock when it voluntarily yields.  I wonder if something like
this patch would help things for the non-preemption case:

This works well.  It eliminated the extra 1.4 million voluntary context
switches and even reduced the count by a further 10-100k.  The real runtime
is still up a bit, but user+sys+pgzero time is down a bit.

Some numbers are:
- real time up from 827 seconds (best time with the June 4 kernel) to 835
  seconds (current kernel with this patch and the others in common with the
  June 4 kernel)
- real time mysteriously up from 827 seconds to 832 even with the June 4
  kernel.

Index: vm_zeroidle.c
===================================================================
RCS file: /usr/cvs/src/sys/vm/vm_zeroidle.c,v
retrieving revision 1.45
diff -u -r1.45 vm_zeroidle.c
--- vm_zeroidle.c       18 May 2007 07:10:50 -0000      1.45
+++ vm_zeroidle.c       7 Jun 2007 14:56:02 -0000
@@ -147,8 +147,10 @@
#ifndef PREEMPTION
                       if (sched_runnable()) {
                               mtx_lock_spin(&sched_lock);
+                               mtx_unlock(&vm_page_queue_free_mtx);
                               mi_switch(SW_VOL, NULL);
                               mtx_unlock_spin(&sched_lock);
+                               mtx_lock(&vm_page_queue_free_mtx);
                       }
#endif
               } else {

The sched_locks are now of course thread_locks.  I put the vm unlock
before the thread lock since the above order seems to risk a LOR.  That
may have been a mistake -- we would prefer not to be switched after
deciding to do the switch ourselves.

Actually, the order is on purpose so that we don't waste time doing a
preemption when we're about to switch anyway.

OK, changed back (not tested with it changed back).

What about after returning from mi_switch()?  The spinlock must be dropped
before acquiring the sleep lock, but that wastes some time if preemption
occurs.  Here we expect to have more work to do, and it seems best to
ensure that at least one page of work gets done per context switch if possible.
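
Concretely, with the order changed back, the !PREEMPTION block becomes
something like this (an untested sketch only; the old sched_lock calls are
spelled as thread_lock()/thread_unlock() on curthread):

#ifndef PREEMPTION
			if (sched_runnable()) {
				/*
				 * Thread lock first, then drop the vm mutex,
				 * so we don't waste time on a preemption when
				 * we are about to switch anyway.
				 */
				thread_lock(curthread);
				mtx_unlock(&vm_page_queue_free_mtx);
				mi_switch(SW_VOL, NULL);
				thread_unlock(curthread);
				/*
				 * The thread lock must be dropped before
				 * re-taking the sleep mutex, so a preemption
				 * can still sneak in here and waste time.
				 */
				mtx_lock(&vm_page_queue_free_mtx);
			}
#endif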

Index: vm_zeroidle.c
===================================================================
RCS file: /usr/cvs/src/sys/vm/vm_zeroidle.c,v
retrieving revision 1.45
diff -u -r1.45 vm_zeroidle.c
--- vm_zeroidle.c       18 May 2007 07:10:50 -0000      1.45
+++ vm_zeroidle.c       7 Jun 2007 14:58:39 -0000
@@ -110,8 +110,10 @@
       if (m != NULL && (m->flags & PG_ZERO) == 0) {
               vm_pageq_remove_nowakeup(m);
               mtx_unlock(&vm_page_queue_free_mtx);
+               critical_exit();
               pmap_zero_page_idle(m);
               mtx_lock(&vm_page_queue_free_mtx);
+               critical_enter();
               m->flags |= PG_ZERO;
               vm_pageq_enqueue(PQ_FREE + m->pc, m);
               ++vm_page_zero_count;

Next I will try this.  In my version I put the critical_exit() before the
vm unlock, since mtx_unlock() should be allowed to switch if it wants.
However, we would prefer to keep context switches disabled in the above --
just drop the lock so that other CPUs can proceed.
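
That is, the variant I tried looks like this (sketch only; your patch above
keeps the critical section across the unlock, this swaps the first two
lines):

		vm_pageq_remove_nowakeup(m);
		critical_exit();	/* let mtx_unlock() switch if it wants */
		mtx_unlock(&vm_page_queue_free_mtx);
		pmap_zero_page_idle(m);
		mtx_lock(&vm_page_queue_free_mtx);
		critical_enter();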

Again, the order here is on purpose to make sure we don't preempt while
holding the lock.

critical_enter(9) says that critical regions should not be interlocked
with other synchronization primitives like this.  It makes an exception
for spin locks, but vm_page_queue_free_mtx is a sleep lock.

I think this is as good a place to preempt as any.  In fact, the code
can be simplified by preempting here and not in the main loop.  The
vm mutex is already dropped here, so no extra code is needed to drop it
in the main loop.  In the !PREEMPTION case, check sched_runnable()
here.  In the PREEMPTION case, can you explain why involuntary preemption
never worked right with SMP for SCHED_4BSD?  I thought that it was
because SCHED_4BSD doesn't do enough IPIs, so it doesn't preempt idle
threads running on other CPUs, but shouldn't preemption occur when the
vm mutex is unlocked in the above, if sched_runnable() is true and
there is something better to run than pagezero?

@@ -141,20 +143,25 @@
       idlezero_enable = idlezero_enable_default;

       mtx_lock(&vm_page_queue_free_mtx);
+       critical_enter();
       for (;;) {
               if (vm_page_zero_check()) {
                       vm_page_zero_idle();
#ifndef PREEMPTION
                       if (sched_runnable()) {
                               mtx_lock_spin(&sched_lock);
+                               mtx_unlock(&vm_page_queue_free_mtx);
                               mi_switch(SW_VOL, NULL);
                               mtx_unlock_spin(&sched_lock);
+                               mtx_lock(&vm_page_queue_free_mtx);
                       }
#endif

I added the necessary critical_exit/enter() calls here.

They aren't needed in the !PREEMPTION case since the context switch is
explicit.  The critical sections are only needed for the PREEMPTION case
really.

They are needed to pass the KASSERT(td_critnest == 1 || ...)  in mi_switch().

PREEMPTION ifdefs on the critical sections would be messy.

Next I will try moving the PREEMPTION code to where the vm mutex is dropped
naturally.  I will try the following order:

        // Stay in critical section.
        drop vm mutex         // though critical_enter(9) says this is wrong
        zero the page         // intentionally in critical section
#ifdef PREEMPTION(?)          // even more needed now that the critical
                              // section prevents a switch on the mutex drop
        if (sched_runnable()) {
                thread_lock(...)
                critical_exit()    // if using critical sections with !PRE*
                mi_switch(...)
                thread_unlock(...)
        } else
#endif
                critical_exit()
        // I would like to avoid another switch here if there was a switch
        // above, but don't know how to.
        acquire vm mutex
        critical_enter()
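
A rough C rendering of that order, as a fragment of vm_page_zero_idle()
(an untested sketch; the zero_state and counter updates are left out, and
the PREEMPTION ifdef question above is set aside):

	/* Entered with vm_page_queue_free_mtx held, inside a critical section. */
	vm_pageq_remove_nowakeup(m);
	/* Stay in the critical section; just drop the lock for other CPUs. */
	mtx_unlock(&vm_page_queue_free_mtx);
	pmap_zero_page_idle(m);
	if (sched_runnable()) {
		/* Yield here, where the vm mutex is dropped naturally. */
		thread_lock(curthread);
		critical_exit();	/* for mi_switch()'s td_critnest assertion */
		mi_switch(SW_VOL, NULL);
		thread_unlock(curthread);
	} else
		critical_exit();
	/* Another switch may sneak in here; I don't know how to avoid it. */
	mtx_lock(&vm_page_queue_free_mtx);
	critical_enter();
	m->flags |= PG_ZERO;
	vm_pageq_enqueue(PQ_FREE + m->pc, m);
	++vm_page_zero_count;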

Bruce