Re: svn commit: r247068 - head/sys/x86/isa

Bruce Evans Thu, 21 Feb 2013 22:26:32 -0800

On Thu, 21 Feb 2013, Warner Losh wrote:

On Feb 21, 2013, at 5:31 AM, Bruce Evans wrote:

On Thu, 21 Feb 2013, Warner Losh wrote:

Log:
Fix broken usage of splhigh() by removing it.


This is more broken than before.  The splhigh() served to indicate
missing locking.


Depends on what you mean by more :)  It is less depessimized after the 
nopification of splhigh().


The null spl didn't cost anything.

Modified: head/sys/x86/isa/atrtc.c
==============================================================================
--- head/sys/x86/isa/atrtc.c    Thu Feb 21 00:36:12 2013        (r247067)
+++ head/sys/x86/isa/atrtc.c    Thu Feb 21 00:40:08 2013        (r247068)
@@ -328,7 +328,6 @@ static int
atrtc_gettime(device_t dev, struct timespec *ts)
{
        struct clocktime ct;
-       int s;

        /* Look if we have a RTC present and the time is valid */
        if (!(rtcin(RTC_STATUSD) & RTCSD_PWR)) {
@@ -338,11 +337,8 @@ atrtc_gettime(device_t dev, struct times

        /* wait for time update to complete */
        /* If RTCSA_TUP is zero, we have at least 244us before next update */


As the comment says, this is time-critical code.  It needs to do something
to prevent it being preempted for more than 244 usec


That's a long time...


No, preemptions can easily be for as long as the quantum (default 100
milliseconds).

-       s = splhigh();


It used to do something...

-       while (rtcin(RTC_STATUSA) & RTCSA_TUP) {
-               splx(s);
-               s = splhigh();
-       }
+       while (rtcin(RTC_STATUSA) & RTCSA_TUP)
+               continue;


You should probably have changed this to a critical section like you did
in ppc.  Disabling hardware interrupts would be even better.


I'll replace with a critical section.

There is a problem with the "show rtc" command in ddb.  It was born
broken (racy), and the races were turned into deadlocks by adding
locking in rtcin().  So if you trace through this code, then
"show rtc" while holding the lock in rtcin() will deadlock.  It is a
bug for ddb to call any code that might try to acquire a mutex, but
"show rtc" always calls rtcin() and rtcin() always tries to aquire a
mutex.  Similar deadlocks on the i8254 lock in DELAY() are worked
around by not trying to acquire the lock in kdb mode.


kbd_active is what I need to check, right? I'll fix that while here.


It is logically wrong, since only ddb has a command for accessing the
RTC.

        ct.nsec = 0;
        ct.sec = readrtc(RTC_SEC);
        ct.min = readrtc(RTC_MIN);


There are 8 or 9 readrtc()'s altogether.  These must be atomic, and all
within the 244 usec limit.  There is considerable danger of exceeding the
limit without even being preempted.  Each readrtc() does 1 or 4 isa bus
accesses.  I've seen a single isa bus access taking 139 usec (delayed by
PCI DMA).

Another way of doing this without any locking against preemption or
timing magic is to read the time before and after.  If it took more than
244 usec from before seeing RTCSA_TUP deasserted to after the last
readrtc(), then retry.


By computing a time difference, or by checking RTCSA_TUP?


The former.  Needs a working timecounter or just some other timer with
enough precision.  RTCSA_TUP is volatile sand only gives the state at
the time you read it, but you need to know if it became set while you
were reading the registers.

My version still depends on splhigh() working, but it does more
sophisticated waiting that gives a full second to read the registers
without getting more than a stale time.  The above reads an RTC value
that may be stale by almost 1 second.  My version busy-waits until
just after after the counter rolls over.  This is only suitable for
boot=time initialization.  Debugging printfs show that this never got
preempted over the last few years -- the whole operation always
completed within about 40 usec of seeing the rollover.


So this isn't a problem on fast machines?


Fast machines shrink the race windows.

I forgot that inittodr() is almost a non-problem for another reason:
it is, or should, only be called at boot time and resume times.  At
these times, user threads shouldn't be running so we don't have to
worry about preemption for a full quantum.

resettodr() is more interesting.  Privileged applications can try
racing and/or mistiming it it using settime(2) in loops from multiple
threads.

Locking in the kernel settime() is quite broken too.  It was broken even
in FreeBSD-4 where the splclock() part of it worked, since it was:

@       s = splclock();
@       ...
@       set_timecounter(&ts);               /* OK so far. */
@       (void) splsoftclock();          /* Wrong (1). */
@       lease_updatetime(delta.tv_sec);
@       splx(s);                        /* Wronger (2). */
@       resettodr();
@       return (0);

(1) We want to continue with lower priority for lease_updatetime(), but
    we haven't done resettodr() yet.  OTOH, resettodr() might take too
    long to be done all at splclock().
(2) We even completely lower the priority before resettodr().

In -current, this has rotted to:

@       s = splclock();                 /* Wrong (3). */
@       ...
@       mtx_lock(&Giant);           /* Bogus (4). */
@       tc_setclock(&ts);
@       // lease_updatetime() was removed, including its splsoftclock()
@       // and also the splx(s) which doesn't belong to it.  So now there
@       // is an splclock() with no corresponding splx().  Compilers
@       // should complain that s is unused.
@       resettodr();
@       mtx_unlock(&Giant);
@       return (0);

After splclock() became null, anti-premption became broken.  This isn't
even Giant-locked.  It accesses some globals without locking:
- boottime, via microtime().  Locking for boottime is quite broken.
  Here we can cause boottime to be volatile by racing with another
  thread doing the same micro-adjustments as us.  It is a bug for
  boottime to be volatile at all.  It shouldn't be changed by
  micro-adjustments.
- securelevel, via securelevel_gt().
Giant locking wouldn't help here.

(4) Giant locking is fairly bogus for tc_setclock(), but better than
    nothing.  It prevents contention from here.  Similarly for
    resettodr().  tc_setclock() is called from elsewhere mainly
    from inittodr().  I think it has no Giant locking then.

Linux does even more sophisticated waiting for writing the registers,
so as not to be off by another second in the write operation.


Yea, that would be nice, I may have to look into that, once I'm done fetching 
these stones.


The above resettodr() call is the main thing that writes the registers.
I think settime() should block for a second or 2 as necessary to sync
with the hardware.  It must release all locks while waiting for the
hardware.  Then when it writes, it updates the hardware with the
current time, which is normally a fractional second after settime()
starts.  The splx(s) before the resettodr() was actually correct for
this.  Now resettodr() can be Giant locked, but if its callers acquire
Giant as above, then it must release Giant while sleeping.

Bruce
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r247068 - head/sys/x86/isa

Reply via email to