[This is the continuation of a thread that started on -committers]

On Sun, Sep 16, 2001 at 02:48:48PM +0100, Josef Karthauser wrote:
> On Sun, Sep 16, 2001 at 01:35:20AM +0100, Josef Karthauser wrote:
> > On Sat, Sep 15, 2001 at 03:51:07PM +0200, Dag-Erling Smorgrav wrote:
> > > Josef Karthauser <[EMAIL PROTECTED]> writes:
> > > > Is there a possibility that this commit is causing me to lose key
> > > > presses?  I'm finding it hard to imagine that I'm mistyping, as
> > > > I've never noticed it before.  (Once every N key presses, where N
> > > > is > 30 or 40, a key doesn't register and I have to press it again.)
> > > 
> > > Educated guess: your interrupt latency just went to hell (where mine's
> > > been for three months now, I'm still waiting to hear if Matt could
> > > make any sense out of my crash dump) and you're losing interrupts.  If
> > > you have a serial mouse, try moving it around a lot and see if it
> > > seems to hang (you should see mentions of interrupt-level buffer
> > > overflows in your /var/log/messages).  Also, just for kicks, check how
> > > much CPU time your syncer process is using, and try running sync(8)
> > > and see if your keyboard wedges for a couple of seconds when you do
> > > that.
> > 
> > My mouse is /dev/psm0.  From time to time the ata device's
> > interrupts/second go through the roof for no apparent reason (i.e.
> > several hundred interrupts/sec).  Sync never wedges anything.
> 
> There's almost certainly an interrupt problem.  I regularly have
> the machine wedge almost solid when rsyncing a lot of data to and
> fro.  The machine begins to behave erratically, which I now think
> happens mainly because all the timers stop working (maybe the
> interrupts stop working?); 'systat -vmstat' doesn't produce any
> numbers because the initial time delay never passes.  :(  Also, I
> don't appear to be able to enter the kernel debugger when this
> happens!  :(  Can someone in the know give me a hand debugging this?
> It really ought to be fixed, but my knowledge isn't sufficient to
> find this on my own.
> 
> Thanks,
> Joe

This also happens from time to time:


    6 users    Load  1.39  1.23  1.14                  Sep 21 13:32             
                                                                                
Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER      
        Tot   Share      Tot    Share    Free         in  out     in  out       
Act   62696    8932   111764    14728   15052 count                             
All  249864   12164  2806932    25860         pages                             
                                                                 Interrupts     
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt      1 cow    1743 total     
           6 32     12398   13  866 1823        26  45516 wire        stray irq0
                                                    90820 act         stray irq6
 8.3%Sys   5.1%Intr  0.2%User  0.0%Nice 86.4%Idl   102140 inact       stray irq7
|    |    |    |    |    |    |    |    |    |      11388 cache     1 acpi0 irq9
====+++                                              3664 free   1505 ata0 irq14
                                                          daefr       uhci0 irq5
Namei         Name-cache    Dir-cache                   5 prcfr     2 pcm0 irq5 
    Calls     hits    %     hits    %                     react     7 atkbd0 irq
      688      687  100                                   pdwak       psm0 irq12
                                        4 zfod            pdpgs   100 clk irq0  
Disks   ad0   fd0                         ofod            intrn   128 rtc irq8  
KB/t   6.00  0.00                       9 %slo-z    35712 buf                   
tps    1507     0                       7 tfree        10 dirtybuf              
MB/s   8.83  0.00                                   17913 desiredvnodes         
% busy   98     0                                   14595 numvnodes             
                                                     4798 freevnodes            
                                                                                

Look at the number of interrupts that the ata device is generating:
ata0 accounts for 1505 of the 1743 total.  This is in no way normal!
It happens randomly and causes the machine to basically grind to a halt.

As a comparison on the same machine, here's the output of systat -vmstat
for the machine after I rebooted it and it was running a background
fsck:


    4 users    Load  1.01  0.42  0.16                  Sep 21 13:50             
                                                                                
Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER      
        Tot   Share      Tot    Share    Free         in  out     in  out       
Act   40328    3848    71980     4408   53308 count                             
All  200248    6884  1085132    10232         pages                             
                                                                 Interrupts     
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow     329 total     
           2 30       622   11  955  402    2   34  35928 wire        stray irq0
                                                    35492 act         stray irq6
 1.4%Sys   1.9%Intr  1.2%User  0.6%Nice 94.9%Idl   128800 inact       stray irq7
|    |    |    |    |    |    |    |    |    |         28 cache       acpi0 irq9
=+-                                                 53280 free     97 ata0 irq14
                                                          daefr       uhci0 irq5
Namei         Name-cache    Dir-cache                     prcfr     1 pcm0 irq5 
    Calls     hits    %     hits    %                     react     3 atkbd0 irq
      536      534  100                                   pdwak       psm0 irq12
                                        8 zfod            pdpgs   100 clk irq0  
Disks   ad0   fd0                       1 ofod            intrn   128 rtc irq8  
KB/t   7.99  0.00                       7 %slo-z    35712 buf                   
tps      97     0                       1 tfree        33 dirtybuf              
MB/s   0.76  0.00                                   17913 desiredvnodes         
% busy   98     0                                    1655 numvnodes             
                                                       29 freevnodes            
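
For anyone who wants to compare the two snapshots numerically, here is a
rough sketch in Python.  The figures are copied by hand from the output
above; the dictionary keys and function names are just labels made up
for this illustration.  It assumes that the ata0 interrupt count and the
ad0 tps figure in each snapshot cover the same sampling interval.

```python
# Figures copied by hand from the two systat -vmstat snapshots above.
# The key names are made up for this sketch; they are not systat fields.
snapshots = {
    "Sep 21 13:32": {"ata0_irq": 1505, "ad0_tps": 1507, "kb_per_xfer": 6.00},
    "Sep 21 13:50": {"ata0_irq": 97, "ad0_tps": 97, "kb_per_xfer": 7.99},
}


def irqs_per_transfer(snap):
    """ata0 interrupts divided by ad0 transfers in the same snapshot."""
    return snap["ata0_irq"] / snap["ad0_tps"]


def throughput_mb_s(snap):
    """Throughput implied by tps and KB/t, in MB/s (1 MB = 1024 KB)."""
    return snap["ad0_tps"] * snap["kb_per_xfer"] / 1024


for label, snap in snapshots.items():
    print(
        f"{label}: {irqs_per_transfer(snap):.2f} irq/transfer, "
        f"{throughput_mb_s(snap):.2f} MB/s"
    )
```

In both snapshots this works out to about one ata0 interrupt per ad0
transfer, so the interrupt count seems to track the number of (small)
transfers rather than arriving on its own.  That is only a reading of
these two samples, not a diagnosis.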


Who's responsible for this area?  I'm happy to help in getting to the
bottom of it.  Is it an interrupt routing problem?  Is it an ata device
problem?  Is it something else (maybe locking) altogether?

This problem has existed in -current for at least 6 weeks.

Thanks for any suggestions,
Joe
