I've got to correct and update myself:


On 06.07.2012, 19:19, Michael Ross <g...@ross.cx> wrote:

Hello,

I rented a new machine a couple of days ago,
and it happens:

Test: Transfer some 5GB of files to the machine

Works fine as long as I use one of the drives individually.

If I gmirror the drives
        gmirror label gm0 ada0
        gmirror insert gm0 ada1
        ...wait for rebuild

the machine reliably locks up on the file transfer,
with a frozen systat screen showing both drives at 100% busy:

OK, it doesn't actually lock up; the drives just stay at 100% busy for a (long) time. On the last attempt I managed to transfer 690KB in 8 files before the machine stalled.
So I interrupted the transfer. That was about 10 minutes ago.
The system has not yet recovered: drive load keeps jumping to 100% on an otherwise idle system, load 0,0,0.
The mirror is synchronized.
After 20 minutes it has still not recovered (as in, launching any program takes the better part of 5 minutes).
I rebooted and transferred ~2.5GB before the next stall.
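
For reference, the mirror setup described above can be sketched as a small script. This is only a sketch: the `run` wrapper and the `DRY_RUN` variable are my own additions (not part of gmirror), and by default the script only prints the commands instead of touching any disks. It also assumes geom_mirror is already loaded.

```shell
#!/bin/sh
# Sketch only. DRY_RUN and run() are illustrative helpers, not gmirror features.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create mirror $1 on provider $2, then add provider $3 to it.
setup_mirror() {
    run gmirror label "$1" "$2"
    run gmirror insert "$1" "$3"
    # Shows the mirror state, including rebuild progress while syncing.
    run gmirror status "$1"
}

setup_mirror gm0 ada0 ada1
```

Running it with DRY_RUN=0 would execute the same three commands for real, so that should only be done on a machine where ada0 and ada1 are really meant to be mirrored.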

I have no problems with buildworld and installing a bunch of bigger ports.


dmesg: http://pastebin.com/GWWbLrL2



Systat looks as before/below,
here's a vmstat -i:

interrupt                          total       rate
irq1: atkbd0                          14          0
irq16: re0                        531857        191
irq20: atapci0                      9188          3
cpu0:timer                        322709        116
cpu1:timer                         79970         28
Total                             943738        339
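
As a quick sanity check on the figures above, the per-source totals can be summed and compared with the `Total` line. A minimal sketch: `sum_interrupts` is a helper name made up for this example, and the sample is simply the output pasted above.

```shell
#!/bin/sh
# Sum the "total" column of `vmstat -i` output read from stdin,
# skipping the header line and the trailing "Total" line.
# sum_interrupts is a hypothetical helper name.
sum_interrupts() {
    awk '$1 != "interrupt" && $1 != "Total" && NF >= 3 { sum += $(NF-1) }
         END { print sum + 0 }'
}

# Sample copied from the vmstat -i output in this report.
sample='interrupt                          total       rate
irq1: atkbd0                          14          0
irq16: re0                        531857        191
irq20: atapci0                      9188          3
cpu0:timer                        322709        116
cpu1:timer                         79970         28
Total                             943738        339'

printf '%s\n' "$sample" | sum_interrupts
```

On a live system the same function could be fed with `vmstat -i | sum_interrupts`. For the sample the sum is 943738, matching the reported total, of which the disk controller (atapci0) accounts for only 9188.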


origin> ps auxwww
USER  PID  %CPU %MEM   VSZ  RSS TT  STAT STARTED      TIME COMMAND
root   11 199,0  0,0     0   32 ??  RL    7:21am 101:49,04 [idle]
root    0   0,0  0,0     0  144 ??  DLs   7:21am   0:00,00 [kernel]
root    1   0,0  0,0  6276  592 ??  ILs   7:21am   0:00,01 /sbin/init --
root    2   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [ctl_thrd]
root    3   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [fdc0]
root    4   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [sctp_iterator]
root    5   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [xpt_thrd]
root    6   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [pagedaemon]
root    7   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [vmdaemon]
root    8   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [pagezero]
root    9   0,0  0,0     0   16 ??  DL    7:21am   0:00,01 [bufdaemon]
root   10   0,0  0,0     0   16 ??  DL    7:21am   0:00,00 [audit]
root   12   0,0  0,0     0  240 ??  WL    7:21am   0:05,98 [intr]
root   13   0,0  0,0     0   48 ??  DL    7:21am   0:00,44 [geom]
root   14   0,0  0,0     0   16 ??  DL    7:21am   0:00,12 [yarrow]
root   15   0,0  0,0     0  320 ??  DL    7:21am   0:00,03 [usb]
root   16   0,0  0,0     0   16 ??  DL    7:21am   0:00,01 [vnlru]
root   17   0,0  0,0     0   16 ??  DL    7:21am   0:00,03 [syncer]
root   18   0,0  0,0     0   16 ??  DL    7:21am   0:00,10 [softdepflush]
root   19   0,0  0,0     0   16 ??  DL    7:21am   0:00,11 [g_mirror gm0]
root  887   0,0  0,2 10376 3496 ??  Is    7:21am   0:00,00 /sbin/devd
root 1033   0,0  0,1 12052 1692 ??  Is    7:21am   0:00,01 /usr/sbin/syslogd -s -s
root 1119   0,0  0,1 12024 1856 ??  Is    7:21am   0:00,00 ntpd: [priv] (ntpd)
_ntp 1120   0,0  0,1 12024 1904 ??  S     7:21am   0:00,03 ntpd: ntp engine (ntpd)
_ntp 1122   0,0  0,1 12024 1884 ??  I     7:21am   0:00,00 ntpd: dns engine (ntpd)
root 1131   0,0  0,2 46748 4712 ??  Is    7:21am   0:00,01 /usr/sbin/sshd
root 1145   0,0  0,1 14128 1828 ??  Ss    7:21am   0:00,01 /usr/sbin/cron -s
root 1192   0,0  0,3 67888 5524 ??  Ss    7:21am   0:00,08 sshd: root@pts/0 (sshd)
root 1197   0,0  0,3 67888 5564 ??  Ss    7:21am   0:00,10 sshd: root@pts/1 (sshd)
root 1277   0,0  0,1 22688 2164 ??  Is    7:40am   0:00,01 /usr/libexec/ftpd -D
root 1176   0,0  0,1 12052 1644 v0  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv0
root 1177   0,0  0,1 12052 1644 v1  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv1
root 1178   0,0  0,1 12052 1644 v2  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv2
root 1179   0,0  0,1 12052 1644 v3  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv3
root 1180   0,0  0,1 12052 1644 v4  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv4
root 1181   0,0  0,1 12052 1644 v5  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv5
root 1182   0,0  0,1 12052 1644 v6  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv6
root 1183   0,0  0,1 12052 1644 v7  Is+   7:21am   0:00,00 /usr/libexec/getty Pc ttyv7
root 1195   0,0  0,2 17464 3968  0  Ss    7:21am   0:00,05 -csh (csh)
root 1403   0,0  0,1 14188 1820  0  R+    8:12am   0:00,00 ps auxwww
root 1200   0,0  0,2 17464 3380  1  Is    7:21am   0:00,01 -csh (csh)
root 1231   0,0  0,2 18680 3692  1  S+    7:22am   0:02,07 systat -vms 1




    10 users    Load  0,41  0,44  0,20                   6 Jul 18:47

Mem:KB    REAL            VIRTUAL                       VN PAGER   SWAP PAGER
        Tot   Share      Tot    Share    Free           in   out     in   out
Act   23496    6036   600772    12252 1361840  count
All   71680    6632 1074428k    28264          pages
Proc:                                                            Interrupts
  r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Flt        cow     121 total
             28       199    2  121    4   67             zfod        atkbd0 1
                                                          ozfod     4 re0 16
 0,4%Sys   0,0%Intr  0,0%User  0,0%Nice 99,6%Idle        %ozfod       atapci0 20
|    |    |    |    |    |    |    |    |    |    |       daefr    94 cpu0:timer
                                                          prcfr    23 cpu1:timer
                                       1333 dtbuf        4 totfr
Namei     Name-cache   Dir-cache    111358 desvn          react
    Calls    hits   %    hits   %      1009 numvn          pdwak
        3       3 100                    32 frevn          pdpgs
                                                           intrn
Disks  ada0  ada1 pass0 pass1                      302680 wire
KB/t  16,00 16,00  0,00  0,00                       14716 act
tps       1     1     0     0                      334260 inact
MB/s   0,02  0,02  0,00  0,00                             cache
%busy   100   100     0     0                     1361840 free
                                                    217488 buf

While the network stays responsive, i.e. I can ping the machine and _connect_ via ssh,
I can't actually log in (or, in an already open shell, execute anything).
The system requires a hardware reset. Nothing in the logs whatsoever (no surprise here).

I have no KVM access to this system.

OS is GENERIC 9.0-STABLE from two days ago.

I run 8.2-R on an identical machine without trouble.
I run 9.0-STABLE as of May 4th on a similar machine (different CPU and NIC) without trouble.
On both machines, the drives are recognized as ``ad''.
(Why, btw? ``man ada'' says ``device ada'', but there is no such option in the GENERIC config. Do I get ``ada'' with ``device ATA_CAM''? I'm going to try this next: kick ATA_CAM from the kernel, see if the drives are ``ad'' and the system doesn't crash.)

Right, should have remembered the release notes.
Still, the other machine doesn't show ``ada'' in spite of running 9.0-STABLE.
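
To see which naming a given machine actually uses, the disk names reported by the kernel can be checked. A rough sketch, assuming that ``ada'' names come from the CAM-based ATA driver and ``ad'' names from the legacy ata(4) disk driver; `classify_disk` is a made-up helper name.

```shell
#!/bin/sh
# classify_disk is a hypothetical helper: it maps a disk device name
# to the driver family suggested by its prefix.
classify_disk() {
    case "$1" in
        ada[0-9]*) echo "$1: CAM-based ATA (ada)" ;;
        ad[0-9]*)  echo "$1: legacy ata (ad)" ;;
        *)         echo "$1: other" ;;
    esac
}

# kern.disks lists the disk devices on FreeBSD; on other systems the
# sysctl fails and the loop simply does nothing.
for disk in $(sysctl -n kern.disks 2>/dev/null); do
    classify_disk "$disk"
done
```

Comparing the output on the machine that stalls with the output on the two machines that work would show whether the driver family is one of the differences between them.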




I'd appreciate suggestions on what I could do.

Thanks,

Michael
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
