I've got to correct and update myself:
On 06.07.2012, 19:19, Michael Ross <g...@ross.cx> wrote:
Hello,
I rented a new machine a couple of days ago,
and the following happens:
Test: Transfer some 5GB of files to the machine
Works fine as long as I use one of the drives individually.
If I gmirror the drives
gmirror label gm0 ada0
gmirror insert gm0 ada1
...wait for rebuild
the machine reliably locks up on the file transfer,
with a frozen systat screen showing both drives at 100% busy:
OK, it doesn't actually lock up; the drives just stay at 100% busy for a
(long) time.
Last attempt I managed to transfer 690KB in 8 files before the machine
stalled.
So I interrupted the transfer. That was about 10 minutes ago.
System has not yet recovered, drive load keeps jumping to 100% on an idle
system, load 0,0,0.
Mirror is synchronized.
20 minutes, still not recovered (as in, launching any program takes the
better part of 5 minutes.)
Rebooted and transferred ~2.5GB before the next stall.
I have no problems with buildworld and installing a bunch of bigger ports.
dmesg: http://pastebin.com/GWWbLrL2
Systat looks as before/below,
here's a vmstat -i:
interrupt          total   rate
irq1: atkbd0          14      0
irq16: re0        531857    191
irq20: atapci0      9188      3
cpu0:timer        322709    116
cpu1:timer         79970     28
Total             943738    339
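
As a rough cross-check of the table above, here is a small Python sketch (not part of the original report). It assumes that the rate column of vmstat -i is the total divided by uptime and rounded down to an integer, so dividing Total by its rate gives an approximate uptime. The names VMSTAT_I and parse_vmstat_i are made up for illustration.

```python
# Sample copied from the vmstat -i output above.
VMSTAT_I = """\
interrupt total rate
irq1: atkbd0 14 0
irq16: re0 531857 191
irq20: atapci0 9188 3
cpu0:timer 322709 116
cpu1:timer 79970 28
Total 943738 339
"""


def parse_vmstat_i(text):
    """Return {source: (total, rate)} from `vmstat -i` style text."""
    rows = {}
    for line in text.splitlines()[1:]:  # skip the header line
        parts = line.split()
        if len(parts) < 3:
            continue
        # The last two fields are total and rate; the rest is the source name.
        name = " ".join(parts[:-2])
        rows[name] = (int(parts[-2]), int(parts[-1]))
    return rows


rows = parse_vmstat_i(VMSTAT_I)
total, rate = rows["Total"]
# Assumption: rate = total / uptime, so inverting it estimates the uptime.
uptime_s = total / rate
print(f"approx. uptime: {uptime_s:.0f} s")
for name, (count, _) in rows.items():
    print(f"{name:16s} {count / uptime_s:8.1f} intr/s")
```

Under that assumption the machine had been up for roughly 46 minutes, and atapci0 averaged only about 3 interrupts per second over that time, which is very little for a controller whose disks report 100% busy.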
origin> ps auxwww
USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
root 11 199,0 0,0 0 32 ?? RL 7:21am 101:49,04 [idle]
root 0 0,0 0,0 0 144 ?? DLs 7:21am 0:00,00 [kernel]
root 1 0,0 0,0 6276 592 ?? ILs 7:21am 0:00,01 /sbin/init --
root 2 0,0 0,0 0 16 ?? DL 7:21am 0:00,00 [ctl_thrd]
root 3 0,0 0,0 0 16 ?? DL 7:21am 0:00,00 [fdc0]
root 4 0,0 0,0 0 16 ?? DL 7:21am 0:00,00 [sctp_iterator]
root 5 0,0 0,0 0 16 ?? DL 7:21am 0:00,00 [xpt_thrd]
root 6 0,0 0,0 0 16 ?? DL 7:21am 0:00,00 [pagedaemon]
root 7 0,0 0,0 0 16 ?? DL 7:21am 0:00,00 [vmdaemon]
root 8 0,0 0,0 0 16 ?? DL 7:21am 0:00,00 [pagezero]
root 9 0,0 0,0 0 16 ?? DL 7:21am 0:00,01 [bufdaemon]
root 10 0,0 0,0 0 16 ?? DL 7:21am 0:00,00 [audit]
root 12 0,0 0,0 0 240 ?? WL 7:21am 0:05,98 [intr]
root 13 0,0 0,0 0 48 ?? DL 7:21am 0:00,44 [geom]
root 14 0,0 0,0 0 16 ?? DL 7:21am 0:00,12 [yarrow]
root 15 0,0 0,0 0 320 ?? DL 7:21am 0:00,03 [usb]
root 16 0,0 0,0 0 16 ?? DL 7:21am 0:00,01 [vnlru]
root 17 0,0 0,0 0 16 ?? DL 7:21am 0:00,03 [syncer]
root 18 0,0 0,0 0 16 ?? DL 7:21am 0:00,10 [softdepflush]
root 19 0,0 0,0 0 16 ?? DL 7:21am 0:00,11 [g_mirror gm0]
root 887 0,0 0,2 10376 3496 ?? Is 7:21am 0:00,00 /sbin/devd
root 1033 0,0 0,1 12052 1692 ?? Is 7:21am 0:00,01 /usr/sbin/syslogd -s -s
root 1119 0,0 0,1 12024 1856 ?? Is 7:21am 0:00,00 ntpd: [priv] (ntpd)
_ntp 1120 0,0 0,1 12024 1904 ?? S 7:21am 0:00,03 ntpd: ntp engine (ntpd)
_ntp 1122 0,0 0,1 12024 1884 ?? I 7:21am 0:00,00 ntpd: dns engine (ntpd)
root 1131 0,0 0,2 46748 4712 ?? Is 7:21am 0:00,01 /usr/sbin/sshd
root 1145 0,0 0,1 14128 1828 ?? Ss 7:21am 0:00,01 /usr/sbin/cron -s
root 1192 0,0 0,3 67888 5524 ?? Ss 7:21am 0:00,08 sshd: root@pts/0 (sshd)
root 1197 0,0 0,3 67888 5564 ?? Ss 7:21am 0:00,10 sshd: root@pts/1 (sshd)
root 1277 0,0 0,1 22688 2164 ?? Is 7:40am 0:00,01 /usr/libexec/ftpd -D
root 1176 0,0 0,1 12052 1644 v0 Is+ 7:21am 0:00,00 /usr/libexec/getty Pc ttyv0
root 1177 0,0 0,1 12052 1644 v1 Is+ 7:21am 0:00,00 /usr/libexec/getty Pc ttyv1
root 1178 0,0 0,1 12052 1644 v2 Is+ 7:21am 0:00,00 /usr/libexec/getty Pc ttyv2
root 1179 0,0 0,1 12052 1644 v3 Is+ 7:21am 0:00,00 /usr/libexec/getty Pc ttyv3
root 1180 0,0 0,1 12052 1644 v4 Is+ 7:21am 0:00,00 /usr/libexec/getty Pc ttyv4
root 1181 0,0 0,1 12052 1644 v5 Is+ 7:21am 0:00,00 /usr/libexec/getty Pc ttyv5
root 1182 0,0 0,1 12052 1644 v6 Is+ 7:21am 0:00,00 /usr/libexec/getty Pc ttyv6
root 1183 0,0 0,1 12052 1644 v7 Is+ 7:21am 0:00,00 /usr/libexec/getty Pc ttyv7
root 1195 0,0 0,2 17464 3968 0 Ss 7:21am 0:00,05 -csh (csh)
root 1403 0,0 0,1 14188 1820 0 R+ 8:12am 0:00,00 ps auxwww
root 1200 0,0 0,2 17464 3380 1 Is 7:21am 0:00,01 -csh (csh)
root 1231 0,0 0,2 18680 3692 1 S+ 7:22am 0:02,07 systat -vms 1
10 users Load 0,41 0,44 0,20 6 Jul 18:47
Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER
Tot Share Tot Share Free in out in out
Act 23496 6036 600772 12252 1361840 count
All 71680 6632 1074428k 28264 pages
Proc: Interrupts
r p d s w Csw Trp Sys Int Sof Flt cow 121 total
28 199 2 121 4 67 zfod atkbd0 1
ozfod 4 re0 16
0,4%Sys 0,0%Intr 0,0%User 0,0%Nice 99,6%Idle %ozfod atapci0 20
| | | | | | | | | | | daefr 94 cpu0:timer
prcfr 23 cpu1:timer
1333 dtbuf 4 totfr
Namei Name-cache Dir-cache 111358 desvn react
Calls hits % hits % 1009 numvn pdwak
3 3 100 32 frevn pdpgs
intrn
Disks   ada0   ada1  pass0  pass1    302680 wire
KB/t   16,00  16,00   0,00   0,00     14716 act
tps        1      1      0      0    334260 inact
MB/s    0,02   0,02   0,00   0,00           cache
%busy    100    100      0      0   1361840 free
                                     217488 buf
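
The disk figures on that screen can be sanity-checked with a few lines of Python. This is only a back-of-the-envelope sketch, not part of the original report: systat rounds its values, and the per-transfer time assumes the device really was busy for the whole sampling interval. The variable names are chosen for illustration.

```python
# Values read off the "Disks" block of the systat screen above (same for ada0 and ada1).
kb_per_transfer = 16.00  # KB/t
transfers_per_s = 1      # tps
busy_fraction = 1.00     # %busy 100

# Throughput implied by transfer size and transfer rate.
throughput_mb_s = kb_per_transfer * transfers_per_s / 1024

# If the device is busy all the time yet completes only `tps` transfers
# per second, each transfer occupies roughly busy_fraction / tps seconds.
service_time_s = busy_fraction / transfers_per_s

print(f"throughput: {throughput_mb_s:.2f} MB/s")
print(f"approx. time per transfer: {service_time_s * 1000:.0f} ms")
```

This gives 0.02 MB/s, matching the MB/s row, and on the order of one second per 16 KB transfer, so the drives look slow to answer rather than saturated with data.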
While the network stays responsive, i.e. I can ping the machine and
_connect_ via ssh,
I can't actually log in (or, in already open shell, execute anything).
System requires a hardware reset. Nothing in the logs whatsoever (no
surprise here).
I have no KVM access to this system.
OS is generic 9.0 stable from two days ago.
I run 8.2-R on an identical machine without trouble.
I run 9.0 stable as of May 4th on a similar (other CPU and NIC)
machine without trouble.
On both machines, the drives are recognized as ``ad''.
(Why btw? ``man ada'' says ``device ada'', but there is no such option
in the GENERIC config.
Do I get ``ada'' with ``device ATA_CAM''? I'm going to try this next,
kick ata_cam from the kernel, see if drives are ``ad'' and system
doesn't crash.)
Right, should have remembered the release notes.
Still, the other machine doesn't show its drives as ``ada'' in spite of running 9.0-STABLE.
I'd appreciate suggestions on what I could do.
Thanks,
Michael
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"