Hi,
I have a Netra X1 server with 512MB of RAM and two ATA disks, model ST340016A.
The processor is a 500MHz UltraSPARC-IIe.
The Solaris version is: Solaris 10 10/09 s10s_u8wos_08a SPARC
I jumpstarted the server with a ZFS root, with the two disks in a mirror:
========================================================
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t2d0s0  ONLINE       0     0     0

errors: No known data errors
========================================================
Here is the dataset layout (zfs list output):
========================================================
NAME               USED  AVAIL  REFER  MOUNTPOINT
rpool             28.9G  7.80G    98K  /rpool
rpool/ROOT        4.06G  7.80G    21K  legacy
rpool/ROOT/newbe  4.06G  7.80G  4.06G  /
rpool/dump         512M  7.80G   512M  -
rpool/opt         23.8G  7.80G  4.11G  /opt
rpool/opt/export  19.7G  7.80G  19.7G  /opt/export
rpool/swap         512M  8.12G   187M  -
========================================================
When I run this command:
# digest -a md5 /opt/export/BIGFILE    (the file is 4.6GB)
it takes around 1 hour and 45 minutes to complete. I can understand that a
Netra X1 with ATA disks is slow, but not that slow. Also, there are some
inconsistencies between the "zpool iostat 1" and DTrace outputs.
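As a baseline (just a sketch of what could be tried, not output I have), timing a
raw sequential read of the same file with dd would show what the mirror can
deliver without the MD5 computation in the path; if dd is equally slow, the
problem is below digest:
========================================================
# raw sequential read of the same file; bs=1024k is an arbitrary
# large block size chosen for this sketch
time dd if=/opt/export/BIGFILE of=/dev/null bs=1024k
========================================================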
Here's the small snippet of zpool iostat 1:
========================================================
capacity operations bandwidth
pool used avail read write read write
---------- ----- ----- ----- ----- ----- -----
rpool 28.5G 8.70G 33 1 3.28M 5.78K
rpool 28.5G 8.70G 124 0 15.5M 0
rpool 28.5G 8.70G 150 0 18.8M 0
rpool 28.5G 8.70G 134 0 16.8M 0
rpool 28.5G 8.70G 135 0 16.7M 0
========================================================
Not so bad, I would say, for ATA (IDE) drives. But if I dig deeper, for
example with the iostat -x 1 command:
========================================================
extended device statistics
device r/s w/s kr/s kw/s wait actv svc_t %w %b
dad0 63.4 0.0 8115.1 0.0 32.7 2.0 547.5 100 100
dad1 63.4 0.0 8115.1 0.0 33.0 2.0 551.9 100 100
extended device statistics
device r/s w/s kr/s kw/s wait actv svc_t %w %b
dad0 52.0 0.0 6152.3 0.0 9.3 1.6 210.9 75 84
dad1 70.0 0.0 8714.7 0.0 16.0 2.0 256.5 93 99
extended device statistics
device r/s w/s kr/s kw/s wait actv svc_t %w %b
dad0 75.3 0.0 9634.7 0.0 23.1 1.9 332.7 96 97
dad1 71.3 0.0 9121.0 0.0 22.3 1.9 339.4 91 95
========================================================
zpool iostat says about 15MB/s worth of data is being read, but iostat shows only
about 8MB/s per disk. We can also clearly see how busy the drives are (%w and %b
at or near 100).
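To line the two views up, one thing that could be checked (a sketch, I have not
captured this output) is the per-vdev breakdown from zpool iostat, so each side
of the mirror can be compared directly with its dad device in iostat:
========================================================
# per-vdev statistics for rpool, sampled every second
zpool iostat -v rpool 1
========================================================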
Let's look at the I/O percentage per process with iotop from the DTraceToolkit:
========================================================
2009 Nov 13 09:06:03, load: 0.37, disk_r: 93697 KB, disk_w: 0 KB
UID PID PPID CMD DEVICE MAJ MIN D %I/O
0 2250 1859 digest dad1 136 8 R 3
0 2250 1859 digest dad0 136 0 R 4
0 0 0 sched dad0 136 0 R 88
0 0 0 sched dad1 136 8 R 89
========================================================
sched is taking up to 90% of IO. I tried to trace it:
========================================================
[x]: ps -efl |grep sche
1 T root 0 0 0 0 SY ? 0 21:31:44 ?
0:48 sched
[x]: truss -f -p 1
1: pollsys(0x0002B9DC, 1, 0xFFBFF588, 0x00000000) (sleeping...)
[x]:
========================================================
Only one line of output from truss. Weird. It is as if sched is doing nothing,
yet it accounts for up to 90% of the I/O.
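If the physical reads are being issued by ZFS from kernel context (prefetch, the
ZIO pipeline), they would be charged to sched rather than to digest. A hedged way
to compare the two levels is a pair of DTrace one-liners like these (sketches,
not output I have captured):
========================================================
# bytes read at the system-call level, per process (logical reads)
dtrace -n 'syscall::read:return { @[execname] = sum(arg0); }'

# bytes issued to the disks, per process context (physical reads)
dtrace -n 'io:::start { @[execname] = sum(args[0]->b_bcount); }'
========================================================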
prstat is showing this (I'm only going to show the first process at the top):
=======================================================
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
2250 root 2464K 1872K run 0 0 0:01:57 15% digest/1
=======================================================
The digest command is using only 15% of the CPU. Again, we can see that the I/O is not very fast.
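As another check (a sketch only, I have not included its output), microstate
accounting for that PID (2250, from the prstat output above) would break down how
much of its time digest spends sleeping versus running:
========================================================
# per-LWP microstate accounting for the digest process, 1s interval
prstat -mL -p 2250 1
========================================================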
pfilestat from the DTraceToolkit reports this:
=======================================================
STATE FDNUM Time Filename
running 0 6%
waitcpu 0 7%
read 4 13% /opt/export/BIGFILE
sleep-r 0 69%
STATE FDNUM KB/s Filename
read 4 614 /opt/export/BIGFILE
=======================================================
So only around 615KB is really being read each second. That is very slow! This is
not normal, I think, even if those are ATA drives.
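For completeness, a couple of other things that could be looked at (sketches
only, no output included): the dataset properties that affect reads, and the ARC
statistics, since the machine only has 512MB of RAM:
========================================================
# properties on the dataset that holds BIGFILE
zfs get recordsize,atime,compression,checksum rpool/opt/export
# ARC statistics; the ARC cannot be very big with 512MB of RAM
kstat -m zfs -n arcstats
========================================================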
So my questions are the following:
1.- Why is zpool iostat reporting 15MB/s of data read when in reality only
615KB/s is being read by the application?
2.- Why is sched responsible for so much of the I/O?
3.- What can I do to improve I/O performance? I find it hard to believe that
this is the best performance the current hardware can provide...
Thank you!