For those following the saga: with the prefetch problem fixed and data coming off the L2ARC instead of the disks (the tunables involved are sketched in the P.S. at the bottom), the system switched from I/O bound to CPU bound. I opened up the throttles with some explicit PARALLEL hints in the Oracle commands, and we were finally able to max out the single SSD:

      r/s    w/s     kr/s  kw/s wait actv wsvc_t asvc_t  %w  %b device
    826.0    3.2 104361.8  35.2  0.0  9.9    0.0   12.0   3 100 c0t0d4

So, with the SSD cache maxed out, it was delivering 100+ MB/s and about 830 IOPS, with 3.4 TB behind it in a 4-disk SATA RAIDZ1. I still have to remap to 8k blocks to get more efficiency, but for raw numbers it is exactly what I was looking for.
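For anyone playing along at home, the remap is just a recordsize change to match Oracle's 8 KB block size. Roughly like this (the dataset name is made up, and since recordsize only applies to newly written blocks, the datafiles have to be copied or restored afterward to pick it up):

  # match the ZFS recordsize to the Oracle db_block_size (8 KB);
  # "dpool/oradata" is a placeholder for the actual dataset
  zfs set recordsize=8k dpool/oradata
  zfs get recordsize dpool/oradata
  # existing datafiles keep their old block layout until rewritten,
  # so plan on a copy or an RMAN restore after the change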
Now, to add the second SSD ZIL/L2ARC for a mirror. I may even splurge for one more to get a three-way mirror. That will completely saturate the SCSI channel. Now I need a bigger server....
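Something like this is what I have in mind, if I have the device names right (c0t0d5 is a stand-in for whatever the second SSD shows up as, sliced the same way as the first):

  # mirror the log device by attaching the new SSD's slice 0 to the existing ZIL slice
  zpool attach dpool c0t0d4s0 c0t0d5s0
  # cache devices can't be mirrored, so the second cache slice just gets added
  # and L2ARC reads are spread across both SSDs
  zpool add dpool cache c0t0d5s1
  zpool status dpool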
Did I mention it was <$1000 for the whole setup? Bah-ha-ha-ha.....

Tracey

On Sat, Feb 13, 2010 at 11:51 PM, Tracey Bernath <tbern...@ix.netcom.com> wrote:

> OK, that was the magic incantation I was looking for:
>  - changing the noprefetch option opened the floodgates to the L2ARC
>  - changing the max queue depth relieved the wait time on the drives,
> although I may undo this again in the benchmarking since these drives all
> have NCQ
>
> I went from all four disks of the array at 100%, doing about 170 read
> IOPS/25MB/s, to all four disks of the array at 0%, once hitting nearly
> 500 IOPS/65MB/s off the cache drive (at only 50% load).
> This bodes well for adding a second mirrored cache drive to push for
> 1K IOPS.
>
> Now I am ready to insert the mirror for the ZIL and the CACHE, and we will
> be ready for some production benchmarking.
>
> BEFORE:
> device    r/s   w/s   kr/s   kw/s wait actv  svc_t  %w  %b  us sy wt id
> sd0     170.0   0.4 7684.7    0.0  0.0 35.0  205.3   0 100  11  8  0 82
> sd1     168.4   0.4 7680.2    0.0  0.0 34.6  205.1   0 100
> sd2     172.0   0.4 7761.7    0.0  0.0 35.0  202.9   0 100
> sd4     170.0   0.4 7727.1    0.0  0.0 35.0  205.3   0 100
> sd5       1.6   2.6  182.4  104.8  0.0  0.5  117.8   0  31
>
> AFTER:
>                     extended device statistics
>     r/s    w/s     kr/s  kw/s wait actv wsvc_t asvc_t  %w  %b device
>     0.0    0.0      0.0   0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
>     0.0    0.0      0.0   0.0  0.0  0.0    0.0    0.0   0   0 c0t0d1
>     0.0    0.0      0.0   0.0  0.0  0.0    0.0    0.0   0   0 c0t0d2
>     0.0    0.0      0.0   0.0  0.0  0.0    0.0    0.0   0   0 c0t0d3
>   285.2    0.8  36236.2  14.4  0.0  0.5    0.0    1.8   1  37 c0t0d4
>
> And, keep in mind this was on less than $1000 of hardware.
>
> Thanks for the pointers, guys.
> Tracey
>
>
> On Sat, Feb 13, 2010 at 9:22 AM, Richard Elling <richard.ell...@gmail.com> wrote:
>
>> comment below...
>>
>> On Feb 12, 2010, at 2:25 PM, TMB wrote:
>> > I have a similar question: I put together a cheapo RAID with four 1TB WD
>> Black (7200 RPM) SATAs in a 3TB RAIDZ1, and I added a 64GB OCZ Vertex SSD,
>> with slice 0 (5GB) for ZIL and the rest of the SSD for cache:
>> > # zpool status dpool
>> >   pool: dpool
>> >  state: ONLINE
>> >  scrub: none requested
>> > config:
>> >
>> >         NAME          STATE     READ WRITE CKSUM
>> >         dpool         ONLINE       0     0     0
>> >           raidz1      ONLINE       0     0     0
>> >             c0t0d0    ONLINE       0     0     0
>> >             c0t0d1    ONLINE       0     0     0
>> >             c0t0d2    ONLINE       0     0     0
>> >             c0t0d3    ONLINE       0     0     0
>> >         logs
>> >           c0t0d4s0    ONLINE       0     0     0
>> >         cache
>> >           c0t0d4s1    ONLINE       0     0     0
>> >         spares
>> >           c0t0d6      AVAIL
>> >           c0t0d7      AVAIL
>> >
>> >                capacity     operations    bandwidth
>> > pool         used  avail   read  write   read  write
>> > ----------  -----  -----  -----  -----  -----  -----
>> > dpool       72.1G  3.55T    237     12  29.7M   597K
>> >   raidz1    72.1G  3.55T    237      9  29.7M   469K
>> >     c0t0d0      -      -    166      3  7.39M   157K
>> >     c0t0d1      -      -    166      3  7.44M   157K
>> >     c0t0d2      -      -    166      3  7.39M   157K
>> >     c0t0d3      -      -    167      3  7.45M   157K
>> >   c0t0d4s0     20K  4.97G      0      3      0   127K
>> > cache           -      -      -      -      -      -
>> >   c0t0d4s1  17.6G  36.4G      3      1   249K   119K
>> > ----------  -----  -----  -----  -----  -----  -----
>> > I just don't seem to be getting the bang for the buck I should be. This
>> was taken while rebuilding an Oracle index, all files stored in this pool.
>> The WD disks are at 100%, and nothing is coming from the cache. The cache
>> does have the entire DB cached (17.6G used), but hardly reads anything from
>> it. I am also not seeing the spike of data flowing into the ZIL, although
>> iostat shows just write traffic hitting the SSD:
>> >
>> >                  extended device statistics                  cpu
>> > device    r/s   w/s   kr/s   kw/s wait actv  svc_t  %w  %b  us sy wt id
>> > sd0     170.0   0.4 7684.7    0.0  0.0 35.0  205.3   0 100  11  8  0 82
>> > sd1     168.4   0.4 7680.2    0.0  0.0 34.6  205.1   0 100
>> > sd2     172.0   0.4 7761.7    0.0  0.0 35.0  202.9   0 100
>> > sd3       0.0   0.0    0.0    0.0  0.0  0.0    0.0   0   0
>> > sd4     170.0   0.4 7727.1    0.0  0.0 35.0  205.3   0 100
>> > sd5       1.6   2.6  182.4  104.8  0.0  0.5  117.8   0  31
>>
>> iostat has an "n" option, which is very useful for looking at device names
>> :-)
>>
>> The SSD here is performing well. The rest are clobbered. A 205-millisecond
>> response time will be agonizingly slow.
>>
>> By default, for this version of ZFS, up to 35 I/Os will be queued to the
>> disk, which is why you see 35.0 in the "actv" column. The combination
>> of actv=35 and svc_t>200 indicates that this is the place to start working.
>> Begin by reducing zfs_vdev_max_pending from 35 to something like 1 to 4.
>> This will reduce the concurrent load on the disks, thus reducing svc_t.
>>
>> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
>>
>>  -- richard
>>
>> > Since this SSD is in a RAID array and just presents as a regular disk
>> LUN, is there a special incantation required to turn on the turbo mode?
>> >
>> > Doesn't it seem that all this traffic should be maxing out the SSD? Reads
>> from the cache, and writes to the ZIL? I have a second identical SSD I
>> wanted to add as a mirror, but it seems pointless if there's no zip to be
>> had....
>> >
>> > help?
>> >
>> > Thanks,
>> > Tracey
>> > --
>> > This message posted from opensolaris.org
>> > _______________________________________________
>> > zfs-discuss mailing list
>> > zfs-discuss@opensolaris.org
>> > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>>
>>
>
> --
> Tracey Bernath
> 913-488-6284
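P.S. For anyone wanting to reproduce the tuning from the quoted thread, the incantations look roughly like this (the queue depth of 4 is just one value from the 1-to-4 range Richard suggested, and the prefetch setting shown is l2arc_noprefetch, which is an assumption; the Evil Tuning Guide link above covers the caveats):

  # drop the per-vdev queue depth from the default of 35 on the live system
  echo zfs_vdev_max_pending/W0t4 | mdb -kw

  # make it persistent across reboots by adding to /etc/system:
  #   set zfs:zfs_vdev_max_pending = 4

  # the prefetch knob (assuming it was l2arc_noprefetch) is flipped the same
  # way; 0 lets prefetched/streaming reads be served from the L2ARC
  echo l2arc_noprefetch/W0t0 | mdb -kw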