Thanks Brendan, I was going to move it over to an 8 KB block size once I got through this index rebuild. My thinking was that a disproportionate block size would show up as excessive I/O throughput, not a lack of throughput.
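For reference, the switch itself is just a dataset property change; a rough sketch of what I have in mind, assuming the datafiles sit in a dedicated dataset (I'll call it dpool/oracle here, the name is hypothetical):

    # zfs get recordsize dpool/oracle      # should show the 128K default
    # zfs set recordsize=8k dpool/oracle   # match the 8 KB database block size

Since recordsize only applies to newly written blocks, the existing datafiles keep their 128K records until they are rewritten or copied, so the change won't take effect until after this rebuild anyway.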
The question about the cache comes from the fact that the 18 GB or so that it says is in the cache IS the database. That was why I was thinking the index rebuild should be CPU-constrained, and I should see a spike in reads from the cache. If the entire file is cached, why would it go to the disks at all for the reads? The disks are delivering about 30 MB/s of reads, but this SSD is rated for 70 MB/s sustained, so there should be a chance to pick up a 100% gain. I've seen lots of mention of kernel settings, but those only seem to apply to cache flushes on sync writes. Any idea on where to look next? I've spent about a week tinkering with it.

I'm trying to get a major customer to switch over to ZFS and an open storage solution, but I'm afraid that if I can't get it to work at the small scale, I can't convince them at the large scale.

Thanks,
Tracey

On Fri, Feb 12, 2010 at 4:43 PM, Brendan Gregg - Sun Microsystems <bren...@sun.com> wrote:

> On Fri, Feb 12, 2010 at 02:25:51PM -0800, TMB wrote:
> > I have a similar question: I put together a cheapo RAID with four 1TB
> > WD Black (7200) SATAs in a 3TB RAIDZ1, and I added a 64GB OCZ Vertex
> > SSD, with slice 0 (5GB) for the ZIL and the rest of the SSD for cache:
> >
> > # zpool status dpool
> >   pool: dpool
> >  state: ONLINE
> >  scrub: none requested
> > config:
> >
> >         NAME          STATE     READ WRITE CKSUM
> >         dpool         ONLINE       0     0     0
> >           raidz1      ONLINE       0     0     0
> >             c0t0d0    ONLINE       0     0     0
> >             c0t0d1    ONLINE       0     0     0
> >             c0t0d2    ONLINE       0     0     0
> >             c0t0d3    ONLINE       0     0     0
> >         logs
> >           c0t0d4s0    ONLINE       0     0     0
> >         cache
> >           c0t0d4s1    ONLINE       0     0     0
> >         spares
> >           c0t0d6      AVAIL
> >           c0t0d7      AVAIL
> >
> >                capacity     operations    bandwidth
> > pool         used  avail   read  write   read  write
> > ----------  -----  -----  -----  -----  -----  -----
> > dpool       72.1G  3.55T    237     12  29.7M   597K
> >   raidz1    72.1G  3.55T    237      9  29.7M   469K
> >     c0t0d0      -      -    166      3  7.39M   157K
> >     c0t0d1      -      -    166      3  7.44M   157K
> >     c0t0d2      -      -    166      3  7.39M   157K
> >     c0t0d3      -      -    167      3  7.45M   157K
> >   c0t0d4s0     20K  4.97G      0      3      0   127K
> > cache           -      -      -      -      -      -
> >   c0t0d4s1  17.6G  36.4G      3      1   249K   119K
> > ----------  -----  -----  -----  -----  -----  -----
> >
> > I just don't seem to be getting the bang for the buck I should be. This
> > was taken while rebuilding an Oracle index, with all files stored in
> > this pool. The WD disks are at 100%, and nothing is coming from the
> > cache. The cache does have the entire DB cached (17.6G used), but
> > hardly anything is read from it. I am also not seeing the spike of data
> > flowing into the ZIL, although iostat shows there is only write traffic
> > hitting the SSD:
> >
> >                   extended device statistics                  cpu
> > device    r/s    w/s   kr/s   kw/s  wait  actv  svc_t  %w  %b  us sy wt id
> > sd0     170.0    0.4 7684.7    0.0   0.0  35.0  205.3   0 100  11  8  0 82
> > sd1     168.4    0.4 7680.2    0.0   0.0  34.6  205.1   0 100
> > sd2     172.0    0.4 7761.7    0.0   0.0  35.0  202.9   0 100
> > sd3       0.0    0.0    0.0    0.0   0.0   0.0    0.0   0   0
> > sd4     170.0    0.4 7727.1    0.0   0.0  35.0  205.3   0 100
> > sd5       1.6    2.6  182.4  104.8   0.0   0.5  117.8   0  31
> >
> > Since this SSD is in a RAID array, and just presents as a regular disk
> > LUN, is there a special incantation required to turn on the Turbo mode?
> >
> > Doesn't it seem that all this traffic should be maxing out the SSD?
> > Reads from the cache, and writes to the ZIL? I have a second identical
> > SSD I wanted to add as a mirror, but it seems pointless if there's no
> > zip to be had....
>
> The most likely reason is that this workload has been identified as
> streaming by ZFS, which is prefetching from disk instead of the L2ARC
> (l2arc_noprefetch=1).
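> That tunable can be checked, and flipped for testing, on the live system.
> A sketch, assuming an OpenSolaris kernel where the variable is reachable
> via mdb(1); set it back once you've measured:
>
>   # echo "l2arc_noprefetch/D" | mdb -k     # show current value (1 = prefetch reads bypass the L2ARC)
>   # echo "l2arc_noprefetch/W 0" | mdb -kw  # let prefetched streams be served from the L2ARC
>
> The ARC kstats will show whether L2ARC reads pick up afterwards:
>
>   # kstat -p zfs:0:arcstats | grep l2_     # watch l2_hits vs l2_misses, l2_size, ...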
> It also looks like you've used a 128 Kbyte ZFS record size. Is Oracle
> doing 128 Kbyte random I/O? We usually tune that down before creating the
> database, which will use the L2ARC device more efficiently.
>
> Brendan
>
> --
> Brendan Gregg, Fishworks      http://blogs.sun.com/brendan

--
Tracey Bernath
913-488-6284
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss