-----Original Message-----
From: Sašo Kiselkov [mailto:skiselkov...@gmail.com]
Sent: Tuesday, September 11, 2012 9:52 AM
To: Dan Swartzendruber
Cc: 'James H'; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Interesting question about L2ARC
On 09/11/2012 03:41 PM, Dan Swartzendruber wrote:
> LOL, I actually was unclear, not you. I understood what you were
> saying, sorry for being unclear. I have 4 disks in raid10, so my max
> random read throughput is theoretically somewhat faster than the L2ARC
> device, but I never really do that intensive of reads.

But here's the kicker: prefetch is never random, it's always linear, so you
need to measure prefetch throughput against the near-linear throughput of
your disks. Your average 7k2 disk is capable of ~100MB/s in linear reads, so
in a pair-of-mirrors scenario (raid10) you effectively get in excess of
400MB/s in prefetch throughput.

** True, and badly worded on my part. In theory, the 4 nearline SAS drives
could deliver 600MB/sec, but my path to the guests is maybe 3GB (this is all
running virtualized, so I can exceed gig/e speed). The bottom line is that no
amount of read effort by the guests (via the hypervisor) is going to come
anywhere near the pool's capabilities.

> My point was: if a guest does read
> a bunch of data sequentially, that will trigger the prefetch L2ARC
> code path, correct?

No. When a client does a linear read, the initial buffer is a random (demand)
read, so it makes sense to serve it from the L2ARC - thus it is cached in the
L2ARC. ZFS then detects that the client is likely to want more buffers, so it
starts prefetching the following blocks in the background. When the client
returns, it receives those blocks from the ARC. The result is that the client
isn't latency-constrained while processing them, so there's no need to cache
the subsequently prefetched blocks in the L2ARC.

*** Sorry, that's what I meant by 'the prefetch L2ARC code path', i.e. the
heuristics you referred to. It seems to me that if the client never wants the
prefetched block, it was a waste to cache it; if it does, at worst it will
miss once, and then the block will be cached, since it will have been a
demand read?

> If so, I *do* want that cache in L2ARC, so that a return visit from
> that guest will hit as much as possible in the cache.

It will be in the normal ARC cache; however, the L2ARC is meant primarily to
accelerate the initial block hit (as noted above), not the subsequently
prefetched ones (which we have time to refetch from the main pool). This
covers most generic filesystem use cases as well as random-read-heavy
workloads (such as databases, which rarely, if ever, do linear reads).

> One other
> thing (I don't think I mentioned this): my entire ESXi dataset is only
> like 160GB (thin provisioning in action), so it seems to me, I should
> be able to fit the entire thing in L2ARC?

Please post the output of this after letting it run on your dataset for a few
minutes:

$ arcstat.pl -f \
  arcsz,read,dread,pread,hit%,miss%,l2size,l2read,l2hit%,l2miss% 60

It should give us a good idea of the kind of workload we're dealing with and
why your L2 hits are so low.

*** Thanks a lot for clarifying how this works. Since I'm quite happy having
an SSD in my workstation, I will need to purchase another SSD :) I'm
wondering if it makes more sense to buy two SSDs of half the size (e.g.
128GB), since the total price is about the same?
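On the point above about prefetched blocks bypassing the L2ARC: on
illumos/Solaris-derived ZFS this behaviour is governed by the
l2arc_noprefetch kernel tunable (1 by default, meaning prefetched buffers are
not written to the cache device). A minimal sketch of how to inspect it and,
for testing only, flip it, assuming mdb and /etc/system are available on your
build:

# Print the current value (1 = prefetched buffers bypass the L2ARC):
echo 'l2arc_noprefetch/D' | mdb -k

# Temporarily let prefetched buffers into the L2ARC (reverts on reboot):
echo 'l2arc_noprefetch/W 0' | mdb -kw

# To make the change persistent, add this line to /etc/system and reboot:
#   set zfs:l2arc_noprefetch = 0

In most cases the default is the right choice, for exactly the reason given
above: prefetched blocks can be refetched from the pool at full linear speed,
so caching them mostly just churns the SSD.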
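To complement the arcstat.pl output, the raw counters behind it can be read
straight from the arcstats kstat; the statistic names below are the standard
illumos ones, so verify them against "kstat -n arcstats" on your system:

# Demand vs. prefetch hits, plus L2ARC size and hit/miss counters:
kstat -p zfs:0:arcstats:demand_data_hits \
         zfs:0:arcstats:prefetch_data_hits \
         zfs:0:arcstats:l2_size \
         zfs:0:arcstats:l2_hits \
         zfs:0:arcstats:l2_misses

A demand hit rate that is already high in the ARC, combined with a small
l2_hits count, would be one unremarkable explanation for low L2ARC hit
percentages.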
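On the closing question about one large SSD versus two smaller ones: cache
devices in a pool are used independently and their capacities simply add up,
with no redundancy requirement, so two 128GB devices give roughly the same
usable L2ARC as one 256GB device. A sketch of the relevant checks and the add
command, with hypothetical pool, dataset and device names (tank, tank/esxi,
c1t1d0, c1t2d0):

# How big is the working set, really?
zfs list -o name,used,refer tank/esxi

# Current cache-device capacity and usage:
zpool iostat -v tank

# Add two SSDs as independent cache devices:
zpool add tank cache c1t1d0 c1t2d0

Since the L2ARC (as of this era of the code) is not persistent across reboots
and carries no redundancy, splitting it across two devices costs little
beyond the extra drive bay, and it spreads cache reads across both SSDs.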