-----Original Message-----
From: Sašo Kiselkov [mailto:skiselkov...@gmail.com] 
Sent: Tuesday, September 11, 2012 9:52 AM
To: Dan Swartzendruber
Cc: 'James H'; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Interesting question about L2ARC

On 09/11/2012 03:41 PM, Dan Swartzendruber wrote:
> LOL, I was actually the unclear one, not you -- I understood what you
> were saying; sorry for muddling it.  I have 4 disks in raid10, so my
> max random read throughput is theoretically somewhat faster than the
> L2ARC device, but I never really do reads that intensive.

But here's the kicker: prefetch is never random, it's always linear, so you
need to measure prefetch throughput against the near-linear throughput of
your disks. Your average 7k2 (7200 rpm) disk is capable of ~100MB/s in
linear reads, so in a pair-of-mirrors scenario (raid10), where reads can be
serviced by all four spindles, you effectively get in excess of 400MB/s of
prefetch throughput.
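
If you want to sanity-check that figure, a quick dd read of a large file
that isn't already sitting in ARC should land reasonably close to the
pool's linear read throughput (the path below is only an example, point it
at any big file on the pool):

$ dd if=/tank/some-large-file of=/dev/null bs=1M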

** True, and badly worded on my part.  In theory, the 4 nearline SAS drives
could deliver 600MB/sec, but my path to the guests is maybe 3GB (this is all
running virtualized, so I can exceed gig-e speed).  The bottom line is that
no amount of read effort by the guests (via the hypervisor) is going to come
anywhere near the pool's capabilities.

> My point was: if a guest does read
> a bunch of data sequentially, that will trigger the prefetch L2ARC 
> code path, correct?

No. When a client does a linear read, the initial buffer access looks
random, so it makes sense to serve it from the l2arc - thus it gets cached
in the l2arc. ZFS then detects that the client is likely to want more
buffers, so it starts prefetching the following blocks in the background.
When the client comes back for them, it receives those blocks straight from
ARC. The result is that the client is never latency-constrained on those
blocks, so there's no need to cache the subsequently prefetched blocks in
l2arc.
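
Incidentally, this "don't cache prefetched buffers in l2arc" behaviour is
governed by the l2arc_noprefetch tunable (default 1). If you ever want to
poke at it on an illumos-based box, something along these lines should
work -- double-check the symbol name on your particular build:

$ echo l2arc_noprefetch/D | mdb -k

Writing 0 to it (as root, "echo l2arc_noprefetch/W0 | mdb -kw") makes l2arc
cache prefetched buffers as well, which is generally not recommended.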

*** Sorry, that's what I meant by 'the prefetch l2arc code path', i.e. the
heuristics you referred to.  It seems to me that if the client never wants
the prefetched block, caching it would have been a waste; and if he does
want it, at worst he'll miss once and then it will be cached, since at that
point it will have been a demand read?

> If so, I *do* want that cache in L2ARC, so that a return visit from 
> that guest will hit as much as possible in the cache.

It will be in the normal ARC cache; however, l2arc is primarily meant to
accelerate the initial block hit (as noted above), not the subsequently
prefetched ones (which we have time to refetch from the main pool). This
covers most generic filesystem use cases as well as random-read-heavy
workloads (such as databases, which rarely, if ever, do linear reads).

> One other
> thing (I don't think I mentioned this): my entire ESXi dataset is only 
> like 160GB (thin provisioning in action), so it seems to me, I should 
> be able to fit the entire thing in L2ARC?

Please try to post the output of this after you let it run on your dataset
for a few minutes:

$ arcstat.pl -f \
  arcsz,read,dread,pread,hit%,miss%,l2size,l2read,l2hit%,l2miss% 60

It should give us a good idea of the kind of workload we're dealing with and
why your L2 hits are so low.
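
You can also get a quick read on how much data has actually made it onto
the cache device with the raw kstats (illumos names, so adjust if your
platform differs):

$ kstat -p zfs:0:arcstats:l2_size zfs:0:arcstats:l2_hdr_size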

*** Thanks a lot for clarifying how this works.  Since I'm quite happy
keeping the SSD in my workstation, I will need to purchase another SSD for
the cache :)  I'm wondering whether it makes more sense to buy two SSDs of
half the size (e.g. 128GB each), since the total price is about the same?
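
I assume that if I do go with two, it's just a matter of adding both as
cache vdevs and ZFS will spread the l2arc across them -- something like the
following, with the pool and device names made up:

$ zpool add tank cache c2t1d0 c2t2d0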

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
