Bob Friesenhahn wrote:
> On Wed, 15 Jul 2009, Ross wrote:
>
>> Yes, that makes sense. For the first run, the pool has only just been mounted, so the ARC will be empty, with plenty of space for prefetching.
>
> I don't think that this hypothesis is quite correct. If you use 'zpool iostat' to monitor the read rate while reading a large collection of files with total size far larger than the ARC, you will see that there is no fall-off in read performance once the ARC becomes full.

Unfortunately, "zpool iostat" doesn't really tell you anything about
performance.  All it shows is bandwidth. Latency is what you need
to understand performance, so use iostat.

> The performance problem occurs when there is still metadata cached for a file but the file data has since been expunged from the cache. The implication here is that zfs speculates that the file data will be in the cache if the metadata is cached, and this results in a cache miss as well as disabling the file read-ahead algorithm. You would not want to do read-ahead on data that you already have in a cache.

I realized this morning that what I posted last night might be
misleading to the casual reader. Clearly, the first time through,
the data is prefetched and misses the cache.  On the second
pass, it should also miss the cache if it were a simple cache.
But the ARC tries to be more clever and has ghosts -- where
the data is no longer in cache, but the metadata is.  I suspect
the prefetching is not being used for the ghosts.  The arcstats
will show this. As benr blogs,
   "These Ghosts lists are magic. If you get a lot of hits to the
   ghost lists, it means that ARC is WAY too small and that
   you desperately need either more RAM or an L2 ARC
   device (likely, SSD). Please note, if you are considering
   investing in L2 ARC, check this FIRST."
http://www.cuddletech.com/blog/pivot/entry.php?id=979
This is the explicit case presented by your test. This also
explains why the entry from the system with an L2ARC
did not have the performance "problem."
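
To make the ghost idea concrete, here is a toy model in Python. It is
only a sketch: a single LRU list plus one ghost list, not the real ARC
(which keeps separate MRU and MFU lists and resizes them adaptively).
The class name, the capacities and the block ids are all made up for
illustration.

```python
from collections import OrderedDict


class ToyGhostCache:
    """Toy LRU cache with a ghost list. NOT the real ZFS ARC."""

    def __init__(self, capacity, ghost_capacity):
        self.capacity = capacity
        self.ghost_capacity = ghost_capacity
        self.data = OrderedDict()    # cached blocks, oldest first
        self.ghosts = OrderedDict()  # ids of evicted blocks, no data
        self.hits = self.ghost_hits = self.misses = 0

    def read(self, block):
        if block in self.data:
            self.data.move_to_end(block)
            self.hits += 1
            return "hit"
        if block in self.ghosts:
            # The id is remembered but the data is gone: still a disk read.
            del self.ghosts[block]
            self.ghost_hits += 1
            outcome = "ghost_hit"
        else:
            self.misses += 1
            outcome = "miss"
        self.data[block] = True
        if len(self.data) > self.capacity:
            evicted, _ = self.data.popitem(last=False)
            self.ghosts[evicted] = True
            if len(self.ghosts) > self.ghost_capacity:
                self.ghosts.popitem(last=False)
        return outcome


def run_passes(cache, files, order):
    """Read whole files in the given order; return per-pass counter deltas."""
    results = []
    for name in order:
        before = (cache.hits, cache.ghost_hits, cache.misses)
        for block in files[name]:
            cache.read(block)
        after = (cache.hits, cache.ghost_hits, cache.misses)
        results.append((name,) + tuple(a - b for a, b in zip(after, before)))
    return results


# Two hypothetical files of 6 blocks each, cache room for only 4 blocks.
files = {
    "A": [("A", i) for i in range(6)],
    "B": [("B", i) for i in range(6)],
}
cache = ToyGhostCache(capacity=4, ghost_capacity=8)
for name, hits, ghost_hits, misses in run_passes(cache, files, ["A", "B", "A"]):
    print(name, "hits:", hits, "ghost hits:", ghost_hits, "misses:", misses)
```

In this toy the first pass over A and the pass over B are plain misses,
while the second pass over A lands entirely on the ghost list. A simple
cache would count all three passes as ordinary misses.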

Also, another test would be to have two large files.  Read from
one, then the other, then from the first again.  Capture arcstats
from between the reads and see if the haunting stops ;-)
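
Something along these lines could drive that test. It is a rough
sketch, assuming an (Open)Solaris box where "kstat -p zfs:0:arcstats"
prints "module:instance:name:statistic<TAB>value" lines and where the
counter names listed in KEYS exist on your build. The two file paths
come from the command line and are whatever large files you choose.

```python
import subprocess
import sys

# Counter names assumed to be present in arcstats on your build.
KEYS = ("hits", "misses", "mru_ghost_hits", "mfu_ghost_hits",
        "prefetch_data_hits", "prefetch_data_misses")


def parse_kstat(text):
    """Turn 'zfs:0:arcstats:name<TAB>value' lines into {name: int}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, value = parts
        if value.isdigit():  # skips non-integer entries such as 'class'
            stats[name.rsplit(":", 1)[-1]] = int(value)
    return stats


def snapshot():
    """Capture the current arcstats counters via kstat."""
    out = subprocess.run(["kstat", "-p", "zfs:0:arcstats"],
                         capture_output=True, text=True, check=True).stdout
    return parse_kstat(out)


def delta(before, after, keys=KEYS):
    """Per-counter change between two snapshots."""
    return {k: after.get(k, 0) - before.get(k, 0) for k in keys}


def read_file(path, chunk=1 << 20):
    """Read a file sequentially and discard the data."""
    with open(path, "rb") as f:
        while f.read(chunk):
            pass


if __name__ == "__main__" and len(sys.argv) == 3:
    first, second = sys.argv[1], sys.argv[2]
    for path in (first, second, first):  # read A, then B, then A again
        before = snapshot()
        read_file(path)
        print(path, delta(before, snapshot()))
```

If the ghost theory holds, the third line of output (the second read
of the first file) should show the ghost hit counters jumping while
the prefetch counters stay comparatively flat.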
-- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
