Yes, that makes sense. For the first run, the pool has only just been mounted, so the ARC will be empty, with plenty of space for prefetching.
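(That's easy enough to check, by the way: the ARC reports its current and target size through kstats, so running something like the following before and after the first pass should show it filling up. I'm assuming the usual arcstats names from recent builds here:

    kstat -p zfs:0:arcstats:size
    kstat -p zfs:0:arcstats:c

where "size" is the current ARC size in bytes and "c" is its target size.)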
On the second run, however, the ARC is already full of the data we just read, and I'm guessing that the prefetch code is less aggressive when there is already data in the ARC. For normal use that may be what you want - it's trying to keep things in the ARC in case they're needed again. But it does mean that ZFS prefetch is always going to suffer some performance degradation on a live system, although early signs are that this might not be so severe in snv_117.

I wonder if there is any tuning that can be done to counteract this? Is there any way to tell ZFS to bias towards prefetching rather than preserving data in the ARC? That could give better performance for scripts like this, or for random access workloads.

Also, could generic algorithm improvements help here? Why should ZFS keep data in the ARC if it has only ever been read once? This script reads 8MB files, and the ARC should be using at least 1GB of RAM - that's a minimum of 128 files in memory, none of which will have been read more than once. If we're reading a new file now, prefetching should be able to displace any old object in the ARC that hasn't been re-read - in this case all 127 previous files should be candidates for replacement. I also wonder how that would interact with an L2ARC; if that were fast enough I'd certainly want to allocate more of the ARC to prefetching.

Finally, would it make sense for the ARC to always reserve a certain percentage for prefetching, possibly with that percentage being tunable, letting us balance prefetching against caching according to the expected usage?

Ross
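PS: for anyone who wants to watch this happening between runs, here's roughly what I've been looking at. This is just a sketch, assuming the arcstats and zfetchstats kstat names from recent OpenSolaris builds:

    # demand vs prefetch hit/miss counters in the ARC
    kstat -p zfs:0:arcstats:demand_data_hits
    kstat -p zfs:0:arcstats:prefetch_data_hits
    kstat -p zfs:0:arcstats:prefetch_data_misses

    # file-level prefetcher (zfetch) statistics
    kstat -m zfs -n zfetchstats

The only prefetch knob I know of is the all-or-nothing one, e.g. in /etc/system:

    set zfs:zfs_prefetch_disable = 1

which is the opposite of what I'm after - I haven't found anything that reserves a share of the ARC for prefetched data, hence the question above.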