The problem I see with "sequential access jumping all over the place"
is that it increases the utilization of the disks -
over the years disks have become ever faster for sequential access,
whereas random access (which requires moving the actuator)
has not improved at the same pace - this is what
ZFS exploits when writing.
With its fancy detection of sequential access patterns and improved
readahead, ZFS should be able to
deal with the latency aspect of randomized read accesses - but at the
expense of higher disk utilization.
If many processes access the same disks, this may result
in the disks "running out of IOPS"
earlier than in an environment with sequential accesses to
contiguous data.
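As a rough back-of-envelope illustration (the figures below are
assumptions for a single 7200 RPM drive, not measurements of any
particular device):

    # Sequential vs. random throughput for one hypothetical 7200 RPM disk.
    seq_mb_s       = 60.0   # assumed sustained sequential rate
    seek_ms        = 8.5    # assumed average seek time
    rot_latency_ms = 4.2    # about half a revolution at 7200 RPM
    io_size_kb     = 128    # one reasonably sized read per seek

    random_iops = 1000.0 / (seek_ms + rot_latency_ms)
    random_mb_s = random_iops * io_size_kb / 1024.0
    print("random: %.0f IOPS, %.1f MB/s" % (random_iops, random_mb_s))
    print("sequential: %.1f MB/s" % seq_mb_s)

With those assumed numbers the drive delivers roughly 80 random IOPS
(about 10 MB/s at 128k per I/O) versus 60 MB/s sequential - the same
amount of data keeps a randomly accessed disk busy several times
longer.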
Obviously this heavily depends on the workload - but with the trend
towards ever higher capacity disks,
IOPS become a valuable resource and it may be worth thinking about how
to use disks most efficiently -
a "self-optimizing" mechanism that, in the background or on request,
rearranges files to become contiguous
may therefore be useful.
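(A very crude userland approximation of such a rearrangement, just as
a sketch: it assumes that rewriting a file in one sequential pass lets
the allocator place the new copy more contiguously, and it ignores
snapshots, hard links and concurrent writers.)

    import os, shutil, tempfile

    def rewrite_contiguously(path):
        # Copy the file in one sequential pass so the allocator can lay
        # the new blocks out contiguously, then rename it into place.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        os.close(fd)
        try:
            shutil.copyfile(path, tmp)
            shutil.copystat(path, tmp)
            os.rename(tmp, path)   # atomic replace on POSIX
        except Exception:
            os.remove(tmp)
            raise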
- Franz
Gregory Shaw wrote:
Rich, correct me if I'm wrong, but here's the scenario I was thinking
of:
- A large file is created.
- Over time, the file grows and shrinks.
The anticipated layout on disk due to this is that extents are
allocated as the file changes. The extents may or may not be on
multiple spindles.
I envision fragmentation over time that will cause sequential
access to jump all over the place. If you use smart controllers or
disks with read caching, their use of stripes and read-ahead (if
enabled) could result in poor performance.
So, my thought was to de-fragment the file to make it more contiguous
and to allow hardware read-ahead to be effective.
An additional benefit would be to spread it across multiple spindles
in a contiguous fashion, such as:
disk0: 32mb
disk1: 32mb
disk2: 32mb
... etc.
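(Just to make that layout concrete, a tiny sketch of the
offset-to-spindle mapping it implies; the 32mb stripe width and the
three disks are only the numbers from the example above:)

    STRIPE = 32 * 1024 * 1024
    DISKS  = ["disk0", "disk1", "disk2"]

    def spindle_for(offset):
        # Round-robin: each 32mb chunk of the file goes to the next disk.
        return DISKS[(offset // STRIPE) % len(DISKS)]

    print(spindle_for(0))                  # disk0
    print(spindle_for(40 * 1024 * 1024))   # disk1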
Perhaps this is unnecessary. I'm simply trying to grasp the long
term performance implications of COW data.
On May 15, 2006, at 8:47 AM, Roch Bourbonnais - Performance
Engineering wrote:
Gregory Shaw writes:
I really like the below idea:
- the ability to defragment a file 'live'.
I can see instances where that could be very useful. For instance,
if you have multiple LUNs (or spindles, whatever) using ZFS, you
could re-optimize large files to spread the chunks across as many
spindles as possible. Or, as stated below, make it contiguous.
I don't know if that is automatic with ZFS today, but it's an idea.
I think the expected benefit of making it contiguous is
rooted in the belief that bigger I/Os are the only way to
reach top performance.
I think that before ZFS, both physical and logical
contiguity were required to enable sufficient readahead and
get good performance.
Once we have good readahead based on detected logically
contiguous accesses, it may well be possible to get top
device speed through reasonably-sized I/O concurrency.
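(A rough way to see that, with assumed figures: once several
reasonably-sized I/Os are kept outstanding, throughput is roughly the
number outstanding times the I/O size divided by the per-I/O latency,
so concurrency can stand in for very large single I/Os.)

    io_size_kb = 128    # a reasonably-sized I/O
    latency_ms = 12.0   # assumed per-I/O service time
    for outstanding in (1, 2, 4, 8):
        mb_s = outstanding * (io_size_kb / 1024.0) / (latency_ms / 1000.0)
        print("%d outstanding -> %.0f MB/s" % (outstanding, mb_s))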
-r
-----
Gregory Shaw, IT Architect
Phone: (303) 673-8273 Fax: (303) 673-8273
ITCTO Group, Sun Microsystems Inc.
1 StorageTek Drive ULVL4-382 [EMAIL PROTECTED] (work)
Louisville, CO 80028-4382 [EMAIL PROTECTED] (home)
"When Microsoft writes an application for Linux, I've Won." - Linus
Torvalds
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss