On Sun, Dec 27, 2009 at 6:43 PM, Bob Friesenhahn <bfrie...@simple.dallas.tx.us> wrote:

> On Sun, 27 Dec 2009, Tim Cook wrote:
>
>>
>> That is ONLY true when there's significant free space available or a fresh
>> pool.  Once those files have been deleted and the blocks put back into the
>> free pool, they're no longer "sequential" on disk; they're scattered all
>> over the disk.  So it makes a VERY big difference.  I'm not sure why you'd
>> be shocked someone would bring this up.
>>
>
> While I don't know what zfs actually does, I do know that it performs large
> disk allocations (e.g. 1MB) and then parcels 128K zfs blocks from those
> allocations.  If the zfs designers are wise, then they will use knowledge of
> sequential access to ensure that all of the 128K blocks from a metaslab
> allocation are pre-assigned for use by that file, and they will try to
> choose metaslabs which are followed by free metaslabs, or close to other
> free metaslabs.  This approach would tend to limit the sequential-access
> damage caused by COW and free block fragmentation on a "dirty" disk.
>
>
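Even granting that approach -- here's a toy sketch of it in Python, purely
illustrative, with a made-up Metaslab class and reserve_for_file helper, not
anything ZFS actually does -- it only helps while there are free slabs left
to choose from:

# Toy illustration only -- not ZFS code.  A "metaslab" here is just a
# contiguous run of 128K blocks carved from a 1MB-sized region, and a file
# reserves a whole slab so its blocks stay contiguous on "disk".
METASLAB_BLOCKS = 8                      # 8 x 128K = 1MB allocation unit

class Metaslab:
    def __init__(self, start_block):
        self.start = start_block         # first 128K block index on "disk"
        self.owner = None                # file currently holding the slab

class ToyAllocator:
    def __init__(self, n_metaslabs):
        self.slabs = [Metaslab(i * METASLAB_BLOCKS) for i in range(n_metaslabs)]

    def _free_run_after(self, idx):
        """Count free slabs immediately following slab idx."""
        run = 0
        for s in self.slabs[idx + 1:]:
            if s.owner is not None:
                break
            run += 1
        return run

    def reserve_for_file(self, name):
        """Pick a free slab followed by the longest run of free slabs, so the
        file can keep growing contiguously, and hand back its block numbers."""
        free = [i for i, s in enumerate(self.slabs) if s.owner is None]
        if not free:
            raise RuntimeError("pool full")
        best = max(free, key=self._free_run_after)
        self.slabs[best].owner = name
        start = self.slabs[best].start
        return list(range(start, start + METASLAB_BLOCKS))

alloc = ToyAllocator(n_metaslabs=16)
print(alloc.reserve_for_file("fileA"))   # eight contiguous 128K block numbers
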
How is that going to prevent blocks being spread all over the disk when
you've got files several GB in size being written concurrently and deleted
at random?  Throw in a mix of small files as well, and you can kiss that
goodbye.
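
A crude simulation makes the point.  This is just a toy free-block bitmap
with first-fit allocation and random deletion -- no ZFS internals at all --
and it shows how quickly a "large" file stops landing in one contiguous run
once the pool has seen some churn:

import random

# Toy model only: the "disk" is a bitmap of blocks, True = free.
DISK_BLOCKS = 20_000
disk = [True] * DISK_BLOCKS
files = {}                               # name -> list of block numbers
random.seed(1)

def write_file(name, nblocks):
    """First-fit: take free blocks in disk order, wherever they are."""
    got = []
    for i, free in enumerate(disk):
        if free:
            disk[i] = False
            got.append(i)
            if len(got) == nblocks:
                break
    files[name] = got

def delete_file(name):
    for i in files.pop(name):
        disk[i] = True

def extent_count(blocks):
    """How many separate contiguous runs a file's blocks landed in."""
    return 1 + sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)

for cycle in range(6):
    # Concurrent-ish mix: some "large" files, some small ones.
    for n in range(4):
        write_file(f"big-{cycle}-{n}", nblocks=1_000)
        write_file(f"small-{cycle}-{n}", nblocks=random.randint(20, 300))
    probe = list(files[f"big-{cycle}-3"])        # remember one big file's layout
    # Delete half of everything currently on "disk", at random.
    for name in random.sample(sorted(files), len(files) // 2):
        delete_file(name)
    print(f"cycle {cycle}: a 1000-block file landed in {extent_count(probe)} extent(s)")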



> This sort of planning is not terribly different than detecting sequential
> read I/O and scheduling data reads in advance of application requirements.
>  If you can intelligently pre-fetch data blocks, then you can certainly
> intelligently pre-allocate data blocks.
>
>
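For what it's worth, the read-side detection is simple enough to sketch.
Something along these lines (again purely illustrative, not ZFS's actual
prefetch code) notices a sequential reader and widens a readahead window
for it:

class SequentialDetector:
    """Toy sequential-read detector: if each read starts where the last one
    ended, grow a readahead window; otherwise collapse it.  Illustrative
    only -- not ZFS's prefetch logic."""

    def __init__(self, block=128 * 1024, max_ahead_blocks=64):
        self.block = block
        self.max_ahead = max_ahead_blocks
        self.next_expected = None
        self.ahead = 0                   # current readahead, in blocks

    def on_read(self, offset, length):
        if offset == self.next_expected:
            # Sequential hit: double the window up to the cap.
            self.ahead = min(max(1, self.ahead * 2), self.max_ahead)
        else:
            # Random access: stop prefetching.
            self.ahead = 0
        self.next_expected = offset + length
        # Return the byte range worth prefetching after this read.
        return (self.next_expected, self.ahead * self.block)

d = SequentialDetector()
for off in range(0, 5 * 128 * 1024, 128 * 1024):   # five sequential 128K reads
    start, nbytes = d.on_read(off, 128 * 1024)
print("prefetch", nbytes // 1024, "KB ahead after a sequential run")
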
Pre-allocating data blocks is also not going to cure head seeks and the
latency they induce on slow 7200/5400 RPM drives.
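
Back-of-the-envelope numbers show why.  The seek time and transfer rate
below are assumed ballpark figures for a circa-2009 7200 RPM SATA drive,
not measurements:

# Rough arithmetic only; the seek time and transfer rate are assumed
# ballpark figures for a 7200 RPM SATA drive, not measured values.
rpm         = 7200
avg_seek_s  = 0.0085                     # assumed average seek time
avg_rot_s   = (60.0 / rpm) / 2           # half a revolution on average
xfer_mb_s   = 100.0                      # assumed sustained transfer rate
block_bytes = 128 * 1024                 # one 128K zfs-style block

xfer_s = block_bytes / (xfer_mb_s * 1024 * 1024)
per_random_block_s = avg_seek_s + avg_rot_s + xfer_s

print(f"time per random 128K block : {per_random_block_s * 1000:.1f} ms")
print(f"random 128K throughput     : {block_bytes / per_random_block_s / 2**20:.1f} MB/s")
print(f"sequential throughput      : {xfer_mb_s:.0f} MB/s")
# ~14 ms per random block -> roughly 9 MB/s when every block needs a seek,
# versus ~100 MB/s when the heads barely move.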




> Today I did an interesting (to me) test where I ran two copies of iozone at
> once on huge (up to 64GB) files.  The results were somewhat amazing to me:
> the reported data rates from iozone did not drop very much (e.g. a
> single-process write rate of 359MB/second dropped to 298MB/second with two
> processes).  This clearly showed that zfs is doing quite a lot of smart
> things when writing files and that it is optimized for several/many writers
> rather than just one.
>
>
On a new, empty pool, or a pool that's been filled completely and emptied
several times?  It's not amazing to me on a new pool.  I would be surprised
to see you accomplish this feat repeatedly after filling and emptying the
drives.  Fragmentation is a drawback of every implementation of copy-on-write
I've ever seen, and given its very nature, I have no idea how you would avoid
it.
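
If anyone wants to check, the aging step is easy to script.  A rough sketch
-- the path, sizes, and pass count are all made up, so point it at a scratch
dataset you can destroy -- that fills and partially empties the pool a few
times before re-running the write test:

import os, random

# Rough pool-aging sketch.  TARGET_DIR, the fill amount, and the file sizes
# are all made-up parameters -- use a scratch dataset you can destroy.
TARGET_DIR = "/tank/scratch/aging"           # hypothetical scratch path
FILE_SIZES = [1 << 20, 16 << 20, 256 << 20]  # 1MB, 16MB, 256MB mix
FILL_BYTES = 50 << 30                        # write ~50GB per pass
PASSES     = 5
CHUNK      = 1 << 20                         # write in 1MB chunks
random.seed(0)

os.makedirs(TARGET_DIR, exist_ok=True)

def write_file(path, size):
    with open(path, "wb") as f:
        remaining = size
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(os.urandom(n))           # incompressible data
            remaining -= n

for p in range(PASSES):
    written, names = 0, []
    while written < FILL_BYTES:
        size = random.choice(FILE_SIZES)
        path = os.path.join(TARGET_DIR, f"pass{p}-{len(names)}.dat")
        write_file(path, size)
        names.append(path)
        written += size
    # Delete roughly half of this pass's files at random to punch holes.
    for path in random.sample(names, len(names) // 2):
        os.remove(path)
    print(f"pass {p}: wrote {written >> 30} GB, deleted half at random")
# After a few passes, re-run the streaming-write benchmark and compare.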


-- 
--Tim
