On 16/01/2010 00:09, Jeffry Molanus wrote:
-----Original Message-----
From: neil.per...@sun.com [mailto:neil.per...@sun.com]
I think you misunderstand the function of the ZIL. It's not a journal,
and doesn't get transferred to the pool as of a txg. It's only ever
written, except that after a crash it's read to do replay. See:
http://blogs.sun.com/perrin/entry/the_lumberjack
I also read another blog[1]; the part of interest here is this:
The zil behaves differently for different size of writes that happens. For
small writes, the data is stored as a part of the log record. For writes
greater than zfs_immediate_write_sz (64KB), the ZIL does not store a copy of
the write, but rather syncs the write to disk and only a pointer to the sync-ed
data is stored in the log record.
If I understand this right, writes < 64KB get stored on the SSD devices.
If an application requests a synchronous write, it is committed to the
ZIL immediately, and once that is done the I/O is acknowledged to the
application. But the data written to the ZIL is still in memory as part
of the currently open txg, and it will be committed to the pool with no
need to read anything from the ZIL. Then there is the optimization you
described above: the data blocks do not necessarily need to be written
to the log, just pointers to them.
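
To make the flow above concrete, here is a rough toy model in Python. It is
only a sketch of the behaviour as described in this thread, not of the real
ZFS code: ToyDataset, sync_write, commit_txg and the record layout are names
I made up, and the 64KB threshold is simply the figure from the blog quote.

```python
# Toy model of the sync write path described above. All names here are
# hypothetical; this is not how the ZFS source is structured.

IMMEDIATE_WRITE_SZ = 64 * 1024  # threshold as quoted from the blog above


class ToyDataset:
    def __init__(self):
        self.pool = {}       # data committed to the pool by a finished txg
        self.open_txg = {}   # dirty data held in memory for the open txg
        self.blocks = []     # data blocks written early for large sync writes
        self.zil = []        # log records; written here, never read here

    def sync_write(self, key, data):
        # The data always joins the open txg in memory.
        self.open_txg[key] = data
        if len(data) > IMMEDIATE_WRITE_SZ:
            # Large write: put the data block on disk now and log a pointer.
            self.blocks.append(data)
            self.zil.append(("pointer", key, len(self.blocks) - 1))
        else:
            # Small write: the log record carries a copy of the data.
            self.zil.append(("copy", key, data))
        return "ack"  # only now is the I/O acknowledged

    def commit_txg(self):
        # The txg is committed from memory; the log is not consulted.
        self.pool.update(self.open_txg)
        self.open_txg.clear()
        self.zil.clear()  # records covered by the txg are no longer needed
```

Note that commit_txg never looks at self.zil to find the data; it only
discards records that the committed txg has made redundant.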
Now it is slightly more complicated, as you need to take into account
the logbias property and the possibility that a dedicated ZIL device
could be present.
As Neil wrote, ZFS will read from the ZIL only if, while importing a
pool, it detects that there is some data in the ZIL which hasn't been
committed to the pool yet. That can happen due to a system reset, power
loss or devices suddenly disappearing.
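
The import case can be sketched the same way. Again this is a toy in Python
under my own assumptions: import_pool and the (key, data) record format are
hypothetical, and real replay works on log records per dataset rather than
on a simple list.

```python
def import_pool(committed, leftover_log):
    """Toy import: replay any log records the last txg did not cover."""
    state = dict(committed)
    replayed = 0
    for key, data in leftover_log:  # the only place the log is ever read
        state[key] = data
        replayed += 1
    return state, replayed


# Clean export: nothing is left in the log, so nothing is read back.
clean = import_pool({"a": b"old"}, [])

# Crash or power loss: acknowledged sync writes are recovered from the log.
crashed = import_pool({"a": b"old"}, [("a", b"new"), ("b", b"x")])
```

In the clean case the loop body never runs, which is the normal situation.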
--
Robert Milkowski
http://milek.blogspot.com
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss