> Hi,

 Hello Bernd,

> 
> After I published a blog entry about installing
> OpenSolaris 2008.11 on a 
> USB stick, I read a comment about a possible issue
> with wearing out 
> blocks on the USB stick after some time because ZFS
> overwrites its 
> uberblocks in place.
 I did not quite understand what you mean by "wearing out blocks", but in fact 
the uberblocks are not overwritten in place. The pattern you noticed with the 
dtrace script is the update of the uberblock, which is maintained in an array 
of 128 elements (1K each, just one active at a time). Each physical vdev has 
four labels (256K structures): L0, L1, L2, and L3. Two are at the beginning of 
the device and two at the end.
 Because the labels are at fixed locations on disk, the label is the only 
thing ZFS does not update copy-on-write; instead it uses a two-staged update. 
IIRC, L0 and L2 are updated first, and after that L1 and L3.
 Take a look:

 
http://cvs.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/vdev_label.c
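
 For reference, this is how I understand the layout from vdev_label.c. The 
little program below is just a sketch I wrote; the constant names are mine, 
and I am ignoring the alignment the real code does at the end of the device, 
so treat it as an illustration, not as the actual ZFS code:

/*
 * Sketch of the vdev label layout as I read vdev_label.c.
 * Constant names are mine, for illustration only.
 */
#include <stdint.h>
#include <stdio.h>

#define LABEL_SIZE    (256ULL * 1024) /* each of L0..L3 is a 256K structure */
#define UB_RING_OFF   (128ULL * 1024) /* uberblock array starts 128K into a label */
#define UB_SLOT_SIZE  1024ULL         /* 128 slots of 1K each, one active at a time */

int
main(void)
{
	uint64_t devsize = 4ULL * 1024 * 1024 * 1024; /* say, a 4GB USB stick */

	/* L0 and L1 at the beginning of the device, L2 and L3 at the end */
	uint64_t label[4] = {
		0,                        /* L0 */
		LABEL_SIZE,               /* L1 */
		devsize - 2 * LABEL_SIZE, /* L2 */
		devsize - LABEL_SIZE      /* L3 */
	};

	for (int i = 0; i < 4; i++)
		printf("L%d at byte %llu, uberblock ring at byte %llu\n",
		    i, (unsigned long long)label[i],
		    (unsigned long long)(label[i] + UB_RING_OFF));
	return (0);
}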

 So:
 - The label is overwritten (in a two-staged update);
 - The uberblock is not overwritten; each update is written to a new element 
of the array, so the transition from one uberblock (txg and timestamp) to the 
next is atomic (see the sketch right after this list).
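
 To connect this with the block numbers in your trace: as I read 
vdev_uberblock_sync() in vdev_label.c, each txg goes to slot (txg mod 128) of 
the ring, and the ring starts 128K into each label. A quick back-of-the-
envelope calculation, assuming 512-byte sectors (again, just my sketch, not 
ZFS code):

/*
 * Sketch: which sectors a given txg's uberblock lands on, assuming
 * 512-byte sectors and the 256K label / 128K ring / 1K slot sizes.
 * This little program is mine, for illustration only.
 */
#include <stdint.h>
#include <stdio.h>

#define SECTOR        512
#define LABEL_SECTORS (256 * 1024 / SECTOR) /* 512 sectors per label      */
#define RING_START    (128 * 1024 / SECTOR) /* ring begins 256 sectors in */
#define SLOT_SECTORS  (1024 / SECTOR)       /* 2 sectors per uberblock    */
#define UB_COUNT      128

int
main(void)
{
	/* txgs taken from Bernd's trace in this thread */
	uint64_t txgs[] = { 26747, 26750 };

	for (int i = 0; i < 2; i++) {
		uint64_t slot = txgs[i] % UB_COUNT;
		/* sector inside L0; L1 is one full label (512 sectors) further */
		uint64_t l0 = RING_START + slot * SLOT_SECTORS;
		uint64_t l1 = LABEL_SECTORS + RING_START + slot * SLOT_SECTORS;
		printf("txg %llu -> slot %llu: L0 sector %llu, L1 sector %llu\n",
		    (unsigned long long)txgs[i], (unsigned long long)slot,
		    (unsigned long long)l0, (unsigned long long)l1);
	}
	return (0);
}

 That gives sectors 502/1014 for txg 26747 and 508/1020 for txg 26750, which 
matches the bdev_strategy writes in your trace (the 39005xxx blocks should be 
the same slots in L2 and L3 at the end of the stick). So each 1K slot is 
rewritten only once every 128 txgs, in each of the four labels, which should 
spread the wear quite a bit.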

 I'm deploying a USB solution too, so if you can clarify the problem, I would 
appreciate it.

PS: I did look at your blog, but did not see any comments about that, and the 
comments section is closed. ;-)

 Leal
[http://www.eall.com.br/blog]

> 
> I tried to get more information about how updating
> uberblocks works with 
> the following dtrace script:
> 
> /* io:genunix::start */
> io:genunix:default_physio:start,
> io:genunix:bdev_strategy:start,
> io:genunix:biodone:done
> {
>     printf ("%d %s %d %d", timestamp, execname,
>         args[0]->b_blkno, args[0]->b_bcount);
> }
> 
> fbt:zfs:uberblock_update:entry
> {
>     printf ("%d (%d) %d, %d, %d, %d, %d, %d, %d, %d",
>         timestamp, args[0]->ub_timestamp,
>         args[0]->ub_rootbp.blk_prop, args[0]->ub_guid_sum,
>         args[0]->ub_rootbp.blk_birth, args[0]->ub_rootbp.blk_fill,
>         args[1]->vdev_id, args[1]->vdev_asize, args[1]->vdev_psize,
>         args[2]);
> }
> 
> The output shows the following pattern after most of the
> uberblock_update events:
> 
> 0  34404  uberblock_update:entry 244484736418912 (1231084189) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26747
> 0   6668     bdev_strategy:start 244485190035647 sched 502 1024
> 0   6668     bdev_strategy:start 244485190094304 sched 1014 1024
> 0   6668     bdev_strategy:start 244485190129133 sched 39005174 1024
> 0   6668     bdev_strategy:start 244485190163273 sched 39005686 1024
> 0   6656            biodone:done 244485190745068 sched 502 1024
> 0   6656            biodone:done 244485191239190 sched 1014 1024
> 0   6656            biodone:done 244485191737766 sched 39005174 1024
> 0   6656            biodone:done 244485192236988 sched 39005686 1024
> ...
> 0  34404  uberblock_update:entry 244514710086249 (1231084219) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26748
> 0  34404  uberblock_update:entry 244544710086804 (1231084249) 9226475971064889345, 4541013553469450828, 26747, 159, 0, 0, 0, 26749
> ...
> 0  34404  uberblock_update:entry 244574740885524 (1231084279) 9226475971064889345, 4541013553469450828, 26750, 159, 0, 0, 0, 26750
> 0   6668     bdev_strategy:start 244575189866189 sched 508 1024
> 0   6668     bdev_strategy:start 244575189926518 sched 1020 1024
> 0   6668     bdev_strategy:start 244575189961783 sched 39005180 1024
> 0   6668     bdev_strategy:start 244575189995547 sched 39005692 1024
> 0   6656            biodone:done 244575190584497 sched 508 1024
> 0   6656            biodone:done 244575191077651 sched 1020 1024
> 0   6656            biodone:done 244575191576723 sched 39005180 1024
> 0   6656            biodone:done 244575192077070 sched 39005692 1024
> 
> I am not a dtrace or zfs expert, but to me it looks like in many
> cases, an uberblock update is followed by a write of 1024 bytes to
> four different disk blocks. I also found that the four block numbers
> are always incremented by two (256, 258, 260, ...) 127 times, and
> then the first block is written again. Which would mean that for a
> txg of 50000, the four uberblock copies have been written
> 50000/127 = 393 times (correct?).
> 
> What I would like to find out is how to access fields
> from arg1 (this is 
> the data of type vdev in:
> 
> int uberblock_update(uberblock_t *ub, vdev_t *rvd,
> uint64_t txg)
> 
> ). When using the fbt:zfs:uberblock_update:entry
> probe, its elements are 
> always 0, as you can see in the above output. When
> using the 
> fbt:zfs:uberblock_update:return probe, I am getting
> an error message 
> like the following:
> 
> dtrace: failed to compile script
> zfs-uberblock-report-04.d: line 14: 
> operator -> must be applied to a pointer
> 
> Any idea how to access the fields of vdev, or how to print out the
> pool name associated with an uberblock_update event?
> 
> Regards,
> 
> Bernd
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
