Thanks for reporting this. I have fixed this bug (6822816) in build
127. Here is the evaluation from the bug report:
The problem is that the clone's dsobj does not appear in the origin's
ds_next_clones_obj.
The bug can occur under certain circumstances if there was a
"botched upgrade" when running "zpool upgrade" from pool version 10 or
earlier to version 11 or later while there was a clone in the pool.
The problem is that upgrade_clones_cb() failed to call
dmu_buf_will_dirty(origin->ds_dbuf) before updating the origin.
This bug can have several effects:
1. assertion failure from dsl_dataset_destroy_sync()
2. assertion failure from dsl_dataset_snapshot_sync()
3. assertion failure from dsl_dataset_promote_sync()
4. incomplete scrub or resilver, potentially leading to data loss
The fix addresses the root cause, and it also works around all of these
issues on pools that have already experienced the botched upgrade,
whether or not they have encountered any of the effects above.
Anyone who may have a botched upgrade should run "zpool scrub" after
upgrading to bits with the fix in place (build 127 or later).
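Concretely, the recovery steps would look something like the following; "tank" is a placeholder pool name, and these commands must be run on a real affected system:

```shell
# After booting bits with the fix (build 127 or later),
# scrub any pool that may have had the botched upgrade:
zpool scrub tank

# Check scrub progress and any reported errors:
zpool status -v tank
```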
--matt
Albert Chin wrote:
Running snv_114 on an X4100M2 connected to a 6140. Made two clones of a
snapshot a few days ago:
# zfs snapshot a...@b
# zfs clone a...@b tank/a
# zfs clone a...@b tank/b
The system started panicking after I tried:
# zfs snapshot tank/b...@backup
So, I destroyed tank/b:
# zfs destroy tank/b
then tried to destroy tank/a
# zfs destroy tank/a
Now, the system is in an endless panic loop, unable to import the pool
at system startup or with "zpool import". The panic dump is:
panic[cpu1]/thread=ffffff0010246c60: assertion failed: 0 == zap_remove_int(mos,
ds_prev->ds_phys->ds_next_clones_obj, obj, tx) (0x0 == 0x2), file:
../../common/fs/zfs/dsl_dataset.c, line: 1512
ffffff00102468d0 genunix:assfail3+c1 ()
ffffff0010246a50 zfs:dsl_dataset_destroy_sync+85a ()
ffffff0010246aa0 zfs:dsl_sync_task_group_sync+eb ()
ffffff0010246b10 zfs:dsl_pool_sync+196 ()
ffffff0010246ba0 zfs:spa_sync+32a ()
ffffff0010246c40 zfs:txg_sync_thread+265 ()
ffffff0010246c50 unix:thread_start+8 ()
We really need to import this pool. Is there a way around this? We do
have snv_114 source on the system if we need to make changes to
usr/src/uts/common/fs/zfs/dsl_dataset.c. It seems like the "zfs
destroy" transaction never completed and it is being replayed, causing
the panic. This cycle continues endlessly.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss