Re: [zfs-discuss] zpool import hangs

Erik Gulliksson Thu, 25 Sep 2008 02:19:44 -0700

To keep everyone updated: thanks to Victor, we have recovered AND
repaired all of the data that was lost in the incident. Victor may be
able to explain in detail what he did to accomplish this; I only know
that it involved loading a patched zfs kernel module.


I would like to shout a big thanks to Victor Latushkin, a true hero!

Best regards
Erik Gulliksson

On Fri, Aug 22, 2008 at 6:03 PM, Erik Gulliksson <[EMAIL PROTECTED]> wrote:
> Victor,
>
>> Well, since we are talking about ZFS, any threads somewhere in the ZFS
>> module are interesting, and there should not be too many of them. Though in
>> this case it is clear: it is trying to update the config object and waits for
>> the update to sync. There should be another thread with a stack similar to this:
>>
>> genunix:cv_wait()
>> zfs:zio_wait()
>> zfs:dbuf_read()
>> zfs:dmu_buf_will_dirty()
>> zfs:dmu_write()
>> zfs:spa_sync_nvlist()
>> zfs:spa_sync_config_object()
>> zfs:spa_sync()
>> zfs:txg_sync_thread()
>> unix:thread_start()
>>
>
> Ok, this would be the thread you're referring to:
>
> ffffff000746ec80 fffffffffbc287b0                0   0  60 ffffff01b53ebd88
>  PC: _resume_from_idle+0xf1    THREAD: txg_sync_thread()
>  stack pointer for thread ffffff000746ec80: ffffff000746e860
>  [ ffffff000746e860 _resume_from_idle+0xf1() ]
>    swtch+0x17f()
>    cv_wait+0x61()
>    zio_wait+0x5f()
>    dbuf_read+0x1b5()
>    dbuf_will_dirty+0x3d()
>    dmu_write+0xcd()
>    spa_sync_nvlist+0xa7()
>    spa_sync_config_object+0x71()
>    spa_sync+0x20b()
>    txg_sync_thread+0x226()
>    thread_start+8()
>
> As you say, there are more zfs/spa/txg-threads that end up in the same
> wait-state.
>
>> It waits due to a checksum error detected while reading the old config object
>> from disk (the call to dbuf_read() above). It means that all ditto blocks of
>> the config object got corrupted. On Solaris 10 there's no
>
> Now this is getting interesting. Is there any chance to recover from
> this scenario?
>
>>>> Btw, why does timestamp on your uberblock show July 1?
>>>
>>> Well, this is about the time when the crash happened. The clock on the
>>> server is correct.
>>
>> Wow! Why did you wait almost two months?
> There has been a lot of reading up about zfs and finding local
> recovery-experts. But yes, of course I should have posted to this
> mailing-list earlier.
>
> Again, thanks Victor!
>
> Regards
> Erik
>
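
For anyone hitting a similar hang: the quoted trace shows how to recognise
the stuck thread, namely zio_wait() sitting above spa_sync_config_object()
in the same stack. Below is a minimal sketch in Python for picking such
threads out of a saved thread dump. It assumes the dump has the same layout
as the trace quoted above (a header line starting with the thread address,
followed by one frame per line). The helper names split_threads,
blocked_on_config_sync and find_blocked are made up for illustration and are
not part of any ZFS or mdb tooling.

```python
import re

# A frame looks like "zio_wait+0x5f()" or "thread_start+8()".
FRAME = re.compile(r"([A-Za-z_]\w*)\+(?:0x)?[0-9a-fA-F]+\(\)")
# Assumption: each thread starts with a line whose first field is its address.
HEADER = re.compile(r"^[0-9a-f]{8,16}\s")


def split_threads(dump):
    """Group the dump into threads, each with its address and frame names."""
    threads, current = [], None
    for raw in dump.splitlines():
        # Drop email quote markers and indentation so quoted dumps also work.
        line = raw.lstrip("> ").rstrip()
        if HEADER.match(line):
            current = {"addr": line.split()[0], "frames": []}
            threads.append(current)
        elif current is not None:
            current["frames"].extend(FRAME.findall(line))
    return threads


def blocked_on_config_sync(frames):
    """True if zio_wait appears above spa_sync_config_object (innermost first)."""
    try:
        return frames.index("zio_wait") < frames.index("spa_sync_config_object")
    except ValueError:
        return False


def find_blocked(dump):
    """Return the addresses of threads waiting on I/O inside the config sync."""
    return [t["addr"] for t in split_threads(dump)
            if blocked_on_config_sync(t["frames"])]
```

Calling find_blocked() on the text of a saved dump returns the addresses of
the matching threads. It only flags candidates; it says nothing about why the
read is not completing.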
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
