Chris Murray wrote:
Another hang on zpool import thread, I'm afraid, because I don't seem to have
observed any great successes in the others and I hope there's a way of saving
my data ...
In March, using OpenSolaris build 134, I created a zpool, some zfs filesystems,
enabled dedup on them, moved content into them and promptly discovered how slow
it was because I only have 4GB RAM. Even with 30GB L2ARC, the performance was
unacceptable. The trouble started when the machine hung one day. Ever since,
I've been unable to import my pool without it hanging again. At the time I saw
posts from others who had run into similar problems, so I thought it best that
I wait until a later build, on the assumption that some ZFS dedup bug would be
fixed and I could see my data again. I've been waiting ever since, and only
just had a chance to try build 147, thanks to illumos and a schillix live CD.
However, the pool still won't import, so I'd much appreciate any
troubleshooting hints and tips to help me on my way.
schillix b147i
My process is:
1. boot the live CD.
2. on the console session, run vmstat 1
3. from another machine, SSH in with multiple sessions and:
vmstat 60
vmstat 1
zpool import -f zp
zpool iostat zp 1
zpool iostat zp -v 5
4. wait until it all stops
What I observe is that the zpool import command never finishes. There is a
lengthy period of read activity made up of very small reads, which then stops,
followed by an even longer period of what looks like no disk activity:
zp 512G 1.31T 0 0 0 0
The box will be responsive for quite some time, seemingly doing not a great
deal:
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr cd cd rm s0 in sy cs us sy id
0 0 0 2749064 3122988 0 7 0 0 0 0 0 0 1 0 0 365 218 714 0 1 99
Then after a matter of hours it'll hang. SSH sessions are no longer responsive.
On the console I can press return which creates a new line, but vmstat will
have stopped updating.
Interestingly, I observed the same thing in b134, except that free memory
would slowly decrease over the course of hours before a sudden nose-dive
right before the lock-up. Now it appears to hang without that same effect.
While the import appears to be working, I can cd to /zp and look at the
contents of the filesystems under 5 of the 9 "esx*" directories.
Coincidence or not, it's the last four which appear to be empty: esx_prod
onward.
# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
zp                         905G  1.28T    23K  /zp
zp/nfs                     889G  1.28T    32K  /zp/nfs
zp/nfs/esx_dev             264G  1.28T   264G  /zp/nfs/esx_dev
zp/nfs/esx_hedgehog       25.8G  1.28T  25.8G  /zp/nfs/esx_hedgehog
zp/nfs/esx_meerkat         223G  1.28T   223G  /zp/nfs/esx_meerkat
zp/nfs/esx_meerkat_dedup   938M  1.28T   938M  /zp/nfs/esx_meerkat_dedup
zp/nfs/esx_page           8.90G  1.28T  8.90G  /zp/nfs/esx_page
zp/nfs/esx_prod            306G  1.28T   306G  /zp/nfs/esx_prod
zp/nfs/esx_skunk            21K  1.28T    21K  /zp/nfs/esx_skunk
zp/nfs/esx_temp           45.5G  1.28T  45.5G  /zp/nfs/esx_temp
zp/nfs/esx_template       15.2G  1.28T  15.2G  /zp/nfs/esx_template
Any help would be appreciated. What could be going wrong here? Is it getting
progressively closer to becoming imported each time I try this, or will it be
starting from scratch? Feels to me like there's an action in the
/zp/nfs/esx_prod filesystem it's trying to replay and never getting to the end
of, for some reason. In case it was getting in a muddle with the L2ARC, I
removed the cache device a matter of minutes into this run. It hasn't hung yet
and vmstat is still updating, but I tried a 'zpool import' in one of the
windows to see if I could even see a pool on another disk, and that hasn't
returned me to the prompt yet. I also tried to SSH in with another session,
and that hasn't produced the login prompt.
Thanks in advance,
Chris
It looks like you may be past the import phase and into the mounting
phase. What I would recommend is that you 'zpool import -N zp' so that
none of the datasets get mounted and only the import happens. Then one
by one you can mount the datasets in order (starting with 'zp') so you
can find out which one may be hanging.
- George
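
The suggestion above could be scripted roughly as follows. This is only a
sketch under assumptions: POOL and DRY_RUN are illustrative variables, the
dry-run dataset list is just the first few names from the zfs list output
earlier in the thread, and the import may additionally need -f as in the
original command. Each command is echoed before it runs, so if the box hangs
the last line printed names the dataset being mounted at the time.

```shell
#!/bin/sh
# Sketch only: import the pool without mounting anything, then mount
# datasets one at a time so a hang can be pinned on a single dataset.
# POOL and DRY_RUN are illustrative variables, not part of the thread.
POOL="${POOL:-zp}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands (default here)

run() {
    # Print each command before running it, so the last line shown
    # identifies the step that never returned.
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

mount_in_order() {
    # Reads dataset names from stdin, one per line, parents first.
    while read -r ds; do
        [ -n "$ds" ] || continue
        run zfs mount "$ds" || { echo "mount failed: $ds" >&2; return 1; }
    done
}

list_datasets() {
    if [ "$DRY_RUN" = "1" ]; then
        # Assumed sample: first few names from the zfs list output above.
        printf '%s\n' "$POOL" "$POOL/nfs" "$POOL/nfs/esx_dev"
    else
        # Names only, no header, filesystems only, recursive from the pool.
        zfs list -H -o name -r -t filesystem "$POOL"
    fi
}

run zpool import -N "$POOL"
list_datasets | mount_in_order
```

Run as-is it only prints the commands; setting DRY_RUN=0 executes them.
Keeping vmstat running in another session, as in the original process, shows
whether memory behaviour changes when a particular dataset is mounted.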
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss