Another "hang on zpool import" thread, I'm afraid. I haven't seen any great
successes in the others, and I'm hoping there's a way of saving my data ...

In March, using OpenSolaris build 134, I created a zpool, some zfs filesystems, 
enabled dedup on them, moved content into them and promptly discovered how slow 
it was because I only have 4GB RAM. Even with 30GB L2ARC, the performance was 
unacceptable. The trouble started when the machine hung one day. Ever since, 
I've been unable to import my pool without it hanging again. At the time I saw 
posts from others who had run into similar problems, so I thought it best that 
I wait until a later build, on the assumption that some ZFS dedup bug would be 
fixed and I could see my data again. I've been waiting ever since, and only 
just had a chance to try build 147, thanks to illumos and a schillix live CD.

However, the pool still won't import, so I'd much appreciate any 
troubleshooting hints and tips to help me on my way.

The environment is the schillix b147i live CD.

My process is:
1. boot the live CD.
2. on the console session, run  vmstat 1
3. from another machine, SSH in with multiple sessions and:
        vmstat 60
        vmstat 1
        zpool import -f zp
        zpool iostat zp 1
        zpool iostat zp -v 5
4. wait until it all stops
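
Since the interesting moment is the last output before the hang, the monitoring
commands in step 3 could be timestamped and logged on the machine I SSH from.
This is only a minimal sketch of that idea, not what I actually ran: the
function name stamp, the host name zfsbox and the log file names are all
placeholders.

```shell
#!/bin/sh
# stamp: prefix every line read from stdin with the current local time, so a
# saved log shows when the last sample arrived before the box stopped responding.
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$line"
    done
}

# Hypothetical usage from the remote machine (host and file names are
# placeholders, not taken from my setup):
#   ssh root@zfsbox 'vmstat 60'         | stamp | tee vmstat-60.log
#   ssh root@zfsbox 'zpool iostat zp 1' | stamp | tee iostat.log
```

Because the log lives on the other machine, it survives the hang. If the
output turns out to arrive in bursts rather than line by line, forcing a
terminal with ssh -t may help, but I haven't checked that here.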

What I observe is that the zpool import command never finishes. There is a
lengthy period of read activity made up of very small reads, which then stops,
followed by an even longer period of what looks like no disk activity:

zp           512G  1.31T      0      0      0      0

The box will be responsive for quite some time, seemingly doing not a great 
deal:

 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr cd cd rm s0   in   sy   cs us sy id
 0 0 0 2749064 3122988 0  7  0  0  0  0  0  0  1  0  0  365  218  714  0  1 99

Then after a matter of hours it'll hang. SSH sessions are no longer responsive. 
On the console I can press return which creates a new line, but vmstat will 
have stopped updating.

Interestingly, I observed the same thing in b134, except that free memory
would slowly decrease over the course of hours, before a sudden nose-dive
right before the lock-up. Now it appears to hang without that effect.
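
To compare the free-memory trend between builds, the "free" column could be
pulled out of saved vmstat output. A minimal sketch, assuming the column layout
shown above (free is the fifth field, in KB) and raw vmstat lines with nothing
prepended; the function name free_kb is made up for illustration.

```shell
#!/bin/sh
# free_kb: print the "free" column (5th field) of each vmstat data line.
# Header lines are skipped because their first field is not a number.
free_kb() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 5 { print $5 }'
}

# Hypothetical usage (the log file name is a placeholder):
#   free_kb < vmstat-60.log
```

Fed the sample line above, this prints 3122988. A steadily falling series
would match what I saw in b134; a flat one would match this run.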

While the import appears to be working, I can cd to /zp and look at the
contents of 5 of the 9 "esx*" filesystems. Coincidence or not, it's the last
four that appear to be empty: esx_prod onward.

# zfs list
NAME                       USED  AVAIL  REFER  MOUNTPOINT
zp                         905G  1.28T    23K  /zp
zp/nfs                     889G  1.28T    32K  /zp/nfs
zp/nfs/esx_dev             264G  1.28T   264G  /zp/nfs/esx_dev
zp/nfs/esx_hedgehog       25.8G  1.28T  25.8G  /zp/nfs/esx_hedgehog
zp/nfs/esx_meerkat         223G  1.28T   223G  /zp/nfs/esx_meerkat
zp/nfs/esx_meerkat_dedup   938M  1.28T   938M  /zp/nfs/esx_meerkat_dedup
zp/nfs/esx_page           8.90G  1.28T  8.90G  /zp/nfs/esx_page
zp/nfs/esx_prod            306G  1.28T   306G  /zp/nfs/esx_prod
zp/nfs/esx_skunk            21K  1.28T    21K  /zp/nfs/esx_skunk
zp/nfs/esx_temp           45.5G  1.28T  45.5G  /zp/nfs/esx_temp
zp/nfs/esx_template       15.2G  1.28T  15.2G  /zp/nfs/esx_template

Any help would be appreciated. What could be going wrong here? Does the pool
get progressively closer to being imported each time I try this, or does it
start from scratch every time? It feels to me like there's an action in the
/zp/nfs/esx_prod filesystem that it's trying to replay and, for some reason,
never getting to the end of.

In case it was getting in a muddle with the L2ARC, I removed the cache device
a matter of minutes into this run. It hasn't hung yet and vmstat is still
updating, but I ran a plain 'zpool import' in one of the windows to see if I
could even see a pool on another disk, and that hasn't returned to the prompt
yet. I also tried to SSH in with another session, and that hasn't produced a
login prompt.

Thanks in advance,
Chris
-- 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss