Found my first problems with this today.  The ZFS mirror appears to work fine, 
but if you disconnect one of the iSCSI targets, all I/O to the pool hangs for 
5 minutes or more.  
I'm also seeing very concerning behaviour when attempting to re-attach the 
missing disk.

My test scenario is (rough setup commands are sketched below):
 - Two 35GB iSCSI targets are shared using ZFS shareiscsi=on
 - They are imported on a third Solaris box and used to create a mirrored ZFS pool
 - The pool is exported as an NFS share, which VMware ESX Server mounts as a datastore
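
For anyone who wants to reproduce this, the setup was along these lines.  The 
pool, address and device names below are placeholders (the real iSCSI LUNs 
show up under long GUID-style names in "format"), so treat this as a sketch 
rather than a transcript:

  # On each of the two storage boxes: carve out a 35GB zvol and share it
  zfs create -V 35g tank/vmdisk
  zfs set shareiscsi=on tank/vmdisk

  # On the third box: point the iSCSI initiator at both targets
  iscsiadm add discovery-address 192.168.1.10:3260
  iscsiadm add discovery-address 192.168.1.11:3260
  iscsiadm modify discovery --sendtargets enable

  # Build the mirror from the two LUNs and share the pool over NFS
  zpool create vmpool mirror c2t1d0 c3t1d0
  zfs set sharenfs=on vmpool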

My first test was to clone a virtual machine onto the new volume.  That 
appeared to work fine, so I decided to test the mirroring.  I started another 
clone operation, then powered down one of the iSCSI targets.  Well, the clone 
operation seemed to hang as soon as I did that, so I ran "zpool status" to see 
what was going on.  The news wasn't good: that hung too.

Nothing happened in either window for a good 5 minutes.  Then ESX popped up 
with an error saying "the virtual disk is either corrupted or not a supported 
format", and at the exact same moment the zpool status command completed, but 
it showed all the drives as still ONLINE.

I immediately re-ran zpool status; this time it reported that one iSCSI device 
was offline and the pool was running in a degraded state.
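
From memory (I didn't save the transcript, and the pool/device names are the 
placeholders from above), the second run looked roughly like this:

    pool: vmpool
   state: DEGRADED
  status: One or more devices could not be opened.  Sufficient replicas exist
          for the pool to continue functioning in a degraded state.
  action: Attach the missing device and online it using 'zpool online'.
  config:

          NAME        STATE     READ WRITE CKSUM
          vmpool      DEGRADED     0     0     0
            mirror    DEGRADED     0     0     0
              c2t1d0  ONLINE       0     0     0
              c3t1d0  OFFLINE      0     0     0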

So, for some reason it took 5 minutes for the iSCSI device to go offline, ZFS 
was locked up for that entire time, and ZFS reported the wrong status the 
first time around too.
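
Next time I'll also watch the initiator side while the target is down.  
Something like this (assuming the standard Solaris iSCSI initiator) should 
show whether the initiator still believes the session is up, which would at 
least separate an iSCSI-layer timeout from a ZFS one:

  # List targets with connection details; a dead target should eventually
  # drop to "Connections: 0" once the initiator gives up on it
  iscsiadm list target -v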

The only good news is that now that the pool is in a degraded state I can 
start the clone operation again, and it completes fine with just half of the 
mirror available.

Next, I powered the missing server back on, checked with "format < /dev/null" 
that the drives had re-connected, and used "zpool online" to re-attach the 
missing disk (the exact commands are sketched after the figures below).  So 
far it has taken over an hour to resilver the files from a 10-minute copy, and 
the progress report is up and down like a yo-yo.  The progress reporting from 
ZFS so far has been:
 - 2.25% done, 0h13m to go
 - 7.20% done, 0h12m to go
 - 6.14% done, 0h8m to go    (odd, how does it go down?)
 ...
 - 78.50% done, 0h2m to go
 - 41.67% done, 0h8m to go   (huh?)
 ...
 - 72.45% done, 0h3m to go
 - 42.42% done, 0h9m to go
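
For reference, the re-attach and the monitoring were along these lines (pool 
and device names are the same placeholders; the loop is just a crude way to 
sample the resilver line):

  # Confirm the OS can see the LUN again
  format < /dev/null

  # Bring the mirror half back online; ZFS starts resilvering automatically
  zpool online vmpool c3t1d0

  # Sample the resilver progress once a minute
  while true; do zpool status vmpool | grep scrub; sleep 60; done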

I'm getting concerned now.  I'm actually wondering whether this is ever going 
to complete, and I have no idea whether these problems are ZFS or iSCSI 
related.
 
 