Jill,

I was recently looking for a similar solution to try to reconnect a
renumbered device while the pool was live, e.g.:

zpool online mypool <old target> <old target at new location>

as in zpool replace, but with an indication that this isn't a new
device.
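
One way to sanity-check that a renumbered device really is the same
vdev (a rough sketch; the device path is just borrowed from your
example below) should be to dump its ZFS label and compare the pool
and vdev GUIDs against the pool's configuration:

zdb -l /dev/rdsk/c27t600A0B8000292B0200004D5B475E6E90d0s0   # dump the on-disk ZFS labels

If the label still shows the original pool name and guid, the data is
intact and nothing needs to be replaced.  For a whole-disk vdev you may
need to point zdb at the slice that actually holds the data, often s0
as above.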

What I have been doing to deal with the renumbering is exactly the
export, import and clear.  However, I have been dealing with
significantly smaller devices and can't speak to the delay issues.
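
For what it's worth, the sequence I have been using looks roughly like
this (the pool name is a placeholder, and the comments are just my
notes):

zpool export mypool    # release the old device paths
zpool import mypool    # rescan the devices and pick up the new targets
zpool status mypool    # wait for any resulting resilver to finish
zpool clear mypool     # then clear the stale error counters

The import is the step that rediscovers the vdevs by their on-disk
labels, so the pool configuration is corrected without relabeling or a
full replace.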

Shawn



On Dec 13, 2007, at 12:16 PM, Jill Manfield wrote:

>
> My customer's 6540 disk array had a firmware upgrade that changed the
> LUN GUIDs, so we need a procedure to let their ZFS pools know that the
> devices have changed.  They are getting errors as if drives had been
> replaced.  Please note that they have not "replaced" any drives, and no
> drives have failed or are "bad"; as such, they have no interest in
> wiping any disks clean as indicated in the 88130 info doc.
>
> Some background from the customer:
>
> We have a large 6540 disk array, on which we have configured a series
> of large RAID LUNs.  A few days ago, Sun sent a technician to upgrade
> the firmware of this array, which worked fine but which had the
> deleterious effect of changing the "Volume IDs" associated with each
> LUN.  So, the resulting LUNs now appear to our Solaris 10 host (under
> mpxio) as disks in /dev/rdsk with different 'target' components than
> they had before.
>
> Before the firmware upgrade we took the precaution of creating
> duplicate LUNs on a different 6540 disk array and using these to
> mirror each of our ZFS pools (as protection in case the firmware
> upgrade corrupted our LUNs).
>
> Now, we simply want to ask ZFS to find the devices under their new
> targets, recognize that they are existing zpool components, and have
> it correct the configuration of each pool.  This would be similar to
> having Veritas vxvm re-scan all disks with vxconfigd in the event of
> a "controller renumbering" event.
>
> The proper zfs method for doing this, I believe, is to simply do:
>
> zpool export mypool
> zpool import mypool
>
> Indeed, this has worked fine for me a few times today, and several of
> our pools are now back to their original mirrored configuration.
>
> Here is a specific example, for the pool "ospf".
>
> The zpool status after the upgrade:
>
> diamond:root[1105]->zpool status ospf
>  pool: ospf
> state: DEGRADED
> status: One or more devices could not be opened.  Sufficient replicas exist for
>         the pool to continue functioning in a degraded state.
> action: Attach the missing device and online it using 'zpool online'.
>   see: http://www.sun.com/msg/ZFS-8000-D3
> scrub: resilver completed with 0 errors on Tue Dec 11 18:26:53 2007
> config:
>
>        NAME                                        STATE     READ WRITE CKSUM
>        ospf                                        DEGRADED     0     0     0
>          mirror                                    DEGRADED     0     0     0
>            c27t600A0B8000292B0200004BDC4731A7B8d0  UNAVAIL      0     0     0  cannot open
>            c27t600A0B800032619A0000093747554A08d0  ONLINE       0     0     0
>
> errors: No known data errors
>
> This is due to the fact that the LUN which used to appear as
> c27t600A0B8000292B0200004BDC4731A7B8d0 is now actually
> c27t600A0B8000292B0200004D5B475E6E90d0.  It's the same LUN, but since
> the firmware changed the Volume ID, the target portion is different.
>
> Rather than treating this as a "replaced" disk (which would incur an
> entire mirror resilvering, and would require the "trick" you sent of
> obliterating the disk label so the "in use" safeguard could be
> avoided), we simply want to ask ZFS to re-read its configuration to
> find this disk.
>
> So we do this:
>
> diamond:root[1110]->zpool export -f ospf
> diamond:root[1111]->zpool import ospf
>
> and sure enough:
>
> diamond:root[1112]->zpool status ospf
>  pool: ospf
> state: ONLINE
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
> scrub: resilver in progress, 0.16% done, 2h53m to go
> config:
>
>        NAME                                        STATE     READ WRITE CKSUM
>        ospf                                        ONLINE       0     0     0
>          mirror                                    ONLINE       0     0     0
>            c27t600A0B8000292B0200004D5B475E6E90d0  ONLINE       0     0     0
>            c27t600A0B800032619A0000093747554A08d0  ONLINE       0     0     0
>
> errors: No known data errors
>
> (Note that it has self-initiated a resilvering, since in this case the
> mirror has been changed by users since the firmware upgrade.)
>
> The problem that Robert had was that when he initiated an export of a
> pool (called "bgp") it froze for quite some time.  The corresponding
> "import" of the same pool took 12 hours to complete.  I have not been
> able to replicate this myself, but that was the essence of the problem.
>
> So again, we do NOT want to "zero out" any of our disks, nor are we
> trying to forcibly use "replaced" disks.  We simply want ZFS to re-read
> the devices under /dev/rdsk and update each pool with the correct disk
> targets.
>
> If you can confirm that a simple export/import is the proper procedure
> for this (followed by a "clear" once the resulting resilvering
> finishes), I would appreciate it.  And, if you can postulate what may
> have caused the "freeze" that Robert noticed, that would put our minds
> at ease.
>
>
>
> TIA,
>
> Any assistance on this, or pointers to helpful documentation, would be
> greatly appreciated.
>
> -- 
>       S U N  M I C R O S Y S T E M S  I N C.
>
>               Jill Manfield - TSE-OS Administration Group
>               email: [EMAIL PROTECTED]
>               phone: (800)USA-4SUN (Reference your case number)
>               address:  1617 Southwood Drive Nashua,NH 03063
>               mailstop: NSH-01- B287
>       
>               OS Support Team     9AM to 6PM EST
>        Manager  [EMAIL PROTECTED]  x74110
>

--
Shawn Ferry              shawn.ferry at sun.com
Senior Primary Systems Engineer
Sun Managed Operations
571.291.4898





_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
