We ran into something similar with controllers changing after an X4500 to
X4540 upgrade.

In our case the spares were in a separate data pool, so the recovery
procedure we developed was relatively easy to implement as long as downtime
could be scheduled.

You may be able to tweak the procedure to boot off of a JumpStart server or
CD/DVD; I believe the trick will be to format the right disks with the rpool
exported.

For future upgrades where you know the controllers will change, remove the
spares ahead of time. Then, after booting into the new system, compare the
disks listed in zpool status vs. format to find your spares and re-add them.
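
A rough sketch, using the device names from our box (yours will differ):

zpool remove z c6t0d0 c6t4d0        # before the upgrade, under the old names
... upgrade, boot the new system, compare zpool status vs. format ...
zpool add z spare c4t0d0 c4t1d0     # afterwards, under the new names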

Hope this helps,

Jordan


Procedure to recover invalid or corrupted spares after an X4500 to X4540 SC
upgrade.

1. The data zpool, z, was exported and the server (X4500) was shut down.
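
    Roughly (assuming a normal shutdown to the powered-off state):

    zpool export z
    shutdown -y -g0 -i5
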
2. In this case, after the SC was upgraded to an X4540, new root disks were
installed with Solaris 10 Update 6, and the following patches were applied,
including a ZFS patch:

126420-02
138286-02
139387-02
139580-02
140176-01
140191-01
139463-01
139467-04
138889-07
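
For reference, these can be applied in one pass with patchadd -M; the
/var/tmp/patches directory below is just an example location:

patchadd -M /var/tmp/patches 126420-02 138286-02 139387-02 139580-02 \
    140176-01 140191-01 139463-01 139467-04 138889-07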

3. After the X4540 was booted, the z zpool was imported.

4. zpool status z was run and the invalid spares were listed as c6t0d0 and
c6t4d0. Reviewing the "echo | format" output, it was obvious that c6 no
longer existed; c0 through c5 became the names of the 6 channels in the
X4540 after the OS upgrade.

5. The following awk one-liner was used to list all the disks being used by
ZFS, and this was compared to the list of all known disks in the system:

zpool status | awk '$1 ~/^c[0-9]/ {print $1}' | sort

The disks c4t0d0 and c4t1d0 were listed in format but not in the zpool
output.
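
A quick way to do the comparison, if it helps (the awk pattern used to
parse the format output is an untested sketch and may need adjusting):

echo | format 2>/dev/null | awk '/[0-9]\. c[0-9]/ {print $2}' | sort > /tmp/all-disks
zpool status | awk '$1 ~ /^c[0-9]/ {print $1}' | sort > /tmp/pool-disks
comm -13 /tmp/pool-disks /tmp/all-disks     # disks format sees that no pool uses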

6. Attempts to zpool add or replace the disks failed with an error stating
that the disks were spares in the z zpool.
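
The attempts looked roughly like this (the exact error text was not kept):

zpool add z spare c4t0d0
zpool replace z c6t0d0 c4t0d0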

7. The z zpool was exported.

8. format was run on c4t0d0 and c4t1d0.

    In the fdisk menu, partition 1 was removed and the change was saved.
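
    Roughly:

    format c4t0d0       (choose fdisk, delete partition 1, save and exit)
    format c4t1d0       (same)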

9. The z zpool was then imported.

10. The failed spares were removed from the zpool.
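    For example: zpool remove z c6t0d0 c6t4d0 (the stale spare names from step 4).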

11. The "missing" disks were re-added as spares, for example: zpool add z
spare c4t1d0.




On Tue, Aug 4, 2009 at 5:34 PM, Will Murnane <will.murn...@gmail.com> wrote:

> On Tue, Aug 4, 2009 at 19:05, <cindy.swearin...@sun.com> wrote:
> > Hi Will,
> >
> > It looks to me like you are running into this bug:
> >
> > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6664649
> >
> > This is fixed in Nevada and a fix will also be available in an
> > upcoming Solaris 10 release.
> That looks like exactly the problem we hit.  Thanks for Googling for me.
>
> > This doesn't help you now, unfortunately.
> Would it cause problems to temporarily import the pool on an
> OpenSolaris machine, remove the spare, and move it back to the Sol10
> machine?  I think it'd be safe provided I don't do "zpool upgrade" or
> anything like that, but I'd like to make sure.
>
> Thanks,
> Will
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
