Saturday, August 9, 2008

Replacing Mirrored Root disk in Solaris

1. Collect output from the following:

# metastat

# metastat -p

# metadb -i

2. To identify the disk to be replaced:

Examine the "metadb -i" output. You should see a "W" in the flags field for the replica slice (c1t0d0s4 in this example) of the disk experiencing write errors. The output of the "format" command is another indication.

For this example, we will assume the failed disk is c1t0d0 and the good mirror disk is c1t1d0.
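
As a rough aid for spotting the replica with write errors, the sketch below filters "metadb -i" style output for replica lines whose flags contain W. The function name replicas_with_write_errors is hypothetical, and the sketch assumes each replica line ends with the first-block column, the block-count column and then the /dev/dsk path, with everything before those three fields being flags.

```shell
# Print the device of every replica whose flags column contains "W".
# Reads `metadb -i` style output on stdin.
replicas_with_write_errors() {
    awk '$NF ~ /^\/dev\/dsk\// {
        flags = ""
        # Everything before the last three fields is treated as flags.
        for (i = 1; i <= NF - 3; i++) flags = flags $i
        if (flags ~ /W/) print $NF
    }'
}
```

It would be used as "metadb -i | replicas_with_write_errors". On Solaris, nawk or /usr/xpg4/bin/awk may be a safer choice than the default awk.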

3. Delete any metadevice state database replicas that are on the 'bad' disk:

# metadb -d c1t0d0s4

# metadb -i (to make sure they have been deleted)

4. State of the submirrors:

The “metastat” command output reports that all submirrors on the bad disk are at a State of “Needs maintenance”. This indicates that DiskSuite has automatically disabled the submirrors, so there is no need to “metadetach” the submirrors.

Note: if any submirror on the bad disk is still in the "Okay" state, detach it first (with metadetach / metaclear).
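
To make the state check in this step quicker to read, here is a minimal sketch that reduces "metastat" output to one line per submirror. The function name submirror_states is hypothetical, and it assumes the usual layout in which each "Submirror N: dXX" line inside a mirror is followed by its own "State:" line.

```shell
# Print "<submirror>: <state>" for each submirror listed under a mirror.
# Reads `metastat` output on stdin.
submirror_states() {
    awk '
        /^[ \t]*Submirror [0-9]+:/ { name = $3; next }
        /^[ \t]*State:/ && name != "" {
            state = $0
            sub(/^[ \t]*State:[ \t]*/, "", state)   # drop the label
            sub(/[ \t]+$/, "", state)               # drop trailing blanks
            print name ": " state
            name = ""
        }'
}
```

Run it as "metastat | submirror_states". Any submirror on the bad disk that does not report "Needs maintenance" is the case the note above is about.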

5. Physically replace the failed hot-swappable disk.

6. Partition the new disk: The easiest way to do this is to copy the partition table from the good root mirror disk (c1t1d0s2) to the new disk (c1t0d0s2) with the following dd command: ( I would prefer to take a back up of

# dd if=/dev/rdsk/c1t1d0s2 of=/dev/rdsk/c1t0d0s2 count=16

# dd if=/dev/rdsk/(original good disk slice 2) of=/dev/rdsk/( new replaced disk slice 2) count=16

OR

# prtvtoc /dev/rdsk/c1t1d0s2 | fmthard -s - /dev/rdsk/c1t0d0s2

Verify the partition table was copied correctly using the format utility. Type "format", select the corresponding disk number from the disk selection menu, then type "p", then "p" again to view the partition table. Compare and make sure the partition tables match EXACTLY.
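
As a supplement to the visual check in format, the comparison can also be scripted against two saved "prtvtoc" listings. This is only a sketch: the function names vtoc_slices and vtoc_matches are hypothetical, and it assumes the standard prtvtoc layout in which comment lines start with "*" and each slice line carries six columns (partition, tag, flags, first sector, sector count, last sector) before any mount directory.

```shell
# Print only the six slice-definition columns of a saved prtvtoc listing,
# skipping "*" comment lines (which contain the device name and geometry).
vtoc_slices() {
    awk '!/^\*/ && NF >= 6 { print $1, $2, $3, $4, $5, $6 }' "$1"
}

# Succeed only if both listings describe identical slices.
vtoc_matches() {
    [ "$(vtoc_slices "$1")" = "$(vtoc_slices "$2")" ]
}
```

For example, save "prtvtoc /dev/rdsk/c1t1d0s2" and "prtvtoc /dev/rdsk/c1t0d0s2" to two files (the file names are up to you) and run vtoc_matches on them; a non-zero exit status means the tables differ.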

7. Run newfs on all the newly created slices, then run fsck to verify everything is OK.

8. Recreate the metadevice state database replicas that were deleted in step 3 from c1t0d0s4:

# metadb -a -f -c 3 /dev/dsk/c1t0d0s4

# metadb -i (verify the creation)

9. Re-enable the submirrors:

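Give each command the slice that actually backs that submirror (check the "metastat -p" output from step 1). The slice numbers shown for d3, d6 and d7 assume the slice number matches the mirror number.

# metareplace -e d0 c1t0d0s0 (d0 is / and c1t0d0s0 is the device associated w/ the d10 submirror)

# metareplace -e d1 c1t0d0s1 (d1 is swap and c1t0d0s1 is the device associated w/ the d11 submirror)

# metareplace -e d3 c1t0d0s3 (d3 is /var and c1t0d0s3 is the device associated w/ the d13 submirror)

# metareplace -e d6 c1t0d0s6 (d6 is /usr and c1t0d0s6 is the device associated w/ the d16 submirror)

# metareplace -e d7 c1t0d0s7 (d7 is /local and c1t0d0s7 is the device associated w/ the d17 submirror)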
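
Since it is easy to mistype a slice when repeating this command for every mirror, one option is to generate the commands from a short list and review them before running anything. The sketch below only prints commands; the function name print_metareplace_cmds and the "mirror slice-number" input format are assumptions made for illustration.

```shell
# Print (do not run) one metareplace command per "mirror slice" pair on stdin.
# $1 is the replaced disk, e.g. c1t0d0.
print_metareplace_cmds() {
    disk=$1
    while read -r mirror slice; do
        # Skip blank lines and "#" comments in the mapping.
        case $mirror in ''|'#'*) continue ;; esac
        echo "metareplace -e $mirror ${disk}s${slice}"
    done
}
```

For example, feeding it the two lines "d0 0" and "d1 1" with c1t0d0 as the argument prints the metareplace commands for / and swap, which can then be checked against "metastat -p" and run by hand.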

10. Run metastat | grep sync to check if disks are in sync.
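
If you would rather have a yes/no answer than read the grep output, a small helper can wrap the same check. This is a sketch under the assumption that metastat reports an active resync with text containing "resync" (for example a "Resyncing" state or a "Resync in progress" line); the function name resync_running is hypothetical.

```shell
# Exit status 0 while any line of `metastat` output on stdin mentions a resync.
# Output is discarded; only the exit status matters.
resync_running() {
    grep -i 'resync' >/dev/null
}

# Possible polling loop (commented out so that nothing runs by accident):
# while metastat | resync_running; do sleep 60; done; echo "resync finished"
```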

Issue I faced:

1> The swap mirror (d1) was still showing Maintenance even after metareplace ran successfully, and I was not able to detach the submirror. I then ran # metaclear d1 to clear it, recreated the mirror, and it worked fine.
