Monday, November 16, 2009

Vxplex DISABLED RECOVER state

After a NetApp 6080 hosting FCP LUNs failed this weekend, we came into the office to find that many of the servers using those LUNs had offline volumes and disk groups.


Here was the state of the volume in question:

v szdbor006du02 - DISABLED ACTIVE 2727606272 SELECT - fsgen
pl szdbor006du02-01 szdbor006du02 DISABLED RECOVER 2727606272 CONCAT - RW
sd szdbor006ddg01-01 szdbor006du02-01 szdbor006ddg01 0 209646560 0 c1t500A098187197B34d10 ENA
sd szdbor006ddg02-02 szdbor006du02-01 szdbor006ddg02 209648096 943459616 209646560 c1t500A098187197B34d11 ENA
sd szdbor006ddg03-01 szdbor006du02-01 szdbor006ddg03 0 1153107712 1153106176 c3t500A098287197B34d15 ENA
sd szdbor006ddg04-01 szdbor006du02-01 szdbor006ddg04 0 421392384 2306213888 c1t500A09828759382Fd50 ENA
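
With many servers affected, it can help to pick the stuck plexes out of the listing automatically. The following is only a minimal sketch: it assumes the listing comes from something like vxprint -ht and that the kernel state and state are the 4th and 5th fields of a "pl" record, as in the output above. The function name find_recover_plexes is made up for illustration.

```shell
# List plexes reported as DISABLED RECOVER.
# Reads vxprint-style records on stdin and prints "plex volume" pairs.
# Assumption: field layout matches the "pl" line shown above
# (pl NAME VOLUME KSTATE STATE ...).
find_recover_plexes() {
    awk '$1 == "pl" && $4 == "DISABLED" && $5 == "RECOVER" { print $2, $3 }'
}

# Example (needs VxVM; "diskgroup" is a placeholder name):
# vxprint -g diskgroup -ht | find_recover_plexes
```

Fed the listing above, this would print the plex szdbor006du02-01 together with its volume szdbor006du02.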

I ran vxrecover against the volume and the plex, but the state never changed, and I could not find a recovery task with either ps or vxtask list. My guess is that the recovery task was somehow confused. Here is what I needed to do to fix it:
vxplex -g diskgroup det szdbor006du02-01
This put the plex into the DETACHED STALE state.
vxmend -g diskgroup fix clean szdbor006du02-01
This put the plex into the DETACHED CLEAN state, at which point I could run:
vxvol -g diskgroup startall
(I could have just given the volume name as well.)
This enabled and started the volume. I then ran fsck and remounted the file system.
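
If you have to repeat this on several volumes, the three commands can be wrapped up as below. Treat it as a rough sketch rather than a tested procedure: the helper names run and recover_plex and the DRY_RUN switch are my own inventions, and it uses vxvol start with the volume name in place of startall. The fsck and remount are left as manual steps.

```shell
# Sketch of the detach / fix clean / start sequence described above.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: recover_plex DISKGROUP VOLUME PLEX
recover_plex() {
    dg=$1
    vol=$2
    plex=$3
    # Stop at the first command that fails.
    run vxplex -g "$dg" det "$plex" &&
    run vxmend -g "$dg" fix clean "$plex" &&
    run vxvol -g "$dg" start "$vol"
}

# Example, dry run only ("diskgroup" is a placeholder name):
# DRY_RUN=1 recover_plex diskgroup szdbor006du02 szdbor006du02-01
```

Doing a dry run first lets you check the exact commands before touching a plex, which matters because vxmend fix clean tells VxVM to trust the data on that plex.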

Now to figure out why exactly the FAS6080 crashed just because of an HBA hiccup.
Hope this may be useful if you ever run into the same scenario.
