Monday, October 18, 2010

AIX: ASM Metadata & DISKS

lspv: lists the physical disks with each disk's name, assigned PVID, volume group, and state.


$lspv
hdisk0          00c2e2503cddd971                    rootvg          active
hdisk9          00c2e2508b71c3a9                    sw_vg           active
hdisk1          none                                None          
hdisk2          none                                None          
hdisk3          none                                None          
...
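The listing above can be filtered to show only the disks that have neither a PVID nor a volume group. This is a minimal sketch that assumes the column layout shown above; unassigned_disks is a hypothetical helper name, not an AIX command. A disk printed here may already belong to ASM, so it is not necessarily free.

```shell
# unassigned_disks: hypothetical helper, reads lspv output on stdin and
# prints the names of disks with no PVID and no volume group.
unassigned_disks() {
    awk '$2 == "none" && $3 == "None" { print $1 }'
}
```

It would be used as: lspv | unassigned_disks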

ASM disks should not have a PVID assigned to them (Configuring Storage for Grid Infrastructure, 3.3.3 Configuring Disk Devices for Oracle ASM, step 7). Clear the PVID with the following command before the disk is assigned to ASM:

$/usr/sbin/chdev -l hdiskn -a pv=clear

Note: If you have already assigned disks to ASM without clearing the PVID, do not clear it with the command above. The PVID lives in the same header area where ASM keeps its internal information, so clearing it corrupts the ASM disk. ASM actually overwrites the header, including the PVID, with its own header when a disk is added to a diskgroup. However, because the OS ODM (AIX's internal database) is not updated, lspv still shows the PVID even though it is no longer on the disk.

With the following command you can dump the disk header to see the ASM internal information written on the disk (requires root permission):


$lquerypv -h /dev/rhdisk5
00000000 00820101 00000000 80000000 81AA935B |...............[|
00000010 00000000 00000000 00000000 00000000 |................|
00000020 4F52434C 4449534B 00000000 00000000 |ORCLDISK........|
00000030 00000000 00000000 00000000 00000000 |................|
00000040 0A100000 00000103 54455354 5F303030 |........TEST_000|
00000050 30000000 00000000 00000000 00000000 |0...............|
00000060 00000000 00000000 54455354 00000000 |........TEST....|
00000070 00000000 00000000 00000000 00000000 |................|
00000080 00000000 00000000 54455354 5F303030 |........TEST_000|
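Because clearing the PVID on a disk that ASM already uses destroys its header, it can help to look for the ORCLDISK marker in the dump first. The following is a minimal sketch; has_asm_header and safe_to_clear_pvid are hypothetical helper names, and the check only inspects the dump text. A missing marker does not prove the disk is unused.

```shell
# has_asm_header: hypothetical helper, reads `lquerypv -h` output on
# stdin and succeeds (exit 0) if the ASM marker ORCLDISK is present.
has_asm_header() {
    grep -q 'ORCLDISK'
}

# safe_to_clear_pvid: hypothetical helper, prints a verdict for the dump
# on stdin. It only reads text and never runs chdev itself.
safe_to_clear_pvid() {
    if has_asm_header; then
        echo "ASM header found: do not clear the PVID"
        return 1
    fi
    echo "no ASM header found"
}
```

It would be used as: lquerypv -h /dev/rhdisk5 | safe_to_clear_pvid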


In 11g you can back up ASM metadata with md_backup and use md_restore to recover a diskgroup whose disk headers were corrupted. The corrupted diskgroup must be specified with the '-g' flag for the restore to complete successfully:

ASMCMD> md_restore -b /oracle/backup/asm_metadata020409.bkp -g 'TEST'

Ref:
11gRAC_ASM_1.pdf (doc link)

How to Drop an ASM Diskgroup That Cannot Be Mounted Because Its Disks Are Corrupted

In a test environment we accidentally corrupted ASM disks and reset their headers. When I queried the V$ASM_DISK view, I saw that the disks were discovered as "CANDIDATE", as if they were newly added disks available to ASM. The problem was that the diskgroup which had been using these disks still existed, but I could not drop it. A diskgroup has to be mounted in order to be dropped, and because its disks were corrupted I could not mount it:

SQL> drop diskgroup DATA;
drop diskgroup DATA
*
ERROR at line 1:
ORA-15039: diskgroup not dropped
ORA-15001: diskgroup "DATA" does not exist or is not mounted


SQL> ALTER DISKGROUP DATA mount;
ALTER DISKGROUP DATA mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15017: diskgroup "DATA" cannot be mounted
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

How can I drop a diskgroup that I cannot mount?

The solution is to create a new diskgroup and add all of the old diskgroup's disks to it. Once they have all been added, the old diskgroup disappears. Then you can drop the new diskgroup.

create diskgroup DATA_DROP external redundancy disk '/dev/rhdisk#' force;
alter diskgroup DATA_DROP add disk '/dev/rhdisk#';
-- repeat the ALTER for each remaining disk of DATA;
-- DATA disappears once all of its disks have been added to DATA_DROP
drop diskgroup DATA_DROP including contents;

Now the DATA diskgroup is gone and all its disks are in "FORMER" status.
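When the old diskgroup had many disks, the statements above can be generated from a list of device paths instead of being typed one by one. This is only a sketch: gen_drop_sql is a hypothetical helper name, it reproduces the statements exactly as written in this post (including DATA_DROP and the force keyword on the first disk), and it prints the SQL for review rather than running it.

```shell
# gen_drop_sql: hypothetical helper. Given raw disk paths as arguments,
# prints the create / alter / drop statements used in this post.
gen_drop_sql() {
    first=1
    for disk in "$@"; do
        if [ "$first" -eq 1 ]; then
            # The first disk creates the temporary diskgroup.
            printf "create diskgroup DATA_DROP external redundancy disk '%s' force;\n" "$disk"
            first=0
        else
            # Every other disk is added to it.
            printf "alter diskgroup DATA_DROP add disk '%s';\n" "$disk"
        fi
    done
    # Emit the final drop only if at least one disk was given.
    if [ "$first" -eq 0 ]; then
        printf 'drop diskgroup DATA_DROP including contents;\n'
    fi
}
```

The output can be saved to a file, checked by hand, and then run in SQL*Plus on the ASM instance.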