If we have 3 nodes, presumably each node will write a message to the voting disk as follows:
Node 1 writes : I can see Node 2 & 3
Node 2 writes : I can see Node 1 & 3
Node 3 writes : I can see Node 1 & 2
If, for example, Node 3's private network has a problem, the messages may become:
Node 1 writes : I can see Node 2 only
Node 2 writes : I can see Node 1 only
Node 3 writes : I cannot see either Node 1 or Node 2 (or it does not write anything at all)
In this situation, clearly Node 3 should be evicted from the cluster.
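As an aside, the time CSS waits on missing heartbeats before deciding to evict a node is controlled by the misscount setting, which can be queried with crsctl (a side check on my 10.2 stack; the default value is platform- and version-dependent, and disktimeout only appears in later 10.2 patchsets, if I remember right):
crsctl get css misscount     # network heartbeat timeout, in seconds
crsctl get css disktimeout   # voting disk I/O timeout (10.2.0.2 and later)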
To avoid a single point of failure, we can multiplex the voting disk. By design, the cluster is fine as long as strictly more than half of the voting disks are up and contain consistent information. That is to say, with 5 voting disks we can tolerate at most 2 voting disk failures.
So: number_of_voting_disks = number_of_tolerable_disk_failures * 2 + 1.
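As a quick sanity check of this formula (a trivial shell sketch of my own, not any Oracle tool):
# strictly more than half must survive, so tolerable failures = (disks - 1) / 2
for disks in 1 3 5; do
  echo "$disks voting disks tolerate $(( (disks - 1) / 2 )) failure(s)"
done
# prints 0, 1 and 2 failures respectively, consistent with the formula above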
This post documents my test of the following task regarding voting disk administration:
Task - Recover from the loss of voting disks
1. Check the current voting disk configuration
[oracle@rac1 backup]$ crsctl query css votedisk
0. 0 /dev/raw/raw6
1. 0 /dev/raw/raw7
2. 0 /dev/raw/raw8
located 3 votedisk(s).
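Besides the votedisk query, the health of the clusterware daemons themselves can be checked with crsctl subcommands that exist in 10.2 (output omitted here):
crsctl check crs    # overall check of the CSS, CRS and EVM daemons
crsctl check cssd   # CSS daemon only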
2. Backup voting disk
[oracle@rac1 backup]$ dd if=/dev/raw/raw6 of=/home/oracle/backup/votingdisk_050710
80325+0 records in
80325+0 records out
[oracle@rac1 backup]$
[oracle@rac1 backup]$ ls -lhtr
total 40M
-rw-r--r-- 1 oracle oinstall 40M May 7 16:23 votingdisk_050710
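Note that only the first voting disk is backed up above. In practice each voting disk should get its own backup, since devices may differ slightly in size (a sketch reusing the same dd approach and backup directory):
for i in 6 7 8; do
  dd if=/dev/raw/raw$i of=/home/oracle/backup/votingdisk_raw${i}_050710
done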
3. Wipe out the first voting disk
dd if=/dev/zero of=/dev/raw/raw6
Note: I have three voting disk files, and in my understanding the cluster should survive a single voting disk failure; however, rac1 and rac2 rebooted right after I issued this command. I don't know why. The alert log excerpts below record the instance crashes, and a few diagnostic starting points follow them.
--------------- RAC 1 alert log --------------------
Fri May 7 16:25:21 2010
Trace dumping is performing id=[cdmp_20100507162519]
Fri May 7 16:25:23 2010
Error: KGXGN aborts the instance (6)
Fri May 7 16:25:24 2010
Errors in file /u01/app/oracle/admin/devdb/bdump/devdb1_lmon_10476.trc:
ORA-29702: error occurred in Cluster Group Service operation
LMON: terminating instance due to error 29702
--------------- RAC 2 alert log --------------------
Fri May 7 16:25:19 2010
Error: KGXGN aborts the instance (6)
Fri May 7 16:25:19 2010
Error: unexpected error (6) from the Cluster Service (LCK0)
Fri May 7 16:25:19 2010
Errors in file /u01/app/oracle/admin/devdb/bdump/devdb2_lmon_3150.trc:
ORA-29702: error occurred in Cluster Group Service operation
Fri May 7 16:25:19 2010
Errors in file /u01/app/oracle/admin/devdb/bdump/devdb2_lck0_3236.trc:
ORA-29702: error occurred in Cluster Group Service operation
Fri May 7 16:25:19 2010
LMON: terminating instance due to error 29702
Fri May 7 16:25:21 2010
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/admin/devdb/bdump/devdb2_diag_3146.trc
Fri May 7 16:31:01 2010
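If I wanted to dig further into why the nodes rebooted outright instead of merely evicting, these are the first places I would look (paths from this CRS home; the search pattern is only a guess):
tail -100 /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log
grep -i reboot /var/log/messages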
4. Restart CRS stack
[oracle@rac1 ~]$ sudo $ORA_CRS_HOME/bin/crsctl stop crs
Password:
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
[oracle@rac1 ~]$
[oracle@rac1 ~]$ ssh rac2 sudo $ORA_CRS_HOME/bin/crsctl stop crs
Password:
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
[oracle@rac1 ~]$
[oracle@rac1 ~]$
[oracle@rac1 ~]$
[oracle@rac1 ~]$ ps -ef | grep d.bin
oracle 14672 30539 0 16:56 pts/1 00:00:00 grep d.bin
[oracle@rac1 ~]$ ssh rac2 ps -ef | grep d.bin
[oracle@rac1 ~]$ ./crsstat.sh
HA Resource Target State
----------- ------ -----
error connecting to CRSD at [(ADDRESS=(PROTOCOL=ipc)(KEY=ora_crsqs))] clsccon 184
[oracle@rac1 ~]$ sudo $ORA_CRS_HOME/bin/crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
[oracle@rac1 ~]$ ssh rac2 sudo $ORA_CRS_HOME/bin/crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly
[oracle@rac1 ~]$ ps -ef | grep d.bin
root 14242 1 0 16:54 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/crsd.bin reboot
oracle 15219 14240 2 16:58 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/evmd.bin
oracle 15383 15357 2 16:58 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/ocssd.bin
oracle 15602 30539 0 16:58 pts/1 00:00:00 grep d.bin
[oracle@rac1 ~]$ ssh rac2 ps -ef | grep d.bin
root 23610 1 0 16:56 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/crsd.bin reboot
oracle 24394 23609 2 16:58 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/evmd.bin
oracle 24575 24549 2 16:58 ? 00:00:00 /u01/app/oracle/product/10.2.0/crs_1/bin/ocssd.bin
[oracle@rac1 ~]$ ./crsstat.sh
HA Resource Target State
----------- ------ -----
ora.devdb.SLBA.cs OFFLINE OFFLINE
ora.devdb.SLBA.devdb1.srv OFFLINE OFFLINE
ora.devdb.SLBA.devdb2.srv OFFLINE OFFLINE
ora.devdb.SNOLBA.cs OFFLINE OFFLINE
ora.devdb.SNOLBA.devdb1.srv OFFLINE OFFLINE
ora.devdb.SNOLBA.devdb2.srv OFFLINE OFFLINE
ora.devdb.db ONLINE ONLINE on rac2
ora.devdb.devdb1.inst ONLINE ONLINE on rac1
ora.devdb.devdb2.inst ONLINE ONLINE on rac2
ora.rac1.ASM1.asm ONLINE ONLINE on rac1
ora.rac1.LISTENER_RAC1.lsnr ONLINE ONLINE on rac1
ora.rac1.gsd ONLINE ONLINE on rac1
ora.rac1.ons ONLINE ONLINE on rac1
ora.rac1.vip ONLINE ONLINE on rac1
ora.rac2.ASM2.asm ONLINE ONLINE on rac2
ora.rac2.LISTENER_RAC2.lsnr ONLINE ONLINE on rac2
ora.rac2.gsd ONLINE ONLINE on rac2
ora.rac2.ons ONLINE ONLINE on rac2
ora.rac2.vip ONLINE ONLINE on rac2
5. Check log
[oracle@rac1 rac1]$ tail alertrac1.log
2010-05-07 16:58:50.358
[cssd(15383)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 rac2 .
2010-05-07 16:58:54.722
[crsd(14242)]CRS-1201:CRSD started on node rac1.
2010-05-07 16:59:45.626
[cssd(15383)]CRS-1604:CSSD voting file is offline: /dev/raw/raw6. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
2010-05-07 17:00:47.657
[cssd(15383)]CRS-1604:CSSD voting file is offline: /dev/raw/raw6. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
2010-05-07 17:01:49.730
[cssd(15383)]CRS-1604:CSSD voting file is offline: /dev/raw/raw6. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
[oracle@rac1 rac1]$ tail /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log
[ CSSD]2010-05-07 17:00:16.337 [132250528] >TRACE: clssgmClientConnectMsg: Connect from con(0x8358600) proc(0x8389ac8) pid() proto(10:2:1:1)
[ CSSD]2010-05-07 17:00:22.183 [132250528] >TRACE: clssgmClientConnectMsg: Connect from con(0x8358600) proc(0x838eb90) pid() proto(10:2:1:1)
[ CSSD]2010-05-07 17:00:30.776 [132250528] >TRACE: clssgmClientConnectMsg: Connect from con(0x8358600) proc(0x838ee90) pid() proto(10:2:1:1)
[ CSSD]2010-05-07 17:00:47.657 [62401440] >TRACE: clssnmDiskStateChange: state from 3 to 3 disk (0//dev/raw/raw6)
[ CSSD]2010-05-07 17:01:07.263 [132250528] >TRACE: clssgmClientConnectMsg: Connect from con(0x8358600) proc(0x837e6b8) pid() proto(10:2:1:1)
[ CSSD]2010-05-07 17:01:09.009 [132250528] >TRACE: clssgmClientConnectMsg: Connect from con(0x8357dd8) proc(0x8379340) pid() proto(10:2:1:1)
[ CSSD]2010-05-07 17:01:49.730 [62401440] >TRACE: clssnmDiskStateChange: state from 3 to 3 disk (0//dev/raw/raw6)
[ CSSD]2010-05-07 17:02:09.984 [132250528] >TRACE: clssgmClientConnectMsg: Connect from con(0x835a580) proc(0x8365a50) pid() proto(10:2:1:1)
[ CSSD]2010-05-07 17:02:51.784 [62401440] >TRACE: clssnmDiskStateChange: state from 3 to 3 disk (0//dev/raw/raw6)
[ CSSD]2010-05-07 17:03:12.292 [132250528] >TRACE: clssgmClientConnectMsg: Connect from con(0x835a580) proc(0x8365a50) pid() proto(10:2:1:1)
Note: similar messages appear in rac2's alertrac2.log and ocssd.log. This shows that with three voting disk files, if one of them is unavailable, the RAC is still functioning.
6. Wipe out the second voting disk
dd if=/dev/zero of=/dev/raw/raw7
Both nodes rebooted right after issuing the above command. After the reboot, only evmd is running:
[oracle@rac1 ~]$ ps -ef | grep d.bin
oracle 8139 6985 3 17:12 ? 00:00:14 /u01/app/oracle/product/10.2.0/crs_1/bin/evmd.bin
oracle 11075 10255 0 17:20 pts/1 00:00:00 grep d.bin
7. Check log:
--- alertrac1.log shows two voting disk files are offline
2010-05-07 17:12:14.926
[cssd(8303)]CRS-1604:CSSD voting file is offline: /dev/raw/raw6. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
2010-05-07 17:12:15.099
[cssd(8303)]CRS-1604:CSSD voting file is offline: /dev/raw/raw7. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
2010-05-07 17:12:15.147
[cssd(8303)]CRS-1605:CSSD voting file is online: /dev/raw/raw8. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
[oracle@rac1 rac1]$
[oracle@rac1 crsd]$ tail crsd.log
2010-05-07 17:14:04.532: [ COMMCRS][36494240]clsc_connect: (0x8655528) no listener at (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_rac1_crs))
2010-05-07 17:14:04.532: [ CSSCLNT][3086931648]clsssInitNative: connect failed, rc 9
2010-05-07 17:14:04.533: [ CRSRTI][3086931648]0CSS is not ready. Received status 3 from CSS. Waiting for good status ..
2010-05-07 17:14:05.536: [ CRSMAIN][3086931648][PANIC]0CRSD exiting: Could not init the CSS context
2010-05-07 17:14:05.540: [ default][3086931648]Terminating clsd session
8. Restore Voting Disk
[oracle@rac1 ~]$ dd if=/home/oracle/backup/votingdisk_050710 of=/dev/raw/raw6
80325+0 records in
80325+0 records out
[oracle@rac1 ~]$ dd if=/home/oracle/backup/votingdisk_050710 of=/dev/raw/raw7
dd: writing to `/dev/raw/raw7': No space left on device
80263+0 records in
80262+0 records out
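The "No space left on device" error presumably means raw7 is slightly smaller than raw6, so the tail of the backup image did not fit; the voting data itself evidently fits, since CSS brings the disk online below. Per-device backups, as sketched under step 2, avoid this mismatch; alternatively, device sizes can be compared before restoring. Raw devices are bindings over block devices, so the size check goes against the underlying device (/dev/sdb1 here is a hypothetical example):
raw -qa                               # list which block device each /dev/raw/rawN is bound to
sudo blockdev --getsize64 /dev/sdb1   # size in bytes of the underlying block device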
9. Restart CRS
[oracle@rac1 ~]$ sudo $ORA_CRS_HOME/bin/crsctl start crs
Password:
Attempting to start CRS stack
The CRS stack will be started shortly
[oracle@rac1 ~]$ ssh rac2 sudo $ORA_CRS_HOME/bin/crsctl start crs
Password:
Attempting to start CRS stack
The CRS stack will be started shortly
---- in alertrac1.log --------------------------
[oracle@rac1 rac1]$ pwd
/u01/app/oracle/product/10.2.0/crs_1/log/rac1
[oracle@rac1 rac1]$ tail -15 alertrac1.log
[cssd(8303)]CRS-1605:CSSD voting file is online: /dev/raw/raw8. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
2010-05-07 17:29:31.679
[cssd(13301)]CRS-1605:CSSD voting file is online: /dev/raw/raw6. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
2010-05-07 17:29:31.714
[cssd(13301)]CRS-1605:CSSD voting file is online: /dev/raw/raw7. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
2010-05-07 17:29:31.729
[cssd(13301)]CRS-1605:CSSD voting file is online: /dev/raw/raw8. Details in /u01/app/oracle/product/10.2.0/crs_1/log/rac1/cssd/ocssd.log.
2010-05-07 17:29:35.433
[cssd(13301)]CRS-1601:CSSD Reconfiguration complete. Active nodes are rac1 rac2 .
2010-05-07 17:29:38.247
[crsd(8910)]CRS-1012:The OCR service started on node rac1.
2010-05-07 17:29:38.287
[evmd(13364)]CRS-1401:EVMD started on node rac1.
2010-05-07 17:31:02.432
[crsd(8910)]CRS-1201:CRSD started on node rac1.
---- CRS resources are online
[oracle@rac1 ~]$ ~/crsstat.sh
HA Resource Target State
----------- ------ -----
ora.devdb.SLBA.cs OFFLINE OFFLINE
ora.devdb.SLBA.devdb1.srv OFFLINE OFFLINE
ora.devdb.SLBA.devdb2.srv OFFLINE OFFLINE
ora.devdb.SNOLBA.cs OFFLINE OFFLINE
ora.devdb.SNOLBA.devdb1.srv OFFLINE OFFLINE
ora.devdb.SNOLBA.devdb2.srv OFFLINE OFFLINE
ora.devdb.db ONLINE ONLINE on rac2
ora.devdb.devdb1.inst ONLINE ONLINE on rac1
ora.devdb.devdb2.inst ONLINE ONLINE on rac2
ora.rac1.ASM1.asm ONLINE ONLINE on rac1
ora.rac1.LISTENER_RAC1.lsnr ONLINE ONLINE on rac1
ora.rac1.gsd ONLINE ONLINE on rac1
ora.rac1.ons ONLINE ONLINE on rac1
ora.rac1.vip ONLINE ONLINE on rac1
ora.rac2.ASM2.asm ONLINE ONLINE on rac2
ora.rac2.LISTENER_RAC2.lsnr ONLINE ONLINE on rac2
ora.rac2.gsd ONLINE ONLINE on rac2
ora.rac2.ons ONLINE ONLINE on rac2
ora.rac2.vip ONLINE ONLINE on rac2
[oracle@rac1 ~]$ date
Fri May 7 17:37:35 EDT 2010
In conclusion, with 3 voting disks my test did show that the RAC stays operational if one of them is offline; if two of them are unavailable, the CRS daemon cannot start at all. However, zeroing out one of the voting disks caused both servers to reboot, which is not desirable; I am not sure whether this is due to my particular environment.
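For reference, had no dd backup been available, 10.2 also allows reconfiguring the voting disks while the CRS stack is down on all nodes; -force is required precisely because CSS is not running (paths from my setup):
crsctl delete css votedisk /dev/raw/raw7 -force
crsctl add css votedisk /dev/raw/raw7 -force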