Friday, April 30, 2010

Execution Plan of a Subquery inside CASE statement

I found that in our 9.2.0.8 database the execution plan of a subquery inside CASE statement can not be seen in the SQL_TRACE/TKPROF report or from the excution plan extracted from the shared pool. However, it can be seen from Autotrace explain plan. I constructed a test case to demonstrate this as follows:

rem script: xplan_diff.sql
rem   execution plan for the subquery in the CASE statement not shown in SQL_TRACE/TKPROF in 9i
rem
rem

spool xplan_diff.log
set echo on

drop table t;
drop table t2;

create table t 
as 
select rownum id,
       'GOOD' val
from all_objects
where rownum <=20;
 
update t set val='BAD' where mod(id,2) = 0;
commit;

create table t2
as
select rownum id,
       'TEST' val
from all_objects
where rownum <=20; 


alter session set tracefile_identifier = test;
alter session set timed_statistics=true;
alter session set events '10046 trace name context forever, level 12';
set autotrace traceonly
select id,
       case when  val = 'GOOD' then 
  (select t2.val 
  from t, t2
  where t.id = t2.id
    and rownum = 1
         )
       else 'NO TEST'
       END  as IS_TEST
from t;
set autotrace off
alter session set events '10046 trace name context forever, level 12';
spool off
exit;


set doc off
doc

---- 9.2.0.8 Autotrace xplan  ---------------------------

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   COUNT (STOPKEY)
   2    1     MERGE JOIN
   3    2       SORT (JOIN)
   4    3         TABLE ACCESS (FULL) OF 'T2'
   5    2       SORT (JOIN)
   6    5         TABLE ACCESS (FULL) OF 'T'
   7    0   TABLE ACCESS (FULL) OF 'T'


----- 9.2.0.8  tkprof  -----------------------------------

Rows     Row Source Operation
-------  ---------------------------------------------------
     20  TABLE ACCESS FULL T
 
 

------ 10.2.0.4  Autotrace  xplan -------------------------------------------
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |    20 |   380 |     2   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY      |      |       |       |            |          |
|*  2 |   HASH JOIN         |      |    20 |   640 |     5  (20)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T    |    20 |   260 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| T2   |    20 |   380 |     2   (0)| 00:00:01 |
|   5 |  TABLE ACCESS FULL  | T    |    20 |   380 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter(ROWNUM=1)
   2 - access("T"."ID"="T2"."ID")

------ 10.2.0.4  tkprof ------------
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  COUNT STOPKEY (cr=6 pr=0 pw=0 time=1435 us)
      1   HASH JOIN  (cr=6 pr=0 pw=0 time=1417 us)
     20    TABLE ACCESS FULL T (cr=3 pr=0 pw=0 time=109 us)
      1    TABLE ACCESS FULL T2 (cr=3 pr=0 pw=0 time=64 us)
     20  TABLE ACCESS FULL T (cr=4 pr=0 pw=0 time=72 us)

#


Note: in 10.2.0.4 no such problem.


I was puzzled the other day about a similar sql to that in the above test case in our 9i  production database, which has two cursors with very different  sql_plan_hash_value and gets_per_exections when I checked it from v$sql. But when I pulled the execution plan out from v$sql_plan, the execution plan from both cursors are exactly same. Now I understand the subquery execution plan parts are missing. The different cost is due to the join order swapped in the subquery.

Wednesday, April 28, 2010

Play with my toy 10g RAC V - Move OCR and voting disk from OCFS2 to RAW device

My toy 10g RAC was set up based on Vincent Chan's article originally, in which OCFS2 filesystem was used for the shared OCR and votingdisk files.  In this post, I document steps I have taken to move them to RAW devices.

Steps:

1. Create a new shared virtual SCSI hard disk

a. Add a virtual disk (3GB) for RAC1

b. In the "Add Hardward Wizard" and "Hard Disk" windows for RAC2, Select the following option:
Use an Existing Virtual Disk - Choose this option to reuse or share a hard disk from another Virtual Machine

2. Modify virtual machine configuration file

Additional parameters are required to enable disk sharing between the two virtual RAC nodes. (Details see Vincent's article). Additonal reference is here.

3. Identify the device name for the disk

In RAC1 and RAC2, login as root, run command: fdisk -l, the following message can be seen:

Disk /dev/sdf: 3221 MB, 3221225472 bytes
255 heads, 63 sectors/track, 391 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdf doesn't contain a valid partition table


4. Create disk partitions
General reference about fdisk can be seen here.

At the end, I created partitions for /dev/sdf shown as follows:

Disk /dev/sdf: 3221 MB, 3221225472 bytes
255 heads, 63 sectors/track, 391 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdf1               1          63      506016   83  Linux
/dev/sdf2              64         126      506047+  83  Linux
/dev/sdf3             127         131       40162+  83  Linux
/dev/sdf4             132         391     2088450    5  Extended
/dev/sdf5             132         136       40131   83  Linux
/dev/sdf6             137         141       40131   83  Linux
/dev/sdf7             142         391     2008093+  83  Linux



Note: in reality, OCR, OCRMIRROR and each votingdisk file should be placed on seperate physical disks. Here I placed them in the same disk but different partitions. This is just for learning purpose.

5. Binding the partition
To map the raw devices to the shared partitions created, I did the following:

a. Edit /etc/sysconfig/rawdevices on both nodes, add the following:

/dev/raw/raw4 /dev/sdf1             
/dev/raw/raw5 /dev/sdf2             
/dev/raw/raw6 /dev/sdf3             
/dev/raw/raw7 /dev/sdf5             
/dev/raw/raw8 /dev/sdf6             
/dev/raw/raw9 /dev/sdf7             

b. Issue: /sbin/service rawdevices restart

root@rac1:/dev/raw [devdb1]# /sbin/service rawdevices restart
Assigning devices:
           /dev/raw/raw1  -->   /dev/sdc1
/dev/raw/raw1:  bound to major 8, minor 33
           /dev/raw/raw2  -->   /dev/sdd1
/dev/raw/raw2:  bound to major 8, minor 49
           /dev/raw/raw3  -->   /dev/sde1
/dev/raw/raw3:  bound to major 8, minor 65
           /dev/raw/raw4  -->   /dev/sdf1
/dev/raw/raw4:  bound to major 8, minor 81
           /dev/raw/raw5  -->   /dev/sdf2
/dev/raw/raw5:  bound to major 8, minor 82
           /dev/raw/raw6  -->   /dev/sdf3
/dev/raw/raw6:  bound to major 8, minor 83
           /dev/raw/raw7  -->   /dev/sdf5
/dev/raw/raw7:  bound to major 8, minor 85
           /dev/raw/raw8  -->   /dev/sdf6
/dev/raw/raw8:  bound to major 8, minor 86
           /dev/raw/raw9  -->   /dev/sdf7
/dev/raw/raw9:  bound to major 8, minor 87
done


Note: when issuing above command in RAC2, I recieve message: /dev/sdfn (n=1-7) is not a block device.
After reboot both node, everything is fine:

root@rac1:~ [devdb1]# raw -qa
/dev/raw/raw1:  bound to major 8, minor 33
/dev/raw/raw2:  bound to major 8, minor 49
/dev/raw/raw3:  bound to major 8, minor 65
/dev/raw/raw4:  bound to major 8, minor 81
/dev/raw/raw5:  bound to major 8, minor 82
/dev/raw/raw6:  bound to major 8, minor 83
/dev/raw/raw7:  bound to major 8, minor 85
/dev/raw/raw8:  bound to major 8, minor 86
/dev/raw/raw9:  bound to major 8, minor 87

[root@rac2 ~]# raw -qa
/dev/raw/raw1:  bound to major 8, minor 33
/dev/raw/raw2:  bound to major 8, minor 49
/dev/raw/raw3:  bound to major 8, minor 65
/dev/raw/raw4:  bound to major 8, minor 81
/dev/raw/raw5:  bound to major 8, minor 82
/dev/raw/raw6:  bound to major 8, minor 83
/dev/raw/raw7:  bound to major 8, minor 85
/dev/raw/raw8:  bound to major 8, minor 86
/dev/raw/raw9:  bound to major 8, minor 87

c. Edit /etc/udev/permissions.d/50-udev.permissions, add the following entries, and reboot

raw/raw1:oracle:dba:0660
raw/raw2:oracle:dba:0660
raw/raw3:oracle:dba:0660
raw/raw4:root:dba:0640
raw/raw5:root:dba:0640
raw/raw6:oracle:dba:0640
raw/raw7:oracle:dba:0640
raw/raw8:oracle:dba:0640
raw/raw9:oracle:dba:0660

Note: my intended use of raw devices is raw4 and raw5 for OCR and OCRMIRROR; raw6, raw7, raw8 for voting disk files.

6. Moving OCR file from OCFS to raw devices
a. shutdown CRS resource
b. Determine current OCR configuration:

[oracle@rac2 ~]$ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       5364
         Available space (kbytes) :     256780
         ID                       :  645781380
         Device/File Name         : /ocfs/clusterware/ocr
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

c. Relocate ocr and add ocr mirror

root@rac1:~ [devdb1]# ocrconfig -replace ocrmirror /dev/raw/raw5
root@rac1:~ [devdb1]# ocrconfig -replace ocr /dev/raw/raw4

Note: if ocrmirror is not added first, the following errors will be seen:

root@rac1:~ [devdb1]# ocrconfig -replace ocr /dev/raw/raw4
PROT-16: Internal Error

d. Verify and start CRS resources

root@rac1:/ocfs/clusterware [devdb1]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     505928
         Used space (kbytes)      :       5364
         Available space (kbytes) :     500564
         ID                       :  645781380
         Device/File Name         : /dev/raw/raw4
                                    Device/File needs to be synchronized with the other device
         Device/File Name         : /dev/raw/raw5
                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded

[May 4,2010, update: I did not notice "Device/File needs to be synchronized  ..." in the above ocrcheck output on Apr 28. CRS processes could be started despite of this. But today I have to fix this to get CRS started. see http://oracle-study-notes.blogspot.com/2010/05/play-with-my-toy-10g-rac-vi-failed-to.html ]

7. Moving voting disk files from OCFS to raw devices

a. Determine the current voting disk location

oracle@rac1:~ [devdb1]$ crsctl query css votedisk
 0.     0    /ocfs/clusterware/votingdisk

located 1 votedisk(s).

b. Shutdown Oracle Clusterware on both node;

oracle@rac1:~ [devdb1]$ sudo $ORA_CRS_HOME/bin/crsctl stop crs
Password:
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.
oracle@rac1:~ [devdb1]$ ssh rac2 sudo $ORA_CRS_HOME/bin/crsctl stop crs
Password:vz123ys

Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.


c. Take a backup of all voting disk:
oracle@rac1:~ [devdb1]$ dd if=/ocfs/clusterware/votingdisk of=/home/oracle/backup/votingdisk_042810
20000+0 records in
20000+0 records out

d. Move

root@rac1:~ [devdb1]# crsctl delete votedisk css /ocfs/clusterware/votingdisk -force
Unexpected parameter: votedisk
root@rac1:~ [devdb1]# crsctl delete css votedisk /ocfs/clusterware/votingdisk -force
only one votedisk /ocfs/clusterware/votingdisk configured.
please add another disk before removing the last one.
root@rac1:~ [devdb1]# crsctl add  css votedisk  /dev/raw/raw7 -force
Now formatting voting disk: /dev/raw/raw7
successful addition of votedisk /dev/raw/raw7.
root@rac1:~ [devdb1]# crsctl delete css votedisk /ocfs/clusterware/votingdisk -force
successful deletion of votedisk /ocfs/clusterware/votingdisk.
root@rac1:~ [devdb1]# crsctl add  css votedisk  /dev/raw/raw6 -force
Now formatting voting disk: /dev/raw/raw6
successful addition of votedisk /dev/raw/raw6.
root@rac1:~ [devdb1]# crsctl add  css votedisk  /dev/raw/raw8 -force
Now formatting voting disk: /dev/raw/raw8
successful addition of votedisk /dev/raw/raw8.
root@rac1:~ [devdb1]# crsctl query css votedisk
 0.     0    /dev/raw/raw6
 1.     0    /dev/raw/raw7
 2.     0    /dev/raw/raw8

located 3 votedisk(s).


e. Start Oracle Clusterware and verify everything is ok

oracle@rac1:~ [devdb1]$ sudo $ORA_CRS_HOME/bin/crsctl start crs
Password:
Attempting to start CRS stack
The CRS stack will be started shortly
oracle@rac1:~ [devdb1]$ ssh rac2 sudo $ORA_CRS_HOME/bin/crsctl start crs
Password:vz123ys

Attempting to start CRS stack
The CRS stack will be started shortly

--- Wait for several min here  -----------

oracle@rac1:~ [devdb1]$
oracle@rac1:~ [devdb1]$ ./crs_rep.sh
HA Resource                                   Target     State
-----------                                   ------     -----
ora.devdb.SLBA.cs                             ONLINE     ONLINE on rac1
ora.devdb.SLBA.devdb1.srv                     ONLINE     ONLINE on rac1
ora.devdb.SLBA.devdb2.srv                     ONLINE     ONLINE on rac2
ora.devdb.SNOLBA.cs                           ONLINE     ONLINE on rac1
ora.devdb.SNOLBA.devdb1.srv                   ONLINE     ONLINE on rac1
ora.devdb.SNOLBA.devdb2.srv                   ONLINE     ONLINE on rac2
ora.devdb.db                                  ONLINE     ONLINE on rac2
ora.devdb.devdb1.inst                         ONLINE     ONLINE on rac1
ora.devdb.devdb2.inst                         ONLINE     ONLINE on rac2
ora.rac1.ASM1.asm                             ONLINE     ONLINE on rac1
ora.rac1.LISTENER_RAC1.lsnr                   ONLINE     ONLINE on rac1
ora.rac1.gsd                                  ONLINE     ONLINE on rac1
ora.rac1.ons                                  ONLINE     ONLINE on rac1
ora.rac1.vip                                  ONLINE     ONLINE on rac1
ora.rac2.ASM2.asm                             ONLINE     ONLINE on rac2
ora.rac2.LISTENER_RAC2.lsnr                   ONLINE     ONLINE on rac2
ora.rac2.gsd                                  ONLINE     ONLINE on rac2
ora.rac2.ons                                  ONLINE     ONLINE on rac2
ora.rac2.vip                                  ONLINE     ONLINE on rac2
oracle@rac1:~ [devdb1]$ crsctl query css votingdisk
Unexpected parameter: votingdisk
oracle@rac1:~ [devdb1]$ crsctl query css votedisk
 0.     0    /dev/raw/raw6
 1.     0    /dev/raw/raw7
 2.     0    /dev/raw/raw8

located 3 votedisk(s).

Wednesday, April 21, 2010

Play with my toy 10g RAC IV - Test two OCR adminstration tasks

The OCR contains information about the cluster node list, instance-to-node mapping information, and information about Oracle Clusterware resource profiles for applications that may have been customized

Task 1 - Mirror the OCR

Oracle RAC environments do not support more than two OCRs, at most a primary OCR and a second OCR. As my OCR is on the OCFS2 filesystem, I need to first create a new OCR file to complete the task of mirroring the OCR

1). Verify I don't have ocrmirror:

root@rac1:~ [devdb1]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       5348
         Available space (kbytes) :     256796
         ID                       :  645781380
         Device/File Name         : /ocfs/clusterware/ocr
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

2) Create a new OCR file:

dd if=/dev/zero of=/ocfs/clusterware/ocrmirror.dbf bs=1M count=128

root@rac1:/ocfs/clusterware [devdb1]# dd if=/dev/zero of=/ocfs/clusterware/ocrmirror.dbf bs=1M count=128
128+0 records in
128+0 records out
root@rac1:/ocfs/clusterware [devdb1]# ls -lh
total 144M
-rw-r-----  1 root   oinstall 5.5M Apr 15 10:15 ocr
-rw-r--r--  1 root   root     128M Apr 20 16:10 ocrmirror.dbf
-rw-r--r--  1 oracle oinstall 9.8M Apr 20 16:10 votingdisk


3) Add the OCRMIRROR

root@rac1:/ocfs/clusterware [devdb1]# ocrconfig -replace ocrmirror /ocfs/clusterware/ocrmirror.dbf

root@rac1:/ocfs/clusterware [devdb1]# ls -lhtr
total 397M
-rw-r-----  1 root   oinstall 5.5M Apr 15 10:15 ocr
-rw-r--r--  1 root   root     381M Apr 20 16:13 ocrmirror.dbf
-rw-r--r--  1 oracle oinstall 9.8M Apr 20 16:14 votingdisk


Note: the size of ocrmirror.dbf becomes 381M, this is quite unusal.


4) Verify:

root@rac1:/ocfs/clusterware [devdb1]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       5348
         Available space (kbytes) :     256796
         ID                       :  645781380
         Device/File Name         : /ocfs/clusterware/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /ocfs/clusterware/ocrmirror.dbf
                                    Device/File integrity check succeeded

         Cluster registry integrity check succeeded


5) Remove OCRMRROR if we want

root@rac1:/ocfs/clusterware [devdb1]# ocrconfig -replace ocrmirror
root@rac1:/ocfs/clusterware [devdb1]# ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     262144
         Used space (kbytes)      :       5348
         Available space (kbytes) :     256796
         ID                       :  645781380
         Device/File Name         : /ocfs/clusterware/ocr
                                    Device/File integrity check succeeded

                                    Device/File not configured

         Cluster registry integrity check succeeded

root@rac1:/ocfs/clusterware [devdb1]# rm ocrmirror.dbf
rm: remove regular file `ocrmirror.dbf'? yes
root@rac1:/ocfs/clusterware [devdb1]#
root@rac1:/ocfs/clusterware [devdb1]# df -h .
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             512M   90M  423M  18% /ocfs




Task 2 - Backup and restore an OCR file

1) For security purpose, generate a logical OCR backup file

[oracle@rac1 ~]$ sudo /u01/app/oracle/product/10.2.0/crs_1/bin/ocrconfig -export ./backup/logicalocrbak
Password:
[oracle@rac1 ~]$ ls -lh ./backup/
total 128K
-rw-r--r--  1 root root 122K Apr 21 15:16 logicalocrbak


2) Delete the OCR file to simulate lost of OCR

root@rac1:/ocfs/clusterware [devdb1]# ls -lhtr
total 138M
-rw-r--r--  1 root   root     128M Apr 21 15:17 ocr
-rw-r--r--  1 oracle oinstall 9.8M Apr 21 15:17 votingdisk
root@rac1:/ocfs/clusterware [devdb1]# rm -f ocr
root@rac1:/ocfs/clusterware [devdb1]# ls -lhtr
total 9.8M
-rw-r--r--  1 oracle oinstall 9.8M Apr 21 15:17 votingdisk

[oracle@rac1 ~]$ ocrcheck
PROT-602: Failed to retrieve data from the cluster registry


3) Locate a physical backup of OCR

[oracle@rac1 ~]$ ocrconfig -showbackup

rac2     2010/04/20 17:58:54     /u01/app/oracle/product/10.2.0/crs_1/cdata/crs

rac2     2010/04/19 19:47:53     /u01/app/oracle/product/10.2.0/crs_1/cdata/crs

rac2     2010/04/19 15:47:51     /u01/app/oracle/product/10.2.0/crs_1/cdata/crs

rac2     2010/04/19 11:47:47     /u01/app/oracle/product/10.2.0/crs_1/cdata/crs

rac2     2010/04/19 15:47:51     /u01/app/oracle/product/10.2.0/crs_1/cdata/crs
[oracle@rac1 ~]$ ssh rac2 ls -lhtr /u01/app/oracle/product/10.2.0/crs_1/cdata/crs
total 37M
-rw-r--r--  1 root root 4.6M Apr 12 21:14 week_.ocr
-rw-r--r--  1 root root 5.4M Apr 19 11:47 day.ocr
-rw-r--r--  1 root root 5.4M Apr 19 15:47 backup02.ocr
-rw-r--r--  1 root root 5.4M Apr 19 15:47 week.ocr
-rw-r--r--  1 root root 5.4M Apr 19 19:47 backup01.ocr
-rw-r--r--  1 root root 5.4M Apr 20 17:58 backup00.ocr
-rw-r--r--  1 root root 5.4M Apr 20 17:58 day_.ocr


Note: The Oracle Clusterware automatically creates OCR backups every four hours. Default location: CRS_home/cdata/cluster_name

4) Stop CRS resources on both nodes

[oracle@rac1 ~]$ crs_stop -all
Attempting to stop `ora.devdb.SNOLBA.devdb1.srv` on member `rac1`
Attempting to stop `ora.devdb.SLBA.devdb1.srv` on member `rac1`
Attempting to stop `ora.devdb.SNOLBA.cs` on member `rac1`
Attempting to stop `ora.rac1.gsd` on member `rac1`
Attempting to stop `ora.devdb.SLBA.cs` on member `rac1`
Attempting to stop `ora.devdb.SLBA.devdb2.srv` on member `rac2`
Attempting to stop `ora.devdb.SNOLBA.devdb2.srv` on member `rac2`
Stop of `ora.rac1.gsd` on member `rac1` succeeded.
Stop of `ora.devdb.SLBA.devdb2.srv` on member `rac2` succeeded.
Stop of `ora.devdb.SLBA.devdb1.srv` on member `rac1` succeeded.
Stop of `ora.devdb.SNOLBA.devdb2.srv` on member `rac2` succeeded.
Stop of `ora.devdb.SNOLBA.devdb1.srv` on member `rac1` succeeded.
Attempting to stop `ora.rac1.ons` on member `rac1`
Attempting to stop `ora.rac2.gsd` on member `rac2`
Attempting to stop `ora.rac2.ons` on member `rac2`
Stop of `ora.rac2.gsd` on member `rac2` succeeded.
Attempting to stop `ora.devdb.db` on member `rac2`
Stop of `ora.rac1.ons` on member `rac1` succeeded.
Stop of `ora.rac2.ons` on member `rac2` succeeded.
Stop of `ora.devdb.SLBA.cs` on member `rac1` succeeded.
Stop of `ora.devdb.SNOLBA.cs` on member `rac1` succeeded.
Stop of `ora.devdb.db` on member `rac2` succeeded.
Attempting to stop `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Attempting to stop `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2`
Stop of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
`ora.devdb.devdb1.inst` is already OFFLINE.
Attempting to stop `ora.rac1.ASM1.asm` on member `rac1`
Stop of `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2` succeeded.
`ora.devdb.devdb2.inst` is already OFFLINE.
Attempting to stop `ora.rac2.ASM2.asm` on member `rac2`
Stop of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Attempting to stop `ora.rac1.vip` on member `rac1`
Stop of `ora.rac2.ASM2.asm` on member `rac2` succeeded.
Stop of `ora.rac1.vip` on member `rac1` succeeded.
Attempting to stop `ora.rac2.vip` on member `rac2`
Stop of `ora.rac2.vip` on member `rac2` succeeded.
CRS-0216: Could not stop resource 'ora.devdb.devdb1.inst'.

CRS-0216: Could not stop resource 'ora.devdb.devdb2.inst'.

[oracle@rac1 ~]$ ./crs_rep.sh
HA Resource                                   Target     State
-----------                                   ------     -----
ora.devdb.SLBA.cs                             OFFLINE    OFFLINE
ora.devdb.SLBA.devdb1.srv                     OFFLINE    OFFLINE
ora.devdb.SLBA.devdb2.srv                     OFFLINE    OFFLINE
ora.devdb.SNOLBA.cs                           OFFLINE    OFFLINE
ora.devdb.SNOLBA.devdb1.srv                   OFFLINE    OFFLINE
ora.devdb.SNOLBA.devdb2.srv                   OFFLINE    OFFLINE
ora.devdb.db                                  OFFLINE    OFFLINE
ora.devdb.devdb1.inst                         OFFLINE    OFFLINE
ora.devdb.devdb2.inst                         OFFLINE    OFFLINE
ora.rac1.ASM1.asm                             OFFLINE    OFFLINE
ora.rac1.LISTENER_RAC1.lsnr                   OFFLINE    OFFLINE
ora.rac1.gsd                                  OFFLINE    OFFLINE
ora.rac1.ons                                  OFFLINE    OFFLINE
ora.rac1.vip                                  OFFLINE    OFFLINE
ora.rac2.ASM2.asm                             OFFLINE    OFFLINE
ora.rac2.LISTENER_RAC2.lsnr                   OFFLINE    OFFLINE
ora.rac2.gsd                                  OFFLINE    OFFLINE
ora.rac2.ons                                  OFFLINE    OFFLINE
ora.rac2.vip                                  OFFLINE    OFFLINE



5) Stop CRS on both node

The following command has failed due to OCR was deleted:

[oracle@rac1 ~]$ sudo $ORA_CRS_HOME/bin/crsctl stop crs
Password:
OCR initialization failed accessing OCR device: PROC-26: Error while accessing the physical storage Operating System error [No such file or directory] [2]

Note: normally we should see:

[oracle@rac1 ~]$ ssh rac2 sudo $ORA_CRS_HOME/bin/crsctl stop crs
Stopping resources.
Successfully stopped CRS resources
Stopping CSSD.
Shutting down CSS daemon.
Shutdown request successfully issued.


Tried to kill ocssd.bin processes on both nodes at OS level. Server reboot automatically after they are killed.
After reboot, no CRS procesess are running.

[oracle@rac1 ~]$ ps -ef | grep d.bin
oracle    8887  8391  0 15:36 pts/1    00:00:00 grep d.bin
[oracle@rac1 ~]$ ssh rac2 ps -ef | grep d.bin
[oracle@rac1 ~]$



6) Restore OCR using the backup identified at step 3

Due to my OCR resides on OCFS2 filesystem, I need to create the file at first.

root@rac1:/ocfs/clusterware [devdb1]# dd if=/dev/zero of=/ocfs/clusterware/ocr bs=1M count=128
128+0 records in
128+0 records out
root@rac1:/ocfs/clusterware [devdb1]# ls -lhtr
total 138M
-rw-r--r--  1 oracle oinstall 9.8M Apr  5 10:17 votingdisk
-rw-r--r--  1 root   root     128M Apr 21 15:41 ocr


Then I can do a restore from backup:


[oracle@rac1 clusterware]$ ssh rac2 sudo $ORA_CRS_HOME/bin/ocrconfig -restore /u01/app/oracle/product/10.2.0/crs_1/cdata/crs/backup00.ocr
Password:vz123ys




[oracle@rac1 ~]$ cluvfy comp ocr -n all -verbose

Verifying OCR integrity

Checking OCR integrity...

Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations.

Uniqueness check for OCR device passed.

Checking the version of OCR...
OCR of correct Version "2" exists.

Checking data integrity of OCR...
Data integrity check for OCR passed.

OCR integrity check passed.

Verification of OCR integrity was successful.



Note: the syntax for restore from logical backup is: ocrconfig -import /home/oracle/backup/ocrlogicbackup

7) Restart CRS on both nodes

[oracle@rac1 ~]$ sudo $ORA_CRS_HOME/bin/crsctl start crs
Password:
Attempting to start CRS stack
The CRS stack will be started shortly
[oracle@rac1 ~]$ ssh rac2 sudo $ORA_CRS_HOME/bin/crsctl start crs
Password:vz123ys

Attempting to start CRS stack
The CRS stack will be started shortly

[oracle@rac1 ~]$ ps -ef | grep d.bin
root      6591     1  0 15:29 ?        00:00:04 /u01/app/oracle/product/10.2.0/crs_1/bin/crsd.bin reboot
oracle    9941  6589  0 15:45 ?        00:00:01 /u01/app/oracle/product/10.2.0/crs_1/bin/evmd.bin
oracle   10071 10045  0 15:45 ?        00:00:01 /u01/app/oracle/product/10.2.0/crs_1/bin/ocssd.bin
oracle   20444  8391  0 15:51 pts/1    00:00:00 grep d.bin



After serveral trials using SRVCTL to shutdown/bring up individual commponent and manually shutdown ASM and RAC2 instances, finally I got:

[oracle@rac1 ~]$ ./crs_rep.sh
HA Resource                                   Target     State
-----------                                   ------     -----
ora.devdb.SLBA.cs                             ONLINE     ONLINE on rac1
ora.devdb.SLBA.devdb1.srv                     ONLINE     ONLINE on rac1
ora.devdb.SLBA.devdb2.srv                     ONLINE     ONLINE on rac2
ora.devdb.SNOLBA.cs                           ONLINE     ONLINE on rac1
ora.devdb.SNOLBA.devdb1.srv                   ONLINE     ONLINE on rac1
ora.devdb.SNOLBA.devdb2.srv                   ONLINE     ONLINE on rac2
ora.devdb.db                                  ONLINE     ONLINE on rac2
ora.devdb.devdb1.inst                         ONLINE     ONLINE on rac1
ora.devdb.devdb2.inst                         ONLINE     ONLINE on rac2
ora.rac1.ASM1.asm                             ONLINE     ONLINE on rac1
ora.rac1.LISTENER_RAC1.lsnr                   ONLINE     ONLINE on rac1
ora.rac1.gsd                                  ONLINE     ONLINE on rac1
ora.rac1.ons                                  ONLINE     ONLINE on rac1
ora.rac1.vip                                  ONLINE     ONLINE on rac1
ora.rac2.ASM2.asm                             ONLINE     ONLINE on rac2
ora.rac2.LISTENER_RAC2.lsnr                   ONLINE     ONLINE on rac2
ora.rac2.gsd                                  ONLINE     ONLINE on rac2
ora.rac2.ons                                  ONLINE     ONLINE on rac2
ora.rac2.vip                                  ONLINE     ONLINE on rac2
[oracle@rac1 ~]$



Ref:
(1) http://download.oracle.com/docs/cd/B19306_01/rac.102/b14197/votocr.htm#BABIHADG
(2) OCR / Vote disk Maintenance Operations: (ADD/REMOVE/REPLACE/MOVE), including moving from RAW Devices to Block Devices. [MOS ID 428681.1]
(3) 'ocrconfig -replace ocr' Fails With PROT-16 [MOS ID 444757.1]

Friday, April 16, 2010

Play with my toy 10g RAC III - Demonstrate Load Balancing Advisory

By design, RAC involves multiple instances accessing the same database. By analogy, a database can be thought of as an island, which is accessible by multiple brigdes (i.e. instances). If there were two brigdes and 10 incoming cars that are being driven to the island, and if the 10 cars are directed randomly toward one of the bridges,  each bridge will be added 5 cars roughly, this is so-called Client-side connect-time load balancing . That is to say regardless of how busy a brigde is, how many cars already on that brigde, and how much longer to travel through that bridge etc, each brige is treated equally. We can easily imagine that if we have a mechanizm to monitor the condition of each brigde and direct the incoming cars based on the current traffic conditions, we will be much better in providing the service. Such a mechanizm in Oracle RAC is called Load balancing Advisory, which can be used in the Server-side connect-time load balancing.  Both client-side and server-side load balancings are applied to the time when a physical connection to a database is initially opened. Since opening a connection is an expensive operation, ideally application should be able to open a connection once, then reuse it for as many times as needed. There is a function called connection pooling at the application server layer, which establishs physical connections once and never close them. With respect to connection pools, there is a feature called "Runtime Connection Load Balancing", esstially it allocates logical connections among the physical connections in the connection pool at the transaction level instead of load balancing at the time of the initial database connection.

In this post, I will describe a test which is to demonstrate server-side connect-time load balancing with load balancing advisory feature. The idea used in this test was inspired from Jim Czuprynski's article , as well as in his Oracle University RAC class, which I attended about two and half years ago.

First of all, a "job" in this test is defined as a workload that involves:

(1) connect to the database as user "DENIS"
(2) do an insert and  an delete to a table 1000 times
(3) disconnect
(4) repeat (1) to (3) during predefined time period, e.g 20 min.

So if I submit 100 such jobs, I would expect 100 sessions count by username "DENIS"

Secondly, I created a service called 'SLBA', which is used for the job to connect to the database. The details about how to create a LBA-enabled service can be seen in Jim's article or Oracle documentation.

The steps are as follows:

1. Query GV$SERVICEMEATIC for instance goodness before starting any jobs


12:39:59 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:40:00   12:40:05       2     100     100
         12:39:58   12:40:01       1     100     100
         12:38:58   12:39:58       1     100     100
         12:38:53   12:39:53       2     100     100


Note: In Oracle 10.2 and above, the GOODNESS indicates how attractive the instance is with respect to processing the workload presented to the database service. Lower numbers are preferred. The umber is internally calculated based on any goal (long or short) that was specified for that database service. In Oracle 10.2 and above, the DELTA column predicts how much GOODNESS will increase for every
additional session that connects to the instance.


2. Start 10 jobs using SERVICE SLBA at RAC2
[oracle@rac2 ~]$ startSLBAload.sh 20


12:41:18 system@DEVDB> @ct
count # of session by username DENIS

       Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:41:27     1          3
2010-04-16 12:41:27     2          7


12:42:30 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:42:39   12:42:45       2     416    9998
         12:42:33   12:42:38       1     384    9844
         12:40:58   12:41:58       1     384    9844
         12:40:53   12:41:54       2     416    9998


3. Start CPU LOAD at RAC2

[oracle@rac2 ~]$ cpu_load.sh &
[1] 8882
[oracle@rac2 ~]$ ps -ef | grep zip
oracle    8883  8882 16 12:43 pts/1    00:00:01 gzip video1.mpg
oracle    8884  8882  8 12:43 pts/1    00:00:00 gzip video2.mpg
Note: I created two gzip/guzip jobs at RAC2 to increase CPU utilization at RAC2.


4. Query GV$SERVICEMEATIC for instance goodness

12:44:46 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:44:48   12:44:54       2    1666    9998
         12:44:43   12:44:49       1     500    9994
         12:43:54   12:44:54       2    1666    9998
         12:42:58   12:43:58       1     500    9994


Note: it is expected to see "Goodness" increasing in RAC2, compared to RAC1, RAC2 should be less attractive to the incomming connections.


5. Start 10 jobs using SERVICE SLBA at RAC2 - total 20 jobs

12:45:55 system@DEVDB> @ct
count # of session by username DENIS

                     Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:46:11     1          7
2010-04-16 12:46:11     2         13

12:46:03 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:46:10   12:46:14       2   12500    9998
         12:46:03   12:46:07       1     588    9998
         12:44:57   12:45:58       1     588    9998
         12:44:54   12:45:54       2   12500    9998


6. Start 10 jobs using SERVICE SLBA at RAC2 - total 30 jobs

12:47:52 system@DEVDB> @ct
count # of session by username DENIS

                     Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:48:06     1         15
2010-04-16 12:48:06     2         15

12:47:58 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:48:05   12:48:10       2   12500    9998
         12:47:43   12:47:48       1     750    9998
         12:46:54   12:47:55       2   12500    9998
         12:45:58   12:46:58       1     750    9998


7. Start 10 jobs using SERVICE SLBA at RAC2 - total 40 jobs

12:50:14 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:50:19   12:50:22       1    1428    9998
         12:50:19   12:50:23       2   12500    9998
         12:48:58   12:49:58       1    1428    9998
         12:48:55   12:49:54       2   12500    9998

12:50:21 system@DEVDB> @ct
count # of session by username DENIS

                     Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:50:39     1         15
2010-04-16 12:50:39     2         25


12:51:23 system@DEVDB> @ct
count # of session by username DENIS


8. Start 10 jobs using SERVICE SLBA at RAC2 - total 50 jobs

Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:51:37     1         26
2010-04-16 12:51:37     2         24

12:51:30 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:51:40   12:51:43       2   12500    9998
         12:51:28   12:51:33       1    2500    9998
         12:49:58   12:50:58       1    2500    9998
         12:49:54   12:50:55       2   12500    9998

Note: We started to see # of session in RAC1 greater than that in RAC2


8. Start 10 jobs using SERVICE SLBA at RAC2 - total 60 jobs

Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:53:15     1         34
2010-04-16 12:53:15     2         26

12:53:09 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:53:15   12:53:20       2   12500    9998
         12:53:08   12:53:13       1    1700    9998
         12:51:58   12:52:58       1    1700    9998
         12:51:55   12:52:55       2   12500    9998


9. Start 10 jobs using SERVICE SLBA at RAC2 - total 70 jobs

count # of session by username DENIS

                     Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:54:23     1         42
2010-04-16 12:54:23     2         28

12:54:15 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:54:19   12:54:24       2   12500    9998
         12:54:13   12:54:18       1    2100    9998
         12:52:58   12:53:58       1    2100    9998
         12:52:55   12:53:54       2   12500    9998

9. Start 10 jobs using SERVICE SLBA at RAC2 - total 80 jobs

12:55:37 system@DEVDB> @ct
count # of session by username DENIS

                     Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:55:55     1         53
2010-04-16 12:55:55     2         27

12:55:50 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:55:55   12:55:59       2   12500    9998
         12:55:48   12:55:52       1    2650    9998
         12:54:52   12:55:55       2   12500    9998
         12:53:58   12:54:58       1    2650    9998

9. Start 10 jobs using SERVICE SLBA at RAC2 - total 90 jobs

12:56:48 system@DEVDB> @ct
count # of session by username DENIS

                     Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:56:59     1         59
2010-04-16 12:56:59     2         31

12:56:52 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:56:55   12:57:00       2   12500    9998
         12:56:48   12:56:53       1    5000    9998
         12:55:55   12:56:55       2   12500    9998
         12:54:58   12:55:57       1    5000    9998


10. Start 10 jobs using SERVICE SLBA at RAC2 - total 100 jobs

12:58:55 system@DEVDB> @ct
count # of session by username DENIS

                     Inst
SYSDATE                ID   COUNT(*)
------------------- ----- ----------
2010-04-16 12:59:08     1         65
2010-04-16 12:59:08     2         35

12:59:01 system@DEVDB> @servicemetric.sql SLBA

Service                         Inst    Good
Name     Start Time End Time      ID    ness   DELTA
-------- ---------- ---------- ----- ------- -------
SLBA     12:59:05   12:59:10       2   12500    9998
         12:58:54   12:58:57       1   12500    9998
         12:57:58   12:58:57       1   12500    9998
         12:57:52   12:58:52       2   12500    9998


Note: At total jobs 100, the GOODNESS of two instances were evened out. The change of GOODNESS inidicated the LBA was working and somehow was used to direct the incomming connections to the less-loaded instance.

Scripts used in this test can be found here.

Monday, April 12, 2010

Play with my toy 10g RAC II - Familiar with SRVCTL

In this post, I continue to describe the experiences I have gained through playing with my little toy RAC. First of all, I found my RAC database could not fully function after restarting the VM servers, i.e. after my VM RAC1 and RAC2 nodes are powered on, the status of the cluster is usually shown as following:

[oracle@rac1 ~]$ ./crs_rep.sh
HA Resource                                   Target     State
-----------                                   ------     -----
ora.devdb.CRM.cs                              OFFLINE    OFFLINE
ora.devdb.CRM.devdb1.srv                      OFFLINE    OFFLINE
ora.devdb.db                                  ONLINE     ONLINE on rac1
ora.devdb.devdb1.inst                         ONLINE     ONLINE on rac1
ora.devdb.devdb2.inst                         ONLINE     ONLINE on rac2
ora.rac1.ASM1.asm                             ONLINE     ONLINE on rac1
ora.rac1.LISTENER_RAC1.lsnr                   ONLINE     UNKNOWN on rac1
ora.rac1.gsd                                  ONLINE     UNKNOWN on rac1
ora.rac1.ons                                  ONLINE     ONLINE on rac1
ora.rac1.vip                                  ONLINE     ONLINE on rac1
ora.rac2.ASM2.asm                             ONLINE     ONLINE on rac2
ora.rac2.LISTENER_RAC2.lsnr                   ONLINE     UNKNOWN on rac2
ora.rac2.gsd                                  ONLINE     UNKNOWN on rac2
ora.rac2.ons                                  ONLINE     ONLINE on rac2
ora.rac2.vip                                  ONLINE     ONLINE on rac2

It can be seen that the status of gsd and listener rescours are unknown. The service CRM is not started. I have to follow the following steps, which use SRVCTL exlusively to get the RAC work:

- Stop application resources

srvctl stop database -d devdb
srvctl stop asm -n rac1
srvctl stop asm -n rac2
srvctl stop nodeapps -n rac1
srvctl stop nodeapps -n rac2
- Start application resources

srvctl start nodeapps -n rac1

srvctl start nodeapps -n rac2
srvctl start asm -n rac1
srvctl start asm -n rac2
srvctl start database -d devdb
srvctl start service -d devdb -s CRM


Here I list some other srvctl commands which may be useful for reference:

- Start and stop instances
srvctl start instance -d devdb -i devdb1
srvctl stop instance -d devdb -i devdb1

- Start and stop listeners
srvctl start listener -n devdb1 -l LISTENER_RAC1
srvctl stop listener -n devdb2 -l LISTENER_RAC2

- View current configuration for the entire cluster database
srvctl config database -d devdb -a

[oracle@rac1 ~]$ srvctl config database -d devdb -a
rac1 devdb1 /u01/app/oracle/product/10.2.0/db_1
rac2 devdb2 /u01/app/oracle/product/10.2.0/db_1
DB_NAME: devdb
ORACLE_HOME: /u01/app/oracle/product/10.2.0/db_1
SPFILE: +DG1/devdb/spfiledevdb.ora
DOMAIN: null
DB_ROLE: null
START_OPTIONS: null
POLICY:  AUTOMATIC
ENABLE FLAG: DB ENABLED
- View the configuration for all node applications for node rac2
srvctl config nodeapps -n rac2

- Create a new service
srvctl add service -d devdb -s CRM

- Start a service on the whole cluster
srvctl start service -d devdb -s CRM

- Stop a service on both instance, force it to stop (-f)
srvctl stop service -d devdb -s CRM -i devdb1,devdb2 -f

Note: it does not work, I have to do:
srvctl stop service -d devdb -s CRM -i devdb1 -f
srvctl stop service -d devdb -s CRM -i devdb2 -f

- Enable a service on the devdb2 instance only
srvctl enable service -d devdb -s CRM -i devdb2

- Disable a service on the devdb1 instance only
srvctl disable service -d devdb -s CRM -i devdb1

- Modify a service so that devdb2 becomes preferred instance
srvctl modify service -d devdb -s CRM -i devdb2 -r

- Remove a service
srvctl remove service -d devdb -s CRM

- View config for a service
srvctl config service -d devdb -s CRM -a

Apr 17,2010 Update:

After all CRS resource offline, I was able to bring everything up by: crs_start -all

[oracle@rac1 ~]$ ./crs_rep.sh
HA Resource                                   Target     State
-----------                                   ------     -----
ora.devdb.SLBA.cs                             OFFLINE    OFFLINE
ora.devdb.SLBA.devdb1.srv                     OFFLINE    OFFLINE
ora.devdb.SLBA.devdb2.srv                     OFFLINE    OFFLINE
ora.devdb.SNOLBA.cs                           OFFLINE    OFFLINE
ora.devdb.SNOLBA.devdb1.srv                   OFFLINE    OFFLINE
ora.devdb.SNOLBA.devdb2.srv                   OFFLINE    OFFLINE
ora.devdb.db                                  OFFLINE    OFFLINE
ora.devdb.devdb1.inst                         OFFLINE    OFFLINE
ora.devdb.devdb2.inst                         OFFLINE    OFFLINE
ora.rac1.ASM1.asm                             OFFLINE    OFFLINE
ora.rac1.LISTENER_RAC1.lsnr                   OFFLINE    OFFLINE
ora.rac1.gsd                                  OFFLINE    OFFLINE
ora.rac1.ons                                  OFFLINE    OFFLINE
ora.rac1.vip                                  OFFLINE    OFFLINE
ora.rac2.ASM2.asm                             OFFLINE    OFFLINE
ora.rac2.LISTENER_RAC2.lsnr                   OFFLINE    OFFLINE
ora.rac2.gsd                                  OFFLINE    OFFLINE
ora.rac2.ons                                  OFFLINE    OFFLINE
ora.rac2.vip                                  OFFLINE    OFFLINE
oracle@rac1 ~]$ crs_start -all
Attempting to start `ora.rac1.vip` on member `rac1`
Attempting to start `ora.rac2.vip` on member `rac2`
Start of `ora.rac2.vip` on member `rac2` succeeded.
Start of `ora.rac1.vip` on member `rac1` succeeded.
Attempting to start `ora.rac1.ASM1.asm` on member `rac1`
Attempting to start `ora.rac2.ASM2.asm` on member `rac2`
Start of `ora.rac2.ASM2.asm` on member `rac2` succeeded.
Attempting to start `ora.devdb.devdb2.inst` on member `rac2`
Start of `ora.rac1.ASM1.asm` on member `rac1` succeeded.
Attempting to start `ora.devdb.devdb1.inst` on member `rac1`
Start of `ora.devdb.devdb2.inst` on member `rac2` succeeded.
Attempting to start `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2`
Start of `ora.rac2.LISTENER_RAC2.lsnr` on member `rac2` succeeded.
Start of `ora.devdb.devdb1.inst` on member `rac1` succeeded.
Attempting to start `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1`
Start of `ora.rac1.LISTENER_RAC1.lsnr` on member `rac1` succeeded.
Attempting to start `ora.devdb.SLBA.cs` on member `rac1`
Attempting to start `ora.devdb.SLBA.devdb2.srv` on member `rac2`
Attempting to start `ora.devdb.SLBA.devdb1.srv` on member `rac1`
Start of `ora.devdb.SLBA.devdb2.srv` on member `rac2` succeeded.
Attempting to start `ora.devdb.SNOLBA.cs` on member `rac1`
Attempting to start `ora.devdb.SNOLBA.devdb1.srv` on member `rac1`
Start of `ora.devdb.SLBA.cs` on member `rac1` succeeded.
Attempting to start `ora.rac1.gsd` on member `rac1`
CRS-1002: Resource 'ora.rac1.ons' is already running on member 'rac1'

Attempting to start `ora.devdb.SNOLBA.devdb2.srv` on member `rac2`
Start of `ora.devdb.SNOLBA.devdb2.srv` on member `rac2` succeeded.
Start of `ora.rac1.gsd` on member `rac1` succeeded.
CRS-1002: Resource 'ora.devdb.db' is already running on member 'rac1'

CRS-1002: Resource 'ora.rac2.ons' is already running on member 'rac2'

Start of `ora.devdb.SLBA.devdb1.srv` on member `rac1` succeeded.
Attempting to start `ora.rac2.gsd` on member `rac2`
Start of `ora.devdb.SNOLBA.cs` on member `rac1` succeeded.
Start of `ora.rac2.gsd` on member `rac2` succeeded.
Start of `ora.devdb.SNOLBA.devdb1.srv` on member `rac1` succeeded.
CRS-0223: Resource 'ora.devdb.db' has placement error.

CRS-0223: Resource 'ora.rac1.ons' has placement error.

CRS-0223: Resource 'ora.rac2.ons' has placement error.

[oracle@rac1 ~]$ ./crs_rep.sh
HA Resource                                   Target     State
-----------                                   ------     -----
ora.devdb.SLBA.cs                             ONLINE     ONLINE on rac1
ora.devdb.SLBA.devdb1.srv                     ONLINE     ONLINE on rac1
ora.devdb.SLBA.devdb2.srv                     ONLINE     ONLINE on rac2
ora.devdb.SNOLBA.cs                           ONLINE     ONLINE on rac1
ora.devdb.SNOLBA.devdb1.srv                   ONLINE     ONLINE on rac1
ora.devdb.SNOLBA.devdb2.srv                   ONLINE     ONLINE on rac2
ora.devdb.db                                  ONLINE     ONLINE on rac1
ora.devdb.devdb1.inst                         ONLINE     ONLINE on rac1
ora.devdb.devdb2.inst                         ONLINE     ONLINE on rac2
ora.rac1.ASM1.asm                             ONLINE     ONLINE on rac1
ora.rac1.LISTENER_RAC1.lsnr                   ONLINE     ONLINE on rac1
ora.rac1.gsd                                  ONLINE     ONLINE on rac1
ora.rac1.ons                                  ONLINE     ONLINE on rac1
ora.rac1.vip                                  ONLINE     ONLINE on rac1
ora.rac2.ASM2.asm                             ONLINE     ONLINE on rac2
ora.rac2.LISTENER_RAC2.lsnr                   ONLINE     ONLINE on rac2
ora.rac2.gsd                                  ONLINE     ONLINE on rac2
ora.rac2.ons                                  ONLINE     ONLINE on rac2
ora.rac2.vip                                  ONLINE     ONLINE on rac2

Tuesday, April 06, 2010

Play with my toy 10g RAC - Test Transparent Application Failover (TAF)

Finally I got my first toy 10g RAC set up based on Vincent Chan's excellent document (published in 2007, this doc has been not availabe on the original OTN websitr any more). This toy RAC has two nodes, which are VMWare server 2.0 Virtual Machines with CentOS 4.7 hosted on Windows XP. The next steps are of course to explore various RAC specific features on it as much as possible. I plan to re-visit all the hands-on labs I experienced when I was in an Oracle University Course "Oracle Database 10g: RAC for Adminstrators" about 2.5 years ago. I have been planning to set up a RAC since then, however, the actual action has been delayed again and again. Now the world is talking about 11g RAC though, I am still excited that I finally got this little out-dated toy done.

Vicent's document also provided a test to demonstrate TAF, which I repeated here as a warm-up for my exploration:



(1) Create a Service and start it


Service Name : CRM
Database Name : devdb
Preferred Instance : devdb1
Available Instance : devdb2
TAF Policy : BASIC


[oracle@rac1 ~]$ srvctl add service -d devdb -s CRM -r devdb1 -a devdb2 -P BASIC
[oracle@rac1 ~]$ srvctl config service -d devdb -s CRM -a
CRM PREF: devdb1 AVAIL: devdb2 TAF: BASIC
[oracle@rac1 ~]$ srvctl start service -d devdb -s CRM
[oracle@rac1 ~]$ ./crs_rep.sh
HA Resource                                   Target     State
-----------                                   ------     -----
ora.devdb.CRM.cs                              ONLINE     ONLINE on rac1
ora.devdb.CRM.devdb1.srv                      ONLINE     ONLINE on rac1
ora.devdb.CRM.devdb2.srv                      ONLINE     ONLINE on rac2
ora.devdb.db                                  ONLINE     ONLINE on rac1
ora.devdb.devdb1.inst                         ONLINE     ONLINE on rac1
ora.devdb.devdb2.inst                         ONLINE     ONLINE on rac2
ora.rac1.ASM1.asm                             ONLINE     ONLINE on rac1
ora.rac1.LISTENER_RAC1.lsnr                   ONLINE     ONLINE on rac1
ora.rac1.gsd                                  ONLINE     ONLINE on rac1
ora.rac1.ons                                  ONLINE     ONLINE on rac1
ora.rac1.vip                                  ONLINE     ONLINE on rac1
ora.rac2.ASM2.asm                             ONLINE     ONLINE on rac2
ora.rac2.LISTENER_RAC2.lsnr                   ONLINE     ONLINE on rac2
ora.rac2.gsd                                  ONLINE     ONLINE on rac2
ora.rac2.ons                                  ONLINE     ONLINE on rac2
ora.rac2.vip                                  ONLINE     ONLINE on rac2


(2) Add the following entry in the client tnsnames.ora
CRM =
(DESCRIPTION =
   (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.31)(PORT = 1521))
   (ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.2.32)(PORT = 1521))
   (LOAD_BALANCE = yes)
   (CONNECT_DATA =
     (SERVER = DEDICATED)
     (SERVICE_NAME = CRM)
     (FAILOVER_MODE =
        (TYPE = SELECT)
        (METHOD = BASIC)
        (RETRIES = 180)
        (DELAY = 5)
     )
   )
)


(3) Connect the first session using CRM Service

C:\Documents and Settings\denis>sqlplus denis@CRM

SQL*Plus: Release 11.1.0.6.0 - Production on Tue Apr 6 14:48:54 2010

Copyright (c) 1982, 2007, Oracle.  All rights reserved.

Enter password:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> select
  2   failover_type,
  3   failover_method,
  4   failed_over
  5   from v$session
  6   where username='DENIS';

FAILOVER_TYPE FAILOVER_M FAI
------------- ---------- ---
SELECT        BASIC      NO


SQL> select
  2   instance_number instance#,
  3   instance_name,
  4   host_name,
  5   status
  6   from v$instance;

 INSTANCE# INSTANCE_NAME    HOST_NAME            STATUS
---------- ---------------- -------------------- ------------
         1 devdb1           rac1.localdomain     OPEN


(4) Shutdwon instance devdb1 from another session login as SYS

sys@DEVDB> select
  2   instance_number instance#,
  3   instance_name,
  4   host_name,
  5   status
  6  from v$instance;

 INSTANCE# INSTANCE_NAME    HOST_NAME            STATUS
---------- ---------------- -------------------- ------------
         1 devdb1           rac1.localdomain     OPEN


sys@DEVDB> shutdown abort
ORACLE instance shut down.


(5) Verify the session has failed over.
In the first session:

SQL> select
  2   instance_number instance#,
  3   instance_name,
  4   host_name,
  5   status
  6  from v$instance;

 INSTANCE# INSTANCE_NAME    HOST_NAME            STATUS
---------- ---------------- -------------------- ------------
         2 devdb2           rac2.localdomain     OPEN


SQL> select
  2    failover_type,
  3    failover_method,
  4    failed_over
  5   from v$session
  6  where username='DENIS';

FAILOVER_TYPE FAILOVER_M FAI
------------- ---------- ---
SELECT        BASIC      YES


(6) Relocate the CRM service back to the preferred instance.

After devdb1 is brought back up, the CRM service does not automatically relocate
back to the preferred instance. This means any new connections using CRM service will be on devdb2, instead of the preferred instance
devdb1. This can be achieved by the following command:

$ srvctl relocate service -d devdb -s CRM -i devdb2 -t devdb1

The meaning of the command options can be seen here:

[oracle@rac1 ~]$ srvctl relocate service -h
Usage: srvctl relocate service -d <name> -s <service_name> -i <old_inst_name> -t <new_inst_name> [-f]
    -d <name>           Unique name for the database
    -s <service>        Service name
    -i <old_inst>       Old instance name
    -t <new_inst>       New instance name
    -f                  Disconnect all sessions during stop or relocate service operations
    -h                  Print usage

Thursday, April 01, 2010

Data block was loaded using the NOLOGGING option

A production database went live last weekend through cloning and applying archived logs by similiar approach as described in this post. However, when users did an insert into a table, they recieved " ORA-26040: Data block was loaded using the NOLOGGING option", "ORA-01578: ORACLE data block corrupted (file # xx, block # xxx)". By querying DBA_EXTENTS with file # and block #, I identfied this block belongs to an index segment.  After discussing with app DBAs, I realized that during cloning period, at source database, rebuild indexes for that table has been performed and those indexes have nologging option turned on. During dissussion, app DBAs also suggested to use DBVERIFY to check other possible block corruption, I was not so sure it could work. I was under assumption that DBVEFRIFY can only detect physical block corruptions, not logical block corruptions. As we have already identified the root cause, we have not done DBV for all data files, we just dropped the table and re-created all indexes (app DBA confirms that we don't need the data in the new database acutally).

Today, I did a test to see if DBV is supposed to catch such kind of block corruption. The overall test steps is described as follows:

1. Create a test tablespace, idx_temp
2. Create a table and an index on idx_temp tablespace
3. Backup idx_temp tablespace
4. Rebuild the index with nologging
5. Some DML activities
6. Lost of idx_temp tablespace datafile
7. Recover tablespace
8. Full scan the index
9. Identify corrupted block by DBV and analyze statement

Note: during test I have learned that  FORCE_LOGGING  should not be turned on at database level, otherwise the test will fail to demonstrate the points.

SQL> drop table t;
drop table t
           *
ERROR at line 1:
ORA-00942: table or view does not exist
 
 
SQL> drop tablespace idx_temp including contents and datafiles;
 
Tablespace dropped.
 
SQL> create tablespace idx_temp datafile '/tmp/idx_temp_data_1.dbf' size 200M;
 
Tablespace created.
 
SQL> create table t as select rownum id, a.* from dba_objects a;
 
Table created.
 
SQL> create index t_idx on t(id) tablespace idx_temp nologging;
 
Index created.
 
SQL> alter tablespace idx_temp begin backup;
 
Tablespace altered.
 
SQL> ho cp /tmp/idx_temp_data_1.dbf /tmp/idx_temp_data_1.dbf.bak
 
SQL> alter tablespace idx_temp end backup;
 
Tablespace altered.
 
SQL> alter index t_idx rebuild;
 
Index altered.
 
SQL> update t set object_name=lower(object_name);
 
50842 rows updated.
 
SQL> ho rm -f /tmp/idx_temp_data_1.dbf
 
SQL> alter tablespace idx_temp offline immediate;
 
Tablespace altered.
 
SQL> ho cp /tmp/idx_temp_data_1.dbf.bak /tmp/idx_temp_data_1.dbf
 
SQL> recover tablespace idx_temp;
Media recovery complete.
SQL> alter tablespace idx_temp online;
 
Tablespace altered.
 
SQL> select count(id) from t;
select count(id) from t
                      *
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 14, block # 141)
ORA-01110: data file 14: '/tmp/idx_temp_data_1.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option
 
 
SQL> analyze index t_idx validate structure;
analyze index t_idx validate structure
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 14, block # 141)
ORA-01110: data file 14: '/tmp/idx_temp_data_1.dbf'
ORA-26040: Data block was loaded using the NOLOGGING option
 

It can be seen that full scan the index blocks or using analyze validate structure can identify the block corruption. Let's see how DBVERIFY utility can tell us.

This utility can run against the data file:
dbv file='/tmp/idx_temp_data_1.dbf' logfile='dbv.log'

From the screen, I can see:

DBVERIFY: Release 10.2.0.1.0 - Production on Thu Apr 1 09:09:50 2010
 
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
 
 
DBV-00200: Block, dba 58720397, already marked corrupted
 
DBV-00200: Block, dba 58720398, already marked corrupted
 
DBV-00200: Block, dba 58720399, already marked corrupted
 
...

DBV-00200: Block, dba 58720513, already marked corrupted
 
DBV-00200: Block, dba 58720514, already marked corrupted
 
DBV-00200: Block, dba 58720515, already marked corrupted


In the log file, I have:

DBVERIFY: Release 10.2.0.1.0 - Production on Thu Apr 1 09:09:50 2010
 
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
 
 
DBVERIFY - Verification starting : FILE = /tmp/idx_temp_data_1.dbf
 
DBVERIFY - Verification complete
 
Total Pages Examined         : 25600
Total Pages Processed (Data) : 0
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 114
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 140
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 25346
Total Pages Marked Corrupt   : 112
Total Pages Influx           : 0
Highest block SCN            : 1781276169 (2536.1781276169)

So, the DBVERIFY utility did catch corrupted blocks.To find out which segments having corrupted blocks, I have to convert DBA to file# and block# by following, for example:

SQL> SELECT dbms_utility.data_block_address_block(58720399) "BLOCK",
       dbms_utility.data_block_address_file(58720399) "FILE"
FROM dual;
  2    3  
     BLOCK       FILE
---------- ----------
       143         14


SQL> SELECT owner, segment_name
  FROM DBA_EXTENTS
 WHERE file_id = &p1
   AND &p2 between block_id AND block_id + blocks - 1 ;
  2    3    4  Enter value for p1: 14
old   3:  WHERE file_id = &p1
new   3:  WHERE file_id = 14
Enter value for p2: 143
old   4:    AND &p2 between block_id AND block_id + blocks - 1
new   4:    AND 143 between block_id AND block_id + blocks - 1
 
OWNER
------------------------------
SEGMENT_NAME
--------------------------------------------------------------------------------
OPS$ORACLE
T_IDX

The DBVERIFY can also be used to check segments using segment_id as input. Per Oracle document:

SEGMENT_ID Specifies the segment that you want to verify. It is composed of the tablespace ID number (tsn), segment header file number (segfile), and segment header block number (segblock). You can get this information from SYS_USER_SEGS. The relevant columns are TABLESPACE_ID, HEADER_FILE, and HEADER_BLOCK. You must have SYSDBA privileges to query SYS_USER_SEGS.

However, in my test case, I can not find anything from SYS_USER_SEGS. Don't know why yet.

SQL> select TABLESPACE_ID, HEADER_FILE, HEADER_BLOCK from SYS_USER_SEGS where segment_name='T_IDX';
 
no rows selected

To obtain the segment_id, I have to do:

SQL> select segment_name, tablespace_name, header_file, header_block   
from dba_segments      
where segment_name='T_IDX';  
  2    3  
SEGMENT_NAME         TABLESPACE_NAME       HEADER_FILE HEADER_BLOCK
-------------------- --------------------- ----------- ------------
T_IDX                IDX_TEMP                       14          139
 

SQL> select ts#,name from v$tablespace where name='IDX_TEMP';
 
       TS# NAME
---------- ------------------------------
        17 IDX_TEMP


Finally I can do: dbv user/pass segment_id=17.14.139 , and in this case, I received same info as in the case of running against the data file.

In retrospective, "anaylyze validate structure" probably was the best option to check table and index block corruption for that database if we suspected there were other corruptions.