Saturday, October 11, 2014

OEM 12c Database Monitoring Concepts and Framework

Prepared a concept map for me to understand OEM 12c monitoring concepts and framework:



Thursday, June 12, 2014

Using analytic function can improve query performance greatly

Let's say we have a table: t (x,y,z), how to output the rows of the table with max value of z based on y?

For example if we have (1,1,2) (1,1,3),(2,1,1), we should get (1,1,3).

Below is a test case that demonstrats two approaches: Approach 1 uses subquery resulting in 38 consistent gets; Approach 2 uses analytic approach resulting in 7 consistent gets
SQL>@test
SQL>set echo on
SQL>drop table t;

Table dropped.

SQL>create table t(x number, y number, z number);

Table created.

SQL>
SQL>insert into t values(1,1,2);

1 row created.

SQL>insert into t values(1,1,3);

1 row created.

SQL>insert into t values(2,1,1);

1 row created.

SQL>
SQL>
SQL>set autotrace on
SQL>-- approach 1
SQL>
SQL>select x, y, z
  2  from t
  3  where
  4   exists ( select 1 from
  5            ( select y, max(z) mz
  6              from t
  7               group by y
  8             ) zz
  9             where t.y=zz.y
 10               and t.z=zz.mz
 11          )
 12  ;

         X          Y          Z
---------- ---------- ----------
         1          1          3

1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3359768323

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     3 |   117 |     5  (20)| 00:00:01 |
|*  1 |  FILTER              |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL  | T    |     3 |   117 |     2   (0)| 00:00:01 |
|*  3 |   FILTER             |      |       |       |            |          |
|   4 |    HASH GROUP BY     |      |     1 |    26 |     3  (34)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| T    |     1 |    26 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT /*+ */ 0 FROM "T" "T" WHERE "Y"=:B1 GROUP
              BY "Y",:B2 HAVING MAX("Z")=:B3))
   3 - filter(MAX("Z")=:B1)
   5 - filter("Y"=:B1)

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
         78  recursive calls
          0  db block gets
         38  consistent gets
          0  physical reads
          0  redo size
        270  bytes sent via SQL*Net to client
        247  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>
SQL>-- approach 2
SQL>
SQL>
SQL>select x,y,z from
  2  (
  3     select x,y, z, max(z) over (partition by y) mz
  4     from t
  5  )
  6  where z=mz
  7  ;

         X          Y          Z
---------- ---------- ----------
         1          1          3

1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2206009079

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     3 |   156 |     3  (34)| 00:00:01 |
|*  1 |  VIEW               |      |     3 |   156 |     3  (34)| 00:00:01 |
|   2 |   WINDOW SORT       |      |     3 |   117 |     3  (34)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T    |     3 |   117 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("Z"="MZ")

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        272  bytes sent via SQL*Net to client
        247  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

Approach 1 is essetially used in a query in one of our production databases that causes performance issue. I suggested we re-write the query using approach 2. My test results of the production-like query show that bad query needs 2M consistent gets to get 6382 rows out while the re-written query needs only about 6k.

Monday, June 02, 2014

Bitmap plans can estimate wrong cardinality

There is a problem query involving 3-table join in one of our prodcutin databases. The difference between the good plan and bad plan is the join order in a hash join execution plan. The reason for Oracle to pick up the bad join order is that Oracle CBO chooses the bitmap plan and this plan gives very wrong cardinality. What I did is to set "_b_tree_bitmap_plans"=false at system level, which allowing the CBO to generate the good plan. Then I created a sql plan baseline for that query from the good plan ( using sqlt/utl/coe_load_sql_baseline.sql script). After that, considering that there may be some queries that benfit from the bitmap access path, I changed back "_b_tree_bitmap_plans" to be true.

Below tests demonstrated how wrong the cardinaltiy estimeated by the bitmap plan could be.

  • in the bitmap plan,  the number of rows estimated is 578K
SQL> select * from BOD_OTHR_SERIAL_ACTIVITY_I8400 I8400
  2  where
  3  (I8400.DVTQ_ACCT_NUM = 'A145779917' OR (
  4      I8400.UJTJPM_CUSTOMER_ID = '' AND I8400.UJTJPM_ACCOUNT_ID = ''
  5  ))
  6  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 972526864

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                           | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                                |   578K|   170M|   161K (11)| 00:05:17 |
|   1 |  TABLE ACCESS BY INDEX ROWID     | BOD_OTHR_SERIAL_ACTIVITY_I8400 |   578K|   170M|   161K (11)| 00:05:17 |
|   2 |   BITMAP CONVERSION TO ROWIDS    |                                |       |       |            |          |
|   3 |    BITMAP OR                     |                                |       |       |            |          |
|   4 |     BITMAP CONVERSION FROM ROWIDS|                                |       |       |            |          |
|*  5 |      INDEX RANGE SCAN            | IX_I8400_04                    |       |       |     1   (0)| 00:00:01 |
|   6 |     BITMAP CONVERSION FROM ROWIDS|                                |       |       |            |          |
|*  7 |      INDEX RANGE SCAN            | IX_I8400_07                    |       |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("I8400"."UJTJPM_CUSTOMER_ID"='' AND "I8400"."UJTJPM_ACCOUNT_ID"='')
   7 - access("I8400"."DVTQ_ACCT_NUM"='A145779917')


  • -- disable the bitmap plan
SQL> alter session set "_b_tree_bitmap_plans"=false;

Session altered.


  • -- Now the estimated rows is 2
SQL> select * from BOD_OTHR_SERIAL_ACTIVITY_I8400 I8400
  2  where
  3  (I8400.DVTQ_ACCT_NUM = 'A145779917' OR (
  4      I8400.UJTJPM_CUSTOMER_ID = '' AND I8400.UJTJPM_ACCOUNT_ID = ''
  5  ))
  6  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 7978316

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                           | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                                |     2 |   618 |     9   (0)| 00:00:01 |
|   1 |  CONCATENATION               |                                |       |       |            |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| BOD_OTHR_SERIAL_ACTIVITY_I8400 |     1 |   309 |     6   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | IX_I8400_07                    |     1 |       |     4  (25)| 00:00:01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID| BOD_OTHR_SERIAL_ACTIVITY_I8400 |     1 |   309 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | IX_I8400_04                    |     1 |       |     2  (50)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("I8400"."DVTQ_ACCT_NUM"='A145779917')
   4 - filter(LNNVL("I8400"."DVTQ_ACCT_NUM"='A145779917'))
   5 - access("I8400"."UJTJPM_CUSTOMER_ID"='' AND "I8400"."UJTJPM_ACCOUNT_ID"='')


  •  the acutall number of row is 1:
SQL> select count(*)  from BOD_OTHR_SERIAL_ACTIVITY_I8400 I8400
  2      where
  3      (I8400.CUST_ACCT_NUM = 'A145779917' OR (
  4          I8400.UJTJPM_CUSTOMER_ID = '' AND I8400.UJTJPM_ACCOUNT_ID = ''
  5      ))
  6      ;

  COUNT(*)
----------
         1

1 row selected.

Sunday, April 13, 2014

SQL Tuning

Things I think important to know for sql tuning:

  • Be able to read execution plan
         - right-most first, from top to bottom
         - visual approach to assist understanding   
         - pay attention to the Predicate Information section
  • Obtaining the execution plan
         - AUTOTRACE 
         - dbms_xplan.display_curosr
  • Cardinality
         - the predicted number of rows generated by an operation
         - cardinality = selectivity * ( number of input rows)
  • COST
         - represent the optimizer's best estimate of the time it will take to execute the statement; CBO always chooses the execution plan with minimum cost.
         - using dbms_xplan.display_cursor to obtain E-row vs A-row ( set statistical_level=all;  or using /* gather_plan_statistics */ )
  • AWR, ADDM, ASH and wait interface - able to identify the offensive SQLs
  • bind variable peeking and plan instability
  • 10046 trace & tkprof
         - a rule of thumb to check efficiency: query+ current/rows < 20
         ie.  select * from ( select col_of_insterest,count(*) from table_A group by col_of_interest order by 2 desc) where rownun <=50
  • Talk to lead developers
         - tune questions not just tune queries
  • Fix a bad query or method of improvement
        - index re-design,
        - hints,
        - gather stats,
        - explore parallelism,
        - plan stability: sql profile( sqlt coe* script) and sql plan baseline (sqlt coe* script)

Friday, April 11, 2014

Creating a DG Broker Configuration and Perform Swithover testing

1. Register Static service

To enable DGMGRL to restart instances during the course of broker operations, a static service must be registered In the listener.ora, the following entries should be presented:
 LISTENER = (DESCRIPTION =
      (ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=host_name)
      (PORT=port_num))))
 SID_LIST_LISTENER=(SID_LIST=(SID_DESC=(SID_NAME=sid_name)
      (GLOBAL_DBNAME=db_unique_name_DGMGRL.db_domain)
      (ORACLE_HOME=oracle_home)
      (ENVS="TNS_ADMIN=oracle_home/network/admin")))
In my case, for example in the listener.ora, including:
 SID_LIST_LISTENER =
     (SID_DESC =
       (SID_NAME = etsdb)
       (GLOBAL_DBNAME=etsdb_DGMGRL)          
       (ORACLE_HOME = /apps/opt/oracle/product/11.2.0)
       (ENVS="TNS_ADMIN=/apps/opt/oracle/product/11.2.0/network/admin")
     )
   )

LOCAL_LISTENER initialization parameter shout set,ie

SQL> alter system set local_listener=(ADDRESS=(PROTOCOL=TCP) (HOST=server_name) (PORT=15xx)) scope=both

More details check: Oracle Data Guard Broker and Static Service Registration (Doc ID 1387859.1)

2. Create a broker configuration

pre-requisite: DG_BROKER_START should set to TRUE in order for DMON background process to start
xxx.com:/db1/opt/oracle/product/11.2.0/dbs [ettdb] $ dgmgrl              
DGMGRL for Solaris: Version 11.2.0.3.0 - 64bit Production
 
Copyright (c) 2000, 2009, Oracle. All rights reserved.
 
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys       
Password:
Connected.
DGMGRL>gt; create configuration 'DG_ETS_CONFIG' as
> primary database is 'ettdb'
> connect identifier is ettdb;
Configuration "DG_ETS_CONFIG" created with primary database "ettdb"
DGMGRL> add database 'etsdb' as
> connect identifier is etsdb;
Database "etsdb" added
DGMGRL> show configuration
 
Configuration - DG_ETS_CONFIG
 
  Protection Mode: MaxPerformance
  Databases:
    ettdb - Primary database
    etsdb - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
DISABLED
 
3. Enable the configuration
DGMGRL> enable configuration
Enabled.
DGMGRL> show configuration
 
Configuration - DG_ETS_CONFIG
 
  Protection Mode: MaxPerformance
  Databases:
    ettdb - Primary database
    etsdb - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS

4. Test switchover
xxx.com:/home/oracle [ettdb] $ dgmgrl                
DGMGRL for Solaris: Version 11.2.0.3.0 - 64bit Production
 
Copyright (c) 2000, 2009, Oracle. All rights reserved.
 
Welcome to DGMGRL, type "help" for information.

DGMGRL> connect sys
Password:
Connected.
DGMGRL> show configuration
 
Configuration - DG_ETS_CONFIG
 
  Protection Mode: MaxPerformance
  Databases:
    ettdb - Primary database
    etsdb - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS
 
DGMGRL> switchover to etsdb
Performing switchover NOW, please wait...
New primary database "etsdb" is opening...
Operation requires shutdown of instance "ettdb" on database "ettdb"
Shutting down instance "ettdb"...
ORACLE instance shut down.
Operation requires startup of instance "ettdb" on database "ettdb"
Starting instance "ettdb"...
ORACLE instance started.
Database mounted.
Database opened.
Switchover succeeded, new primary is "etsdb"
DGMGRL> show configuration

DGMGRL> switchover to ettdb
Performing switchover NOW, please wait...
New primary database "ettdb" is opening...
Operation requires shutdown of instance "etsdb" on database "etsdb"
Shutting down instance "etsdb"...
ORACLE instance shut down.
Operation requires startup of instance "etsdb" on database "etsdb"
Starting instance "etsdb"...
ORACLE instance started.
Database mounted.
Database opened.
Switchover succeeded, new primary is "ettdb"


 
Configuration - DG_ETS_CONFIG
 
  Protection Mode: MaxPerformance
  Databases:
    etsdb - Primary database
    ettdb - Physical standby database
 
Fast-Start Failover: DISABLED
 
Configuration Status:
SUCCESS

A few issues encountered during testing:

1. When set up configuration in DGMGRL, receive ORA-00088: command cannot be executed by shared server

The application this database supports requires using shared server, the fix is to add "(server=dedicated)" in the tnsnames.ore i.e.
ettdb =
  (DESCRIPTION =
    (ADDRESS_LIST =
      (ADDRESS = (PROTOCOL = TCP)(HOST = xxx.xxx.com)(PORT = 1523))
    )
    (CONNECT_DATA =
      (SERVICE_NAME = ettdb )
      (server=dedicated) 
    )
  )

2. ORA-16664: unable to receive the result from a database

It turns out due to I have a typo when doing "create configuration":

DGMGRL> create configuration 'DG_ETS_CONFIG' as
> primary database is 'ettdb'
> connect identifier is etsdb;    ===========> should be ettdb obviously


3. switchover failed in an attempt


When I use dgmgrl connect to the primary db, I also have a dgmgrl session connected to the standby db. This appears to be the reason. In the standby site Data Guard Broker log file I saw:
SQL Execution error=604, sql=[ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY WAIT WITH SESSION SHUTDOWN]. See error stack below.
  ORA-00604: error occurred at recursive SQL level 1
  ORA-01093: ALTER DATABASE CLOSE only permitted with no sessions connected
Sswitchover to primary command failed

After disconnecting any sessions connected to the standby db, switchover to the standby succeeded.

References: http://docs.oracle.com/cd/E11882_01/server.112/e40771/configure.htm#i1021661

Tuesday, April 08, 2014

Manage Physical Standby Database - Add a Datafile

When a datafile is added to the primary database, what could happen at standby database? Today, I had some hands-on experiences about that.

Case 1 - STANDBY_FILE_MANAGEMENT set to MANUAL at standby database

In this case, an UNNAMED file will be created at $ORACLE_HOME/dbs, we need to issue 'ALTER DATABASE CREATE DATAFILE AS' to fix it.
(1) At primary:
alter tablespace dbaets_data add datafile '/db2/u01/oradata/ettdb/dbaets_data_02.dbf' size 500M;
 

(2) At standby, query v$datafile 
NAME                                                  SIZE_M
-------------------------------------------------    -------
/db2st2/u01/oradata/etsdb/system01.dbf                   500
/db2st2/u01/oradata/etsdb/sysaux01.dbf                  2048
/db2st2/u01/oradata/etsdb/undotbs01.dbf                  271
/db2st2/u01/oradata/etsdb/users01.dbf                    500
/db2st2/u01/oradata/etsdb/xdb01.dbf                     1024
/db2st2/u01/oradata/etsdb/dbaets_data_01.dbf             500
/apps/opt/oracle/product/11.2.0/dbs/UNNAMED00007           0
 

(3) At standby, issue 'ALTER DATABAE CREATE DATAFILE AS'

ALTER DATABASE CREATE DATAFILE '/apps/opt/oracle/product/11.2.0/dbs/UNNAMED00007'  AS '/db2st2/u01/oradata/etsdb/dbaets_data_02.dbf';

(4) At standby, query v$datafile again:

NAME                                                   SIZE_M
-------------------------------------------------- ----------
/db2st2/u01/oradata/etsdb/system01.dbf                    500
/db2st2/u01/oradata/etsdb/sysaux01.dbf                   2048
/db2st2/u01/oradata/etsdb/undotbs01.dbf                   271
/db2st2/u01/oradata/etsdb/users01.dbf                     500
/db2st2/u01/oradata/etsdb/xdb01.dbf                      1024
/db2st2/u01/oradata/etsdb/dbaets_data_01.dbf              500
/db2st2/u01/oradata/etsdb/dbaets_data_02.dbf              500


Case 2 - STANDBY_FILE_MANAGEMENT set to AUTO, but file cannot be created due to other error

At standby, initially I have DB_FILE_NAMEC_CONVERT set as folllow:


 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
db_file_name_convert                 string      /db2st2/u01/oradata/etsdb, /db
                                                 2/u01/oradata/ettdb
This is not in the correct order as at standby site the path is '/db2st2/u01/oradata/etsdb' whereas '/db2/u01' does not exist.

At primary, I added another datafile:

ops$oracle@ETSDB> alter tablespace dbaets_data add datafile '/db2/u01/oradata/ettdb/dbaets_data_03.dbf' size 100M;
 
Tablespace altered.

In the alert log file of  the standby database, it can be seen the file is not created and managed recovery process (MRP0)  is terminated:

Media Recovery Log /db2st2/arch/1_1115_790089239.dbf
WARNING: File being created with same name as in Primary
Existing file may be overwritten
Errors in file /apps/opt/oracle/admin/etsdb/diag/rdbms/etsdb/etsdb/trace/etsdb_pr00_8554.trc:
ORA-01119: error in creating database file '/db2/u01/oradata/ettdb/dbaets_data_03.dbf'
ORA-27054: NFS file system where the file is created or resides is not mounted with correct options
SVR4 Error: 13: Permission denied
File #8 added to control file as 'UNNAMED00008'.
Originally created as:
'/db2/u01/oradata/ettdb/dbaets_data_03.dbf'
Recovery was unable to create the file as:
'/db2/u01/oradata/ettdb/dbaets_data_03.dbf'
MRP0: Background Media Recovery terminated with error 1274
Errors in file /apps/opt/oracle/admin/etsdb/diag/rdbms/etsdb/etsdb/trace/etsdb_pr00_8554.trc:
ORA-01274: cannot add datafile '/db2/u01/oradata/ettdb/dbaets_data_03.dbf' - file could not be created
Managed Standby Recovery not using Real Time Apply
Recovery interrupted!
Recovery stopped due to failure in applying recovery marker (opcode 17.30).
Datafiles are recovered to a consistent state at change 12910838021570 but controlfile could be ahead of datafiles.

I reset the the db_file_name_covert to be '/db2/u01/oradata/ettdb','/db2st2/u01/oradata/etsdb' and restarted the MRP0, in the alert log I saw:

MRP0: Background Managed Standby Recovery process started (etsdb)
 started logmerger process
Tue Apr 08 16:43:35 2014
Managed Standby Recovery starting Real Time Apply
MRP0: Background Media Recovery terminated with error 1111
Errors in file /apps/opt/oracle/admin/etsdb/diag/rdbms/etsdb/etsdb/trace/etsdb_pr00_10330.trc:
ORA-01111: name for data file 8 is unknown - rename to correct file
ORA-01110: data file 8: '/apps/opt/oracle/product/11.2.0/dbs/UNNAMED00008'
ORA-01157: cannot identify/lock data file 8 - see DBWR trace file
ORA-01111: name for data file 8 is unknown - rename to correct file
ORA-01110: data file 8: '/apps/opt/oracle/product/11.2.0/dbs/UNNAMED00008'
Managed Standby Recovery not using Real Time Apply
Recovery Slave PR00 previously exited with exception 1111
MRP0: Background Media Recovery process shutdown (etsdb)
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT

So an UNNAMED file is created at $ORACLE_HOME/dbs  and MRP0 is stopped again. To fix:

SQL> alter system set standby_file_management=manual;
 
System altered.
 
SQL> alter database create datafile '/apps/opt/oracle/product/11.2.0/dbs/UNNAMED00008' as '/db2st2/u01/oradata/etsdb/dbaets_data_03.dbf';
 
Database altered.
 
SQL> alter system set standby_file_management=auto;
 
System altered.

Case 3 STANDBY_FILE_MANAGEMENT set to AUTO and DB_FILE_NAME_CONVERT  is correct

No issue as expected:
At primary:

ops$oracle@ETSDB> alter tablespace dbaets_data add datafile '/db2/u01/oradata/ettdb/dbaets_data_04.dbf' size 100M;
 
Tablespace altered.

At standby, the alert log shows the datafile is added with the converted file name:

Tue Apr 08 16:56:22 2014
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT
Attempt to start background Managed Standby Recovery process (etsdb)
Tue Apr 08 16:56:22 2014
MRP0 started with pid=33, OS id=12229 
MRP0: Background Managed Standby Recovery process started (etsdb)
 started logmerger process
Tue Apr 08 16:56:27 2014
Managed Standby Recovery starting Real Time Apply
Parallel Media Recovery started with 4 slaves
Waiting for all non-current ORLs to be archived...
All non-current ORLs have been archived.
Media Recovery Log /db2st2/arch/1_1115_790089239.dbf
Media Recovery Log /db2st2/arch/1_1116_790089239.dbf
Media Recovery Log /db2st2/arch/1_1117_790089239.dbf
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT
Recovery created file /db2st2/u01/oradata/etsdb/dbaets_data_04.dbf
Successfully added datafile 9 to media recovery
Datafile #9: '/db2st2/u01/oradata/etsdb/dbaets_data_04.dbf'

Note: The testing dataguard environment is the same as the one that is described in the previous post. However, I did a switchover, so in today's testing, ettdb is primary and etsdb is standby.

Thursday, March 27, 2014

Create a physical standby database through RMAN duplicate active data files

I have a very small database ( ~ 5GB) called etsdb, which hosts an Oracle APEX application called DBAETS for DBA team internal use.

In the past, as a better-than-nothing backup and disaster recovery strategy, I export it every night and ship the export dump file to a remote server. Now I decide to create a physical standby database at the same remote server for it.

The transaction volumn of this application is typically very low as at any given time there could be just two or three DBAs logged into it to update some info. So it is ideal to use RMAN active database duplication technique to create the physical standby. In this method, RMAN duplicate process copies the active data files of the primary database directly to the remote host through network. RMAN supports two basic types of duplication: active database duplication and backup-based duplication. Oracle online documentation has a section about the considerations  to choose duplication technique.


I describe the steps involved in the following:

1. Prepare the production database to be the primary database

a. Ensure that the database is in archivelog mode

b. Enable force logging

    SQL> ALTER DATABASE FORCE LOGGING;

c. Create standby redologs

     SQL> alter database add standby logfile '/db2st2/u01/oradata/etsdb/stby_redo01.rdo' size 100M;

d. Modify the primary initialization parameter for dataguard on primary,
         
alter system set LOG_ARCHIVE_CONFIG='DG_CONFIG=(etsdb,ettdb)';
alter system set LOG_ARCHIVE_DEST_1='LOCATION=/db2st2/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=etsdalter system set LOG_ARCHIVE_DEST_2='SERVICE=ettdb LGWR ASYNC VALID_FOR=(online_LOGFILES, PRIMARY_ROLE) DB_UNIQUE_NAME=ettdb';
alter system set LOG_ARCHIVE_DEST_STATE_1=ENABLE;
alter system set FAL_SERVER=ettdb;
alter system set FAL_CLIENT=etsdb;
alter system set DB_FILE_NAME_CONVERT='/db2st2/u01/oradata/etsdb','/db2/u01/oradata/ettdb' scope=spfile;
alter system set LOG_FILE_NAME_CONVERT='/db2st2/u01/oradata/etsdb','/db2/u01/oradata/ettdb', '/db2st2/u02/oradata/etsdb','/db2/u02/oradata/ettdb' scope=spfile;

2. Ensure that the sql*net connectivity is working fine.

Configure the listener in standby and put the following tns entries in tnsnames.ora:
 ettdb =
   (DESCRIPTION =
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = standby_server)(PORT = 1523))
     )
     (CONNECT_DATA =
       (SERVICE_NAME = ettdb )
     )
   )
  
 etsdb =
   (DESCRIPTION =
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = primary_server)(PORT = 1521))
     )
     (CONNECT_DATA =
       (SERVICE_NAME = etsdb )
     )
   )
Confirm that tnsping ettdb and tnsping etsdb work on both primary and standby sites

3. Create the standby database

a. Copy the password file from the primary $ORACLE_HOME/dbs and rename it to the standby database name.

The username is required to be SYS and the password needs to be the same on the Primary and Standby. The best practice for this is to copy the passwordfile as suggested. The password file name must match the instance name/SID used at the standby site, not the DB_NAME.

orapwetsdb ----> orapwettdb

b. Create a initialization parameter with only one parameter DB_NAME.
  
 -- initettdb.ora 
 DB_NAME=etsdb
 DB_UNIQUE_NAME=ettdb
 DB_BLOCK_SIZE=8192

c. Create the necessary directories in the standby location to place the datafiles and the trace files in the $ADR_HOME.

d. Set the environment variable ORACLE_SID to the standby service and start the standby-instance.
 % export ORACLE_SID=ettdb
 % sqlplus "/ as sysdba"
 SQL> startup nomount pfile=$ORACLE_HOME/dbs/initettdb.ora

e. Verify if the connection 'AS SYSDBA' is working
  sqlplus sys@ettdb as sysdba
  sqlplus sys@etsdb as sysdba

f. On the primary system invoke the RMAN executable and connect to the primary and the auxiliary database ( i.e., the standby)

$ rman target sys/xxx@etsdb auxiliary sys/xxx@ettdb cmdfile=dup.rmn

content of dup.rmn :
duplicate target database for standby from active database
spfile
parameter_value_convert 'etsdb','ettdb'
set db_unique_name='ettdb'
set db_file_name_convert='/db2st2/u01/oradata/etsdb','/db2/u01/oradata/ettdb','/db2st2','/db2'
set log_file_name_convert='/db2st2/u01/oradata/etsdb','/db2/u01/oradata/ettdb', '/db2st2/u02/oradata/etsdb','/db2/u02/oradata/ettdb'
set control_files='/db2/u01/oradata/ettdb/control01.ctl'
set log_archive_dest_1 =  'LOCATION=/db2/arch VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=ettdb' 
set log_archive_max_processes='5'
set fal_client='ettdb'
set fal_server='etsdb'
set standby_file_management='AUTO'
set log_archive_config='dg_config=(etsdb,ettdb)'
set log_archive_dest_2='service=etsdb ASYNC valid_for=(ONLINE_LOGFILE,PRIMARY_ROLE) db_unique_name=etsdb'
;

4. Connect to standby using SQL*Plus and start the MRP ( Managed Recovery Process)

-- mount db first if required

alter database recover managed standby database disconnect from session;

5. If licensed and want to use Active Dataguard (ADG), than open the Standby Database in READ ONLY and start the revoer

Enable Active Data Guard using SQL Plus :

SQL> alter database recover managed standby database cancel;
SQL> alter database open read only;
SQL> alter database recover managed standby database disconnect using current logfile;

-- verify
SQL> select name, db_unique_name, database_role, open_mode from v$database;
 
NAME      DB_UNIQUE_NAME                 DATABASE_ROLE    OPEN_MODE
--------- ------------------------------ ---------------- --------------------
ETSDB     ettdb                          PHYSICAL STANDBY READ ONLY WITH APPLY


testing to confirm that standby db can be read and meanwhile redo logs are being applied:
in standby

create table dbaets.t as select * from all_objects
                                       *
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-16000: database open for read-only access
 


in primary

  1* select thread#, max(sequence#) from v$log_history group by thread#
SQL> /
 
   THREAD# MAX(SEQUENCE#)
---------- --------------
         1           1057
 
SQL> create table dbaets.t as select * from all_objects;
 
Table created.
 
SQL> alter system archive log current;
 
System altered.
 
SQL> select thread#, max(sequence#) from v$log_history group by thread#
  2  ;
 
   THREAD# MAX(SEQUENCE#)
---------- --------------
         1           1058


in standby:



SQL> select thread#, max(sequence#) from v$log_history group by thread#;
 
   THREAD# MAX(SEQUENCE#)
---------- --------------
         1           1058


SQL> select count(*) from dbaets.t;
 
  COUNT(*)
----------
     17855

References:

Step by Step Guide on Creating Physical Standby Using RMAN DUPLICATE...FROM ACTIVE DATABASE (Doc ID 1075908.1)

RMAN duplicate: http://docs.oracle.com/cd/E11882_01/backup.112/e10642/rcmdupdb.htm#BRADV298

Tuesday, March 25, 2014

Installation of Oracle Big Data Lite Virtual Machine on My New Desktop PC


I bought a new desktop PC - Lenovo ThinkCentre M83 with 32 GB RAM and 64-bit Windows 7. The intention is to use it as a learning environment for something like Oracle database 12c as I know it requires 64-bit OS, which I did not have at home previously. While trying to find an Oracle 12c virtual machine to install, I came across the Oracle Big Data Lite Virtual Machine page. I am excited to see that this virtual machine really contains lots of good stuff I am interested in, including the Oracle 12c database. It could be a great learning tool. So I downloaded and installed the Oracle Virtual Box, and I downloaded the virtual machine files and imported it to the Virtual Box. However when powering on the machine, I got:
 
Failed to open a session for the virtual machine xxx
          VT-X is disabled in the BIOS
          (VERR_VMX_MSR_VMXON_DISABLED)

To fix this, I need to go to the vendor's website to download and install the latest BIOS driver and use a tool provided by the vendor called CFGWIN.exe to change Bios setting.
 
To flash the BIOS from operating system, I need to issue the following command:
 
c:\SWTOOLS\FLASH\FBJY74USA>flash.cmd

To capture the current BIOS settings into a file:


C:\SWTOOLS\FLASH\FBJW73USA>CFGWIN.exe /c /path:c:\settings.txt
File Path = "c:\settings.txt".
All items have been captured successfully!

To change a BIOS settings:


C:\SWTOOLS\FLASH\FBJW73USA>CFGWIN.exe /r /path:C:\Users\Denis\Documents\settings_vtx.txt
File Path = "C:\Users\Denis\Documents\settings_vtx.txt".
Restore all BIOS settings successfully!

The settings_vtx.txt contains one line:

Intel(R) Virtualization Technology,Enabled;[Optional:Disabled,Enabled]




Basically after change the bios parameter: Intel(R) Virtualization Technology to Enabled. I bypassed the first errror shown above. Now I got something different:

This kernal requires an x86-64 CPU, but only detected an i686 CPU.

To fix this, I change the OS version from 32 bit to 64 bit in the virtual machine General tab.

Finally, I have the environment. I call it a day, happy ending. I was a little worried that I might end up unable to run the virtual machine on this new PC. Nowdays, it is easy to set up learning environment with pre-built virtual machines, just  so little time and so much to learn!








Friday, February 14, 2014

Update selected rows of a large table through rowid


We have a requirement to update a date column of about 14M rows in a large table of billions of rows .  The challenge is we have limited downtime to perform such a task.
In a Sun SPARC T4-2 machine with 128 virtual CPUs, we tested two different join methods with parallel executions:

-- nested loop took about 8h30
UPDATE TTQ_DQG.DQG_PRODUCT b
    SET billing_end_date = SYSDATE
    where exists (select /*+ parallel (a 12) */ 1 from DBSCRUB.ISOC_MAPPING_5M a where b.bo_id=a.old_bo_id);

-- hash join took about 5h
UPDATE /*+ parallel (b 12) */TTQ_DQG.DQG_PRODUCT b
    SET billing_end_date = SYSDATE
    where exists (select 1 from DBSCRUB.ISOC_MAPPING_5M a where b.bo_id=a.old_bo_id)
   ;

The 5h execution time still does not meet our downtime requirement. I proposed to perform the task in two steps.

(1) create a rowid driver table outside downtime window
(2) update the table with rowid in the downtime window

The test in the same machine shows we can do it in 41 min!
SQL>
SQL> --Step 1. Create a rowid driver table  several hours before downtime
SQL>
SQL> CREATE table  dbscrub.dqg_product_5m_rowid nologging
  2              AS
  3              select /*+ full(a) parallel( a 10) */ a.rowid rid from  TTQ_DQG.dqg_product a
  4              where exists ( select 1 from dbscrub.ISOC_MAPPING_5M b where a.bo_id=b.old_bo_id);
Table created.
Elapsed: 00:22:01.71
SQL>
SQL>
SQL>
SQL>
SQL> --Step 2 update table with rowid access in the downtime window
SQL>
SQL>
SQL> UPDATE  TTQ_DQG.CPF_PRODUCT b
  2  SET billing_end_date = SYSDATE
  3  where b.rowid in  (select /*+ parallel (a, 12) */  rid from DBSCRUB.dqg_product_5m_rowid a)
  4  ;
14431117 rows updated.
Elapsed: 00:41:34.28
SQL>
SQL>
SQL> spool off

Is this approach safe, any concerns? You may check this AskTom thread:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:53140678334596

Monday, January 27, 2014

Move ASM database files from one diskgroup to another

In the past week, I have been involved in a migration of a standalone database to RAC with ASM through datab pump export and import.  After the migration was done, one of the ASM diskgroup is more than 98% percent used, which triggerred alerts from BMC Patrol, which is a standard monitoring tool in my working environment. Searching My Oracle Support website, I found the following note:

How to move ASM database files from one diskgroup to another ? (Doc ID 330103.1)


The note is pretty straight forword, however, there was a one suprise when I tried to offline a datafile :

ORA-01145: offline immediate disallowed unless media recovery enabled

That was because I put the db in noarchivelog mode in order to speed up the data pump import jobs. After I configured the archivelog mode, thing went fine.

Here are the steps I followed for future reference

1) Identify the data file to be moved

+DATA_01/roeprdsc/datafile/non_ttq_clb_msg_01.dbf            NON_TTQ_CLB_MSG
2)  Identify the diskgroup on to which the file has to be moved.

+DATA_09/roeprdsc/datafile

3) Take the file offline.

 ALTER DATABASE DATAFILE '+DATA_01/roeprdsc/datafile/non_ttq_clb_msg_01.dbf' OFFLINE;

4. Copy the data file

I used DBMS_FILE_TRANSFER package. We can also use RMAN, detail see the note.
     
          
create or replace directory orcl1 as '+DATA_01/roeprdsc/datafile';
        
create or replace directory orcl2 as '+DATA_09/roeprdsc/datafile';
      
 Note, if need to create a directory in a diskgroup, syntax looks like


 ASM> Alter disgroup asmdsk2 add directory  '+asmdsk2/test';        
       
BEGIN
 DBMS_FILE_TRANSFER.COPY_FILE(
 source_directory_object => 'ORCL1',
 source_file_name => 'non_ttq_clb_msg_01.dbf',
 destination_directory_object => 'ORCL2',
 destination_file_name => 'non_ttq_clb_msg_01.dbf');
END;                          
/



5) Rename the file to point to new location.

ALTER DATABASE RENAME FILE '+DATA_01/roeprdsc/datafile/non_ttq_clb_msg_01.dbf' TO
 '+DATA_09/roeprdsc/datafile/non_ttq_clb_msg_01.dbf' ;


6) Recover the file.

  RECOVER DATAFILE '+DATA_09/roeprdsc/datafile/non_ttq_clb_msg_01.dbf'

7) Bring the file online.


 ALTER DATABASE DATAFILE '+DATA_09/roeprdsc/datafile/non_ttq_clb_msg_01.dbf' online;


8) Verify the new file location.
         
SQL> SELECT FILE_NAME FROM DBA_DATA_FILES where file_name like '%clb_msg%';

FILE_NAME
--------------------------------------------------------------------------------
...
+DATA_09/roeprdsc/datafile/non_ttq_clb_msg_01.dbf
+DATA_01/roeprdsc/datafile/non_ttq_clb_msg_02.dbf
+DATA_01/roeprdsc/datafile/non_ttq_clb_msg_03.dbf
+DATA_01/roeprdsc/datafile/non_ttq_clb_msg_04.dbf
+DATA_01/roeprdsc/datafile/non_ttq_clb_msg_05.dbf

..



9) Delete the file from its original location either per SQLPLUS or per ASMCMD:

Syntax examples:

SQL:ASM> ALTER DISKGROUP ASMDSK2 DROP FILE users.256.565313879;

or:   ASMCMD> rm -rf 

Note:
Most Automatic Storage Management files do not need to be manually deleted because, as Oracle managed files, they are removed automatically when they are no longer needed. ( To not to have the alert come again, I have to do this)



Tuesday, January 14, 2014

Reference: Clean up SharePlex Queues

We use SharePlex replication as our DR solution for couple of applicatins. Last Saturday night, we did a DR test for one of the very important applications, but we could not make replication working from DR site to other target. As we ran out of time in the window, we just brought up application back to the production site without fixing issues. Today I involved vendor support to test DR configuration (with a dummy configuration) again to make sure that if real DR situation happens replication will work. I have learned that the key is that we should clean up orphan or corrupted queues before activating a configuration in DR. ( note: when we issue deactivate config , suppose all associated queues with this particular conifg will be gone, if not, those left queues need to be cleaned up).


Below are the steps to clean up SharePlex queues for future reference:


1. shutdown or shutdown force at source and target
2. qview -i
3. qview> qsetup
4. qview> qstatus
5. qview> deleteq p   -- for post
   qview> deleteq x   -- for export
6. On target:  truncate splex.shareplex_trans;