Friday, June 20, 2008

Speed up the optimizer analyze job in a replicated environment

In an application, the production database is replicated to a reporting database and a DR database through SharePlex replication technology. The analyze jobs for the optimizer stats run on three instances bi-weekly. The problem is that the analyze job takes as long as 24 hours in the reporting database, which impact other jobs seriously. I have proposed a new procedure in order to reduce the time needed for the analyze job and potentially allow collectng more accurate optimizer stats within the time window available. The basic idea is to distribute the job among three instances and then combine the results, kind of divide-and-conquer.

Below I describe the steps. And overall time to obtain the stats was about 7-8 hours from the first time implementation, which involvs some manual steps. I think it is possible to automate the whole procedure if needed.


Steps:


1. Backup the current stats in PROD, RPT and DR instances:

1.1 create_stat_table

begin
DBMS_STATS.CREATE_STAT_TABLE (
'DB_ADMIN',
'MYSCHMEA_STATS_052508'
);
end;
/

1.2 export_schema_stats

begin
DBMS_STATS.EXPORT_SCHEMA_STATS (
'MYSCHEMA',
'MYSCHEMA_STATS_052508' ,
NULL,
'DB_ADMIN');
end;
/

2. Analyze group A table stats in PRD

Note: all tables that under MYSCHEMA are divided into two groups: group A and group B

It took about 4.5 hrs

3. Analyze group B table stats in DR

It took about 4.5 hrs

4. Analyze additional index stats in rpt

Note: Some indexes only exist in the reporting database

It took about 5.5 hrs

5. Sync up stats among three database

5.1 In PRD: export group A table/index stats

a. create stats table

begin
DBMS_STATS.CREATE_STAT_TABLE (
'DB_ADMIN',
'TAB_A_STATS',
NULL);
end;
/

b. run script: exp_tab_a_stats.sql



5.2 In DR: export group B table/index stats

begin
DBMS_STATS.CREATE_STAT_TABLE (
'DB_ADMIN',
'TAB_B_STATS',
NULL);
end;
/

script: exp_tab_b_stats.sql

5.3 In PRD: import group B table/index stats
a.
create table db_admin.tab_b_stats as
select * from db_admin.tab_b_stats@DB_DR;

b.
run script: imp_tab_b_stats.sql

Verify:

select table_name, last_analyzed from dba_tables where owner='MYSCHEMA' and last_analyzed < sysdate -1;

select index_name, last_analyzed from dba_indexes where owner='MYSCHEMA' and last_analyzed < sysdate -1;

5.4 RRT: import group A table/index stats


create table db_admin.tab_b_stats
as select * from db_admin.tab_b_stats@DB_DR;

create table db_admin.tab_a_stats
as select * from db_admin.tab_a_stats@DB_PRD;

script: imp_tab_a_stats.sql

5.5 Rpt: import group B table/index stats
script: imp_tab_b_stats.sql

5.6 Dr: import group A table/index stats
create table db_admin.tab_a_stats
as select * from db_admin.tab_a_stats@DB_PRD_lnk;

script: imp_tab_a_stats.sql


Note: to generate the imp/exp table stats scripts, I used the AWK script: for example:


# exp_stats.awk
{
print "begin"
print " dbms_stats.export_table_stats("
print " 'MYSCHEMA', "
print " '" $1 "',"
print " stattab => 'TAB_B_STATS' ,"
print " statown => 'DB_ADMIN' ,"
print " cascade => TRUE, "
print " );"
print "end;"
print "/"
print " "
}


# imp_stats.awk
{
print "begin"
print " dbms_stats.import_table_stats("
print " 'MYSCHEMA', "
print " '" $1 "',"
print " stattab => 'TAB_B_STATS' ,"
print " statown => 'DB_ADMIN' ,"
print " cascade => TRUE "
print " );"
print "end;"
print "/"
print " "
}

No comments: