Tuesday, April 14, 2015

Monitoring Oracle GoldenGate Latency

Today I've implemented an approach to monitor OGG latency. Here I will describe what I've done.  

1. Create a table gg_latency in source and target databases:
create table gg_latency
(
  extr varchar2(10),
  pump varchar2(10),
  repl varchar2(10),
  update_time date
);



alter table gg_latency add constraint gg_latency_pk primary key(extr, pump, repl) using index;

2. Create a procedure that is used to update the latency table: 

create or replace procedure proc_update_gg_latency
is
begin
  for rec in ( select * from gg_latency)
  loop
     update gg_latency set update_time=sysdate where extr=rec.extr and pump=rec.pump and repl = rec.repl;
     commit;
  end loop;
end;
/
3. Populate the table with every possible combination of the processing group names:

For example, in my replication enviroment, at source I have three Extract groups, three Pump groups, at target I have 15 Replicat groups :
insert into gg_latency(extr, pump, repl) values('ECRUDR1', 'PCRUDR1', 'CRURDR1');
insert into gg_latency(extr, pump, repl) values('ECRUDR1', 'PCRUDR1', 'CRURDR1A');
insert into gg_latency(extr, pump, repl) values('ECRUDR1', 'PCRUDR1', 'CRURDR1B');
insert into gg_latency(extr, pump, repl) values('ECRUDR1', 'PCRUDR1', 'CRURDR1C');
insert into gg_latency(extr, pump, repl) values('ECRUDR1', 'PCRUDR1', 'CRURDR1D');
  
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2A');
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2B');
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2C');
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2D');
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2F');
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2G');
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2H');
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2M');
insert into gg_latency(extr, pump, repl) values('ECRUDR2', 'PCRUDR2', 'CRURDR2N');

insert into gg_latency(extr, pump, repl) values('ECRUDR3', 'PCRUDR3', 'CRURDR3');

4. For each EXTRACT group parameter file at source, add the TABLE clause with WHERE option for the GG_LATENCY table , e.g.

    TABLE DB_ADMIN.GG_LATENCY WHERE ( EXTR="ECRUDR1");

Note: do this for all the EXTRACT groups


5. For each PUMP group parameter file at source, add the TABLE clause with WHERE option for the GG_LATENCY table , e.g.
    TABLE DB_ADMIN.GG_LATENCY, WHERE (PUMP="PCRUDR1");
Note: add the line before the PASSTHRU if exists, do this for all the PUMP groups


 6. For each REPLICAT group parameter file at target, add MAP clause with WHERE option for the GG_LATENCY table , e.g.

MAP DB_ADMIN.GG_LATENCY, TARGET DB_ADMIN.GG_LATENCY, WHERE (REPL='CRURDR1');
 
Note: do this for all the REPLICAT groups. In 12c OGG, single quotation mark should be used for literal string.

7. Bounce all processes as parameter files are modified 


8. Create a scheduler job to update the latency table every minute
begin
DBMS_SCHEDULER.create_job (
   job_name         => 'UPDATE_GG_LATENCY_TABLE',
    job_type        => 'PLSQL_BLOCK',
    job_action      => 'BEGIN db_admin.proc_update_gg_latency; END;',
    start_date      => trunc(sysdate, 'HH24'),
    repeat_interval => 'freq=minutely',
    end_date        => NULL,
    enabled         => TRUE
    );
end;
/

9. Check latency by the following query at target:
 
SQL> select extr, pump, repl, update_time, round((sysdate - update_time) *24*60) latency_mins from gg_latency;

EXTR       PUMP       REPL       UPDATE_TIME          LATENCY_MINS
---------- ---------- ---------- -------------------- ------------
ECRUDR1    PCRUDR1    CRURDR1D   14-Apr-2015 12:46:00            1
ECRUDR1    PCRUDR1    CRURDR1B   14-Apr-2015 12:46:00            1
ECRUDR1    PCRUDR1    CRURDR1A   14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2D   14-Apr-2015 12:46:00            1
ECRUDR1    PCRUDR1    CRURDR1C   14-Apr-2015 12:46:00            1
ECRUDR1    PCRUDR1    CRURDR1    14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2H   14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2C   14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2N   14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2B   14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2A   14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2M   14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2G   14-Apr-2015 12:46:00            1
ECRUDR2    PCRUDR2    CRURDR2F   14-Apr-2015 12:46:00            1
ECRUDR3    PCRUDR3    CRURDR3    14-Apr-2015 12:46:00            1

15 rows selected.

Note: As we update every minute, the smallest unit for latency is a minute.