Tuesday, March 03, 2009

Discovering the I/O pattern of a production database through mining STATSPACK data

1. Total I/O in every hour during weekdays

Total I/O is defined as the number of physical reads plus the number of physical writes. They can be obtained throught the following two scripts, respectively.


---- Sample Script to obtain physical reads ---------
select
to_char(sn.snap_time, 'YYYY-MM-DD HH24') start_time,
sum(b.value - a.value) val
from perfstat.stats$sysstat a,
perfstat.stats$sysstat b,
perfstat.stats$snapshot sn
where sn.snap_id = a.snap_id
and a.snap_id = b.snap_id -1
and a.statistic# = 42
and b.statistic# = 42
and snap_time >= to_date('2009-01-26', 'YYYY-MM-DD')
and snap_time < to_date('2009-01-31', 'YYYY-MM-DD')
group by to_char(sn.snap_time, 'YYYY-MM-DD HH24')
;
---- END of Sample Script to obtain physical reads ---------

---- Sample Script to obtain physical writes---------

select
to_char(sn.snap_time, 'YYYY-MM-DD HH24') start_time,
sum(b.value - a.value) val
from perfstat.stats$sysstat a,
perfstat.stats$sysstat b,
perfstat.stats$snapshot sn
where sn.snap_id = a.snap_id
and a.snap_id = b.snap_id -1
and a.statistic# = 46
and b.statistic# = 46
and snap_time >= to_date('2009-01-26', 'YYYY-MM-DD')
and snap_time < to_date('2009-01-31', 'YYYY-MM-DD')
group by to_char(sn.snap_time, 'YYYY-MM-DD HH24')
;
---- END of Sample Script to obtain physical writes---------



2. Average Time Per Read (ATPR) in every hour during weekdays


In the Statspack report, there is a section called "Tablespace I/O Statistics", along with the "File I/O section" they can be used to identify whether I/O is especially slow or there are an exceptional number of I/Os on any specific data file or tablespaces. There is a field called "Av Rd (ms)", representing average time per read in milli-second. Generally speaking, 20ms - 40ms reads may be considered slow for single block reads. In this post, I used a script to count the number of tablespaces that have ATPR greater than 100ms.


----- Sample script to obtain the number of tablespaces
------ that have ATPR greater than 100ms
select mydate, sum(case when atpr_ms > 100 then 1 else 0 end)
from (
select
to_char(snap_time,'yyyy-mm-dd HH24') mydate,
old.tsname,
sum(new.phyrds-old.phyrds) phy_rds,
sum(new.phywrts-old.phywrts) phy_wrts,
sum(new.readtim-old.readtim) read_tim,
sum(new.writetim-old.writetim) write_tim,
decode(sum(new.phyrds-old.phyrds), 0,0,
sum(new.readtim-old.readtim) *10 / sum(new.phyrds-old.phyrds) ) atpr_ms
from
perfstat.stats$filestatxs old,
perfstat.stats$filestatxs new,
perfstat.stats$snapshot sn
where
old.snap_id = sn.snap_id
and
old.filename = new.filename
and
new.snap_id = sn.snap_id + 1
and
(new.phyrds-old.phyrds) > 0
and snap_time >= to_date('2009-01-26', 'YYYY-MM-DD')
and snap_time < to_date('2009-01-31', 'YYYY-MM-DD')
group by
to_char(snap_time,'yyyy-mm-dd HH24'),
old.tsname
)
group by mydate
;


3. Grapth the I/O pattern:

Sample Graph - Total IO changes with every hour
Sample Graph - Average Time Per Read Count

1 comment:

yds said...

Jonathan Lewis summarized three reasons to use statspack in this post . I found that my post fits the reason 1- simply to observe patterns and trends over time