Wednesday, April 22, 2009

Find long running SQLs in a reporting database

We have a reporting database (real sid is changed to 'mydbrpt' in the post), there are usually long running SQLs in it. Usually I check v$sessiong_longops to see them. The problem is that not all the operations could appear in this view. Today I found a long connecting session by ps -ef. I think this is a valid alternative and also wrote an AWK script to make it easy to find the long running sessions/sqls

If I type the following command at OS level:
ps -ef grep mydbrpt

I will see hundreds of line similar to the following:


......
oracle 3106 1 0 12:18:07 console 0:01 oraclemydbrpt (LOCAL=NO)
oracle 19386 1 3 15:00:40 console 20:44 ora_p004_mydbrpt
oracle 2631 4115 0 06:01:03 console 1:08 oraclemydbrpt (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 10910 14823 0 15:46:27 console 0:00 oraclemydbrpt (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 4152 4115 0 Apr 14 console 38:49 oraclemydbrpt (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 21996 1 0 15:06:25 console 0:01 oraclemydbrpt (LOCAL
=NO)
......


Noticed the field 7 or field 8. It shows the minutes of the server process that is running. So I wrote a small AWK script to filter out those processes running more than 100 min as follows:



$ cat ps.awk
{
if ( $7 == "console" )
split($8 ,a, ":")
else
split($7 ,a, ":")
if ( a[1] > 100 )
print $0
}



Then I executed the following command and got:


$ ps -ef grep mydbrpt awk -f ps.awk
oracle 18100 1 0 Mar 14 console 1883:49 ora_dbw2_mydbrpt
oracle 18088 1 0 Mar 14 console 2218:00 ora_dbw0_mydbrpt
oracle 18102 1 0 Mar 14 console 1831:44 ora_dbw3_mydbrpt
oracle 18108 1 0 Mar 14 console 450:24 ora_ckpt_mydbrpt
oracle 18093 1 0 Mar 14 console 2013:48 ora_dbw1_mydbrpt
oracle 18086 1 0 Mar 14 console 181:00 ora_pmon_mydbrpt
oracle 18118 1 0 Mar 14 console 606:40 ora_arc0_mydbrpt
oracle 18106 1 0 Mar 14 console 4377:12 ora_lgwr_mydbrpt
oracle 11329 1 3 Apr 16 console 8500:06 oraclemydbrpt (LOCAL=NO)
oracle 21487 1 1 07:44:30 console 127:23 oraclemydbrpt (LOCAL=NO)
oracle 2999 2955 1 05:00:02 console 220:51 oraclemydbrpt (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
oracle 6658 6652 0 Mar 27 console 136:19 oraclemydbrpt (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))



 

Found out the pid=11329 is from a user session which may use unreasonable execution plan:



-------------------------------------------------------------------------------
Id Operation Name Rows Bytes TempSpc Cost
-------------------------------------------------------------------------------
0 SELECT STATEMENT 33M
1 SORT AGGREGATE 1 28
2 HASH JOIN 638G 16T 199M 33M
3 TABLE ACCESS FULL CP 13M 49M 1598K
4 MERGE JOIN CARTESIAN 24G 539G 19M
5 TABLE ACCESS FULL OF 82 1722 2
6 BUFFER SORT 294M 842M 19M
7 INDEX FAST FULL SCAN BM_FK4 294M 842M 232K
-------------------------------------------------------------------------------





But I can not see this session in v$session_longops at all.

No comments: