SELECT distinct C.INT_PSEFS_ID ,C.ACCOUNT_ID ,C.PSEFS_ID ,C.IS_PQ ,C.psefs_save_date FROM MYSCH.PSEFS_MSTR A ,MYSCH.PSEFS_DTLS B ,MYSCH.PSEFS_MSTR C WHERE A.INT_PSEFS_ID = B.INT_PSEFS_ID AND C.PSEFS_ID LIKE SUBSTR (A.PSEFS_ID, 1, 13)|| '%' AND A.IS_VALID = 'N' AND B.PSEFS_STATUS_ID IN (4, 6) AND A.PSEFS_COMP_DATE <= SYSDATE - 45 ;The execution plan is as follows:
--------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 66T| 4861T| | 3064G (2)|999:59:59 | | 1 | HASH UNIQUE | | 66T| 4861T| 11P| 3064G (2)|999:59:59 | | 2 | TABLE ACCESS BY INDEX ROWID| PSEFS_MSTR | 3287K| 122M| | 149K (1)| 00:29:49 | | 3 | NESTED LOOPS | | 66T| 4861T| | 3027G (1)|999:59:59 | |* 4 | HASH JOIN | | 20M| 794M| 550M| 839K (3)| 02:47:49 | |* 5 | TABLE ACCESS FULL | PSEFS_DTLS | 26M| 250M| | 612K (2)| 02:02:29 | |* 6 | TABLE ACCESS FULL | PSEFS_MSTR | 20M| 600M| | 156K (6)| 00:31:17 | |* 7 | INDEX RANGE SCAN | PSEFS_MSTR_IX5 | 591K| | | 5154 (1)| 00:01:02 | ---------------------------------------------------------------------------------------------------------
INT_PSEFS_ID is a PK of A, and a FK of B, so A join B on INT_PSEFS_ID will generate a result set (refer to as R1) with duplicated records of A; Then we selected out row of C (same table as A) by join it with R1t based on an unusual join condition with LIKE operator. To see the problem by way of example, supposing we have a row in C with PSRFS_ID='ABCDEFGHIJKLM_2'; In R1, we could have several rows with PSRFS_ID in('ABCDEFGHIJKLM_3', 'ABCDEFGHIJKLM_4', 'ABCDEFGHIJKLM_4', 'ABCDEFGHIJKLM'), due to the LIKE, the row in C will match 4 rows in R1, thus generate 4 duplicated rows in the final result set. I rewrote the SQL to use sub-queries to enable the more efficient semi-joins intead of normal joins
select C.INT_PSEFS_ID , C.ACCOUNT_ID , C.PSEFS_ID,IS_PQ , C.psefs_save_date from MYSCH.PSEFS_MSTR C where SUBSTR(C.PSEFS_ID,1,13) in ( select SUBSTR(A.PSEFS_ID,1,13) from MYSCH.PSEFS_MSTR A where A.IS_VALID = 'N' and A.PSEFS_COMP_DATE <= SYSDATE - 45 and exists ( select 1 from MYSCH.PSEFS_DTLS B where A.int_psefs_id = B.int_psefs_id and B.PSEFS_STATUS_ID IN (4, 6) ) ) ;Now the execution plan looks like:
------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 26M| 1182M| | 1174K (3)| 03:54:49 | |* 1 | HASH JOIN RIGHT SEMI | | 26M| 1182M| 387M| 1174K (3)| 03:54:49 | | 2 | VIEW | VW_NSO_1 | 20M| 155M| | 839K (3)| 02:47:49 | |* 3 | HASH JOIN RIGHT SEMI| | 20M| 794M| 550M| 839K (3)| 02:47:49 | |* 4 | TABLE ACCESS FULL | PSEFS_DTLS | 26M| 250M| | 612K (2)| 02:02:29 | |* 5 | TABLE ACCESS FULL | PSEFS_MSTR | 20M| 600M| | 156K (6)| 00:31:17 | | 6 | TABLE ACCESS FULL | PSEFS_MSTR | 65M| 2445M| | 153K (4)| 00:30:45 | ------------------------------------------------------------------------------------------------
Fig 1. Good query visual representation |
Fig 2. Bad query visual representaion |
Note: though this sql is actually rather simple, the visual approach advocated by Jonathan Lewis in this article really helps me to do the analysis
No comments:
Post a Comment