The SQL has the following structure (actual table names, column name were modified of course):
SELECT *
FROM a,
bi,
bm
WHERE bm.bsc = 3
AND bm.bsci = 'S'
AND a.bid = bi.bid
AND a.bid = bm.vp_bid
AND
NOT EXISTS (
SELECT 1
FROM bm bm1
WHERE bm.vp_bid = bm1.vp_bid
AND bm1.bsci = 'S'
AND bm1.bsc NOT IN (3, 4));
union
SELECT *
FROM a,
bi,
bm
WHERE bm.bsc = 3
AND bm.bsci = 'S'
AND a.bid = bi.bid
AND a.bid = bm.vp_bid
AND
EXISTS (
SELECT 1
FROM bm bm1
WHERE bm.vp_bid = bm1.vp_bid
AND bm1.bsci = 'S'
AND bm1.bsc NOT IN (3, 4));
The DBA initially tried to do a dry run using hint parallel 5 for all the three tables. He said using parallel hint he got smaller cost value from the explain plan. I told him if he used parallel 10, he would get even smaller cost value, but this did not mean the SQL would run faster. I told him just start the SQL without any hint and see how long it would take.
The number of rows of the table involved are as follows:
TABLE_NAME NUM_ROWS LAST_ANALYZED
----------------- ---------- -------------------
BM 309688610 2009-01-03 07:22:04
A 15837920 2009-01-03 10:54:31
BI 15840300 2009-01-03 11:08:33
Our databae is Oracle 9.2.0.8.
It turned out the SQL did not finish after 5.5 hours running. The execution plan looked like:
------------------------------------------------------------------------
|Id | Operation | Name | Rows| Bytes |Temp |
Spc |Cost |
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | | 11M|
| 1 | SORT UNIQUE | | 110M| 9017M| 19G| 11M|
| 2 | UNION-ALL | | | | | |
| 3 | FILTER | | | | | |
| 4 | HASH JOIN | |1124K| 79M| 61M| 334K|
| 5 | HASH JOIN | |1124K| 48M| 38M| 305K|
| 6 | TABLE ACCESS FULL | BM |1124K| 25M| | 271K|
| 7 | TABLE ACCESS FULL | A | 15M| 319M| |24889|
| 8 | TABLE ACCESS FULL | BI | 15M| 440M| |17719|
| 9 | TABLE ACCESS BY INDEX ROWID| BM | 5 | 60 | | 8 |
|10 | INDEX RANGE SCAN | BM_FK5| 13 | | | 4 |
|11 | HASH JOIN | | 108M| 8937M|1844M| 703K|
|12 | HASH JOIN | | 22M| 1587M| 772M| 359K|
|13 | TABLE ACCESS FULL | BM | 22M| 514M| | 71K|
|14 | HASH JOIN | | 15M| 759M| 501M|60525|
|15 | TABLE ACCESS FULL | A | 15M| 319M| |24889|
|16 | TABLE ACCESS FULL | BI | 15M| 440M| |17719|
|17 | TABLE ACCESS FULL | BM | 115M| 1323M| | 271K|
------------------------------------------------------------------------
We were worried about this SQL, our manager set up a call with developers at about 7PM. I learned from them that we can use inion all instead of union since there are no possible dulpicate rows from the two parts. Finally we agreed with runing the two parts of the SQL separatly and load the results to two separate temp table. And DBA team only need diliver these two temp talbe to the devlopers.
I took over the task. I started a new test and using the nested loop hint for the first part:
SELECT /*+ use_nl(bm, a) use_nl(a, bi) */
*
FROM a,
bi,
bm
WHERE bm.bsc = 3
AND bm.bsci = 'S'
AND a.bid = bi.bid
AND a.bid = bm.vp_bid
AND
NOT EXISTS (
SELECT 1
FROM bm bm1
WHERE bm.vp_bid = bm1.vp_bid
AND bm1.bsci = 'S'
AND bm1.bsc NOT IN (3, 4));
The execution plan for this part is as follows:
------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 14M|
| 1 | FILTER | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | BI | 1 | 29 | 3 |
| 3 | NESTED LOOPS | | 1106K| 78M| 5797K|
| 4 | NESTED LOOPS | | 1106K| 47M| 2479K|
| 5 | TABLE ACCESS FULL | BM | 1106K| 25M| 267K|
| 6 | TABLE ACCESS BY INDEX ROWID| A | 1 | 21 | 2 |
| 7 | INDEX UNIQUE SCAN | A_PK | 1 | | 1 |
| 8 | INDEX RANGE SCAN | BI_PK | 1 | | 2 |
| 9 | TABLE ACCESS BY INDEX ROWID | BM | 5 | 60 | 8 |
| 10 | INDEX RANGE SCAN | BM_FK5| 13 | | 4 |
------------------------------------------------------------------------
I used parallel hint for the second part:
SELECT /*+ parallel (bm, 4) */
*
FROM a,
bi,
bm
WHERE bm.bsc = 3
AND bm.bsci = 'S'
AND a.bid = bi.bid
AND a.bid = bm.vp_bid
AND
EXISTS (
SELECT 1
FROM bm bm1
WHERE bm.vp_bid = bm1.vp_bid
AND bm1.bsci = 'S'
AND bm1.bsc NOT IN (3, 4));
And the execution plan looks like:
----------------------------------------------------------------------------
Id | Operation | Name|Rows|Bytes|Temp|Cost | TQ |IN-OUT|PQ
|Spc | | | |Distrib
----------------------------------------------------------------------------
0 | SELECT STATEMENT | | | | | 404K| | |
1 | HASH JOIN | | 22M|1814M|154M| 404K|98,00| P->S |QC(RAND)
2 | TABLE ACCESS FULL | BI | 15M| 438M| |17565|98,00| S->P | HASH
3 | HASH JOIN | | 22M|1202M|124M| 378K|98,00| PCWP |
4 | TABLE ACCESS FULL | A | 15M| 317M| |24624|98,00| S->P | HASH
5 | HASH JOIN SEMI | | 22M| 759M|189M| 348K|98,00| PCWP |
6 | TABLE ACCESS FULL| BM | 22M| 506M| |66931|98,00| P->P | HASH
7 | TABLE ACCESS FULL| BM |113M|1301M| | 267K|98,00| S->P | HASH
----------------------------------------------------------------------------
Good news was the first part finished in 3 hours, and the second part took 2 h 40 min. On Saturday afternoon, I did the real run and it took a little longer but finished in 4 hours.
At this point, I am not sure I chose the best way to accomplish this task. I just ensured that we deliverd as we promised. I don't know if it could be better if we just adopt the approach of using parallel hint 5 for all the three tables. It was too late for DBA to having sufficient time to find the best way to execute this SQL.
I will re-visit this SQL if I have time. Acutally I already invest some time to understand anti-join this evening - the first part of this SQL is acutally anti-join, but I have not seen Oracle choose anti-join access path in the current execution plan.
No comments:
Post a Comment