First of all, let's see where the cardinality estimation is wrong. Below are the estimated cardinality from each operations in the first 4-table join: PACKAGE_MARKETS -> MARKETS -> PACKAGE_CATEGORY -> PACKAGE_PRODUCTS
10 9 NESTED LOOPS (Cost=10 Card=1 Bytes=522)
11 10 NESTED LOOPS (Cost=9 Card=1 Bytes=325)
12 11 NESTED LOOPS (Cost=4 Card=1 Bytes=229)
13 12 NESTED LOOPS (Cost=3 Card=1 Bytes=224)
14 13 NESTED LOOPS (Cost=2 Card=1 Bytes=104)
15 14 INDEX (UNIQUE SCAN) OF 'PK_PACKAGE_MARKETS' (UNIQUE) (Cost=1 Card=1 Bytes=12)
16 14 TABLE ACCESS (BY INDEX ROWID) OF 'MARKETS' (Cost=1 Card=1 Bytes=92)
17 16 INDEX (UNIQUE SCAN) OF 'PK_MARKETS' (UNIQUE)
18 13 TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGES' (Cost=1 Card=1 Bytes=120)
19 18 INDEX (UNIQUE SCAN) OF 'PK_PACKAGES' (UNIQUE)
20 12 TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGE_CATEGORY' (Cost=1 Card=1 Bytes=5)
21 20 INDEX (UNIQUE SCAN) OF 'PK_PACKAGE_CATEGORY' (UNIQUE)
22 11 TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGE_PRODUCTS' (Cost=5 Card=1 Bytes=96)
23 22 INDEX (RANGE SCAN) OF 'PK_PACKAGE_PRODUCTS' (UNIQUE) (Cost=2 Card=6)
24 10 TABLE ACCESS (BY INDEX ROWID) OF 'PRODUCTS' (Cost=1 Card=1 Bytes=197
Below are the real cardinalities obtained from SQL_TRACE/TKPROF. we can see the real cardinality coming out of PACKAGE_PRODUCTS table is 2725 and the estimated one is 1 as shown above. This is where the whole execution plan becomes "wrong".
2725 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 INDEX UNIQUE SCAN PK_PACKAGE_MARKETS (object id 423961)
1 TABLE ACCESS BY INDEX ROWID MARKETS
1 INDEX UNIQUE SCAN PK_MARKETS (object id 423924)
1 TABLE ACCESS BY INDEX ROWID PACKAGES
1 INDEX UNIQUE SCAN PK_PACKAGES (object id 423954)
1 TABLE ACCESS BY INDEX ROWID PACKAGE_CATEGORY
1 INDEX UNIQUE SCAN PK_PACKAGE_CATEGORY (object id 423956)
2725 TABLE ACCESS BY INDEX ROWID PACKAGE_PRODUCTS
2811 INDEX RANGE SCAN PK_PACKAGE_PRODUCTS (object id 423963)
What I did is to add hints to force Oracle choose hash join instead of nested loop join for the remaining 9 tables.
This execersie also arouse my insterest to understand how join cardinality is calculated, thus I re-read Chapter 10 of Jonanth Lewis's Cost-Based Oracle Fundamentals. I found what he says is true ( as always :-)). Below is a test case output for a two table join SQL that involves "transitive closure" - there is a filter on the join column:
SQL>select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
PL/SQL Release 9.2.0.8.0 - Production
CORE 9.2.0.8.0 Production
TNS for Solaris: Version 9.2.0.8.0 - Production
NLSRTL Version 9.2.0.8.0 - Production
SQL>
SQL>-- from the count we know the acutual join cardinality should be 86
SQL>select count(*) from packages where package_id='Y9995';
COUNT(*)
----------
1
SQL>select count(*) from package_products where package_id='Y9995';
COUNT(*)
----------
86
SQL>
SQL>-- JL mentioned "query_reworte_enabled" has impact on join predicate removal
SQL>-- prior to 10g in hist book "Cost-Based Oracle Fundamentals"
SQL>-- I confirmed it is true in our 9i db. I also tested with rewrite/nowrite hints
SQL>-- instead of session level "query_rewrite_enabled parameter". No impact
SQL>-- from the hints.
SQL>
SQL>
SQL>show parameter query_rewrite_enabled
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
query_rewrite_enabled string TRUE
SQL>alter session set query_rewrite_enabled=true;
Session altered.
SQL>
SQL>
SQL>-- query_rewrite_enabled=true
SQL>explain plan for
2 select *
3 from packages a, -- NROWS 2656
4 package_products b -- NROWS 284247
5 where a.package_id -- NDV 2656 PK
6 = b.package_id -- NDV 2610 NOT NULL
7 and a.package_id = 'Y9995'
8 ;
Explained.
SQL>select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 240 | 5 |
| 1 | NESTED LOOPS | | 1 | 240 | 5 |
| 2 | TABLE ACCESS BY INDEX ROWID| PACKAGES | 1 | 144 | 2 |
|* 3 | INDEX UNIQUE SCAN | PK_PACKAGES | 1 | | 1 |
| 4 | TABLE ACCESS BY INDEX ROWID| PACKAGE_PRODUCTS | 1 | 96 | 3 |
|* 5 | INDEX RANGE SCAN | PK_PACKAGE_PRODUCTS | 1 | | 2 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("A"."PACKAGE_ID"='Y9995')
5 - access("B"."PACKAGE_ID"='Y9995')
filter("A"."PACKAGE_ID"="B"."PACKAGE_ID")
Note: cpu costing is off
20 rows selected.
SQL>
SQL>alter session set query_rewrite_enabled=false;
Session altered.
SQL>
SQL>-- query_rewrite_enabled=false
SQL>explain plan for
2 select *
3 from packages a, -- NROWS 2656
4 package_products b -- NROWS 284247
5 where a.package_id -- NDV 2656 PK
6 = b.package_id -- NDV 2610 NOT NULL
7 and a.package_id = 'Y9995'
8 ;
Explained.
SQL>
SQL>select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 109 | 26160 | 52 |
| 1 | NESTED LOOPS | | 109 | 26160 | 52 |
| 2 | TABLE ACCESS BY INDEX ROWID| PACKAGES | 1 | 144 | 2 |
|* 3 | INDEX UNIQUE SCAN | PK_PACKAGES | 1 | | 1 |
| 4 | TABLE ACCESS BY INDEX ROWID| PACKAGE_PRODUCTS | 109 | 10464 | 50 |
|* 5 | INDEX RANGE SCAN | PK_PACKAGE_PRODUCTS | 109 | | 2 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("A"."PACKAGE_ID"='Y9995')
5 - access("B"."PACKAGE_ID"='Y9995')
Note: cpu costing is off
19 rows selected.
The test case showed two scenarios with same SQL and apparently obtained same execution plan, but
1. with query_rewrite_enabled=true, the estimated cardinality is 1
2. with query_rewrite_enabled=false, the estimated cardinality is 109
3. There are differences in the "Predicate Information" section of the execution plan. With query_rewrite_enabled=false, there is no join predicate
So how those cardinilities are calculated by Oracle? If we have two table join in the following form:
select * from t1, t2 where t1.c1 = t2.c2 and filter_predicates_of_t1 and filter_predicates_of_t2;
The basic join cardinality formula is as follows (from "Cost-Based Oracle Fundamentals" p266):
Join Selectivity = ((num_rows(t1) - num_nulls(t1.c1)) / num_rows(t1)) * ((num_rows(t2) - num_nulls(t2.c2)) / num_rows(t2)) / greater(num_distinct(t1.c1), num_distinct(t2.c2)) Join Cardinality = join selectivity * filtered cardinality(t1) * filtered cardinality(t2)
In our scenario 1:
Join Selectivity = (( 2656 -0) / 2656 ) * (( 284247 - 0) / 284247) / greater(2656, 2619) = 1/2656 Join Cardinaltiy = (1/2656) * (2656/2656) * ( 284247 /2610) = 0.004 ( rounded to 1)
In our senario 2, notice we don't have join predicate actually, Oracle eliminated the Join Selectivity part:
Join Cardinlity = (2656/2656) * ( 284247 /2610) = 109
My production 9i database has set query_rewrite_enabled=true. So if I wrote the SQL without the join condition as follows::
select *
from packages a, -- NROWS 2656
package_products b -- NROWS 284247
where
b.package_id = 'Y9995'
and a.package_id = 'Y9995'
;
I can get the correct cardinality as in the scenaro 2.
BTW, for example, if a=b, and a=5;
No comments:
Post a Comment