First of all, let's see where the cardinality estimation is wrong. Below are the estimated cardinality from each operations in the first 4-table join: PACKAGE_MARKETS -> MARKETS -> PACKAGE_CATEGORY -> PACKAGE_PRODUCTS
10 9 NESTED LOOPS (Cost=10 Card=1 Bytes=522)
11 10 NESTED LOOPS (Cost=9 Card=1 Bytes=325)
12 11 NESTED LOOPS (Cost=4 Card=1 Bytes=229)
13 12 NESTED LOOPS (Cost=3 Card=1 Bytes=224)
14 13 NESTED LOOPS (Cost=2 Card=1 Bytes=104)
15 14 INDEX (UNIQUE SCAN) OF 'PK_PACKAGE_MARKETS' (UNIQUE) (Cost=1 Card=1 Bytes=12)
16 14 TABLE ACCESS (BY INDEX ROWID) OF 'MARKETS' (Cost=1 Card=1 Bytes=92)
17 16 INDEX (UNIQUE SCAN) OF 'PK_MARKETS' (UNIQUE)
18 13 TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGES' (Cost=1 Card=1 Bytes=120)
19 18 INDEX (UNIQUE SCAN) OF 'PK_PACKAGES' (UNIQUE)
20 12 TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGE_CATEGORY' (Cost=1 Card=1 Bytes=5)
21 20 INDEX (UNIQUE SCAN) OF 'PK_PACKAGE_CATEGORY' (UNIQUE)
22 11 TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGE_PRODUCTS' (Cost=5 Card=1 Bytes=96)
23 22 INDEX (RANGE SCAN) OF 'PK_PACKAGE_PRODUCTS' (UNIQUE) (Cost=2 Card=6)
24 10 TABLE ACCESS (BY INDEX ROWID) OF 'PRODUCTS' (Cost=1 Card=1 Bytes=197
Below are the real cardinalities obtained from SQL_TRACE/TKPROF. we can see the real cardinality coming out of PACKAGE_PRODUCTS table is 2725 and the estimated one is 1 as shown above. This is where the whole execution plan becomes "wrong".
2725 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 NESTED LOOPS
1 INDEX UNIQUE SCAN PK_PACKAGE_MARKETS (object id 423961)
1 TABLE ACCESS BY INDEX ROWID MARKETS
1 INDEX UNIQUE SCAN PK_MARKETS (object id 423924)
1 TABLE ACCESS BY INDEX ROWID PACKAGES
1 INDEX UNIQUE SCAN PK_PACKAGES (object id 423954)
1 TABLE ACCESS BY INDEX ROWID PACKAGE_CATEGORY
1 INDEX UNIQUE SCAN PK_PACKAGE_CATEGORY (object id 423956)
2725 TABLE ACCESS BY INDEX ROWID PACKAGE_PRODUCTS
2811 INDEX RANGE SCAN PK_PACKAGE_PRODUCTS (object id 423963)
What I did is to add hints to force Oracle choose hash join instead of nested loop join for the remaining 9 tables.
This execersie also arouse my insterest to understand how join cardinality is calculated, thus I re-read Chapter 10 of Jonanth Lewis's Cost-Based Oracle Fundamentals. I found what he says is true ( as always :-)). Below is a test case output for a two table join SQL that involves "transitive closure" - there is a filter on the join column:
SQL>select * from v$version; BANNER ---------------------------------------------------------------- Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production PL/SQL Release 9.2.0.8.0 - Production CORE 9.2.0.8.0 Production TNS for Solaris: Version 9.2.0.8.0 - Production NLSRTL Version 9.2.0.8.0 - Production SQL> SQL>-- from the count we know the acutual join cardinality should be 86 SQL>select count(*) from packages where package_id='Y9995'; COUNT(*) ---------- 1 SQL>select count(*) from package_products where package_id='Y9995'; COUNT(*) ---------- 86 SQL> SQL>-- JL mentioned "query_reworte_enabled" has impact on join predicate removal SQL>-- prior to 10g in hist book "Cost-Based Oracle Fundamentals" SQL>-- I confirmed it is true in our 9i db. I also tested with rewrite/nowrite hints SQL>-- instead of session level "query_rewrite_enabled parameter". No impact SQL>-- from the hints. SQL> SQL> SQL>show parameter query_rewrite_enabled NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ query_rewrite_enabled string TRUE SQL>alter session set query_rewrite_enabled=true; Session altered. SQL> SQL> SQL>-- query_rewrite_enabled=true SQL>explain plan for 2 select * 3 from packages a, -- NROWS 2656 4 package_products b -- NROWS 284247 5 where a.package_id -- NDV 2656 PK 6 = b.package_id -- NDV 2610 NOT NULL 7 and a.package_id = 'Y9995' 8 ; Explained. SQL>select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 240 | 5 | | 1 | NESTED LOOPS | | 1 | 240 | 5 | | 2 | TABLE ACCESS BY INDEX ROWID| PACKAGES | 1 | 144 | 2 | |* 3 | INDEX UNIQUE SCAN | PK_PACKAGES | 1 | | 1 | | 4 | TABLE ACCESS BY INDEX ROWID| PACKAGE_PRODUCTS | 1 | 96 | 3 | |* 5 | INDEX RANGE SCAN | PK_PACKAGE_PRODUCTS | 1 | | 2 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("A"."PACKAGE_ID"='Y9995') 5 - access("B"."PACKAGE_ID"='Y9995') filter("A"."PACKAGE_ID"="B"."PACKAGE_ID") Note: cpu costing is off 20 rows selected. SQL> SQL>alter session set query_rewrite_enabled=false; Session altered. SQL> SQL>-- query_rewrite_enabled=false SQL>explain plan for 2 select * 3 from packages a, -- NROWS 2656 4 package_products b -- NROWS 284247 5 where a.package_id -- NDV 2656 PK 6 = b.package_id -- NDV 2610 NOT NULL 7 and a.package_id = 'Y9995' 8 ; Explained. SQL> SQL>select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------- ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 109 | 26160 | 52 | | 1 | NESTED LOOPS | | 109 | 26160 | 52 | | 2 | TABLE ACCESS BY INDEX ROWID| PACKAGES | 1 | 144 | 2 | |* 3 | INDEX UNIQUE SCAN | PK_PACKAGES | 1 | | 1 | | 4 | TABLE ACCESS BY INDEX ROWID| PACKAGE_PRODUCTS | 109 | 10464 | 50 | |* 5 | INDEX RANGE SCAN | PK_PACKAGE_PRODUCTS | 109 | | 2 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("A"."PACKAGE_ID"='Y9995') 5 - access("B"."PACKAGE_ID"='Y9995') Note: cpu costing is off 19 rows selected.
The test case showed two scenarios with same SQL and apparently obtained same execution plan, but
1. with query_rewrite_enabled=true, the estimated cardinality is 1
2. with query_rewrite_enabled=false, the estimated cardinality is 109
3. There are differences in the "Predicate Information" section of the execution plan. With query_rewrite_enabled=false, there is no join predicate
So how those cardinilities are calculated by Oracle? If we have two table join in the following form:
select * from t1, t2 where t1.c1 = t2.c2 and filter_predicates_of_t1 and filter_predicates_of_t2;
The basic join cardinality formula is as follows (from "Cost-Based Oracle Fundamentals" p266):
Join Selectivity = ((num_rows(t1) - num_nulls(t1.c1)) / num_rows(t1)) * ((num_rows(t2) - num_nulls(t2.c2)) / num_rows(t2)) / greater(num_distinct(t1.c1), num_distinct(t2.c2)) Join Cardinality = join selectivity * filtered cardinality(t1) * filtered cardinality(t2)
In our scenario 1:
Join Selectivity = (( 2656 -0) / 2656 ) * (( 284247 - 0) / 284247) / greater(2656, 2619) = 1/2656 Join Cardinaltiy = (1/2656) * (2656/2656) * ( 284247 /2610) = 0.004 ( rounded to 1)
In our senario 2, notice we don't have join predicate actually, Oracle eliminated the Join Selectivity part:
Join Cardinlity = (2656/2656) * ( 284247 /2610) = 109
My production 9i database has set query_rewrite_enabled=true. So if I wrote the SQL without the join condition as follows::
select * from packages a, -- NROWS 2656 package_products b -- NROWS 284247 where b.package_id = 'Y9995' and a.package_id = 'Y9995' ;
I can get the correct cardinality as in the scenaro 2.
BTW, for example, if a=b, and a=5;