Wednesday, May 26, 2010

Tuning by cardinality feedback and understanding of join cardinality and transitive closure

What comes out of my recent tuning experience of a 13-table join query is my deeper understanding of "Tuning by Cardinality Feedback", join cardinality and transitive closure. The problem for that 13-table join SQL is that Oracle predictes the join cardinality to be 1 always so it chooses nested loop join, whereas the hash join is able to bring the consistent gets from 400K to 6K.

First of all, let's see where the cardinality estimation is wrong. Below are the estimated cardinality from each operations in the first 4-table join: PACKAGE_MARKETS -> MARKETS -> PACKAGE_CATEGORY -> PACKAGE_PRODUCTS

10    9       NESTED LOOPS (Cost=10 Card=1 Bytes=522)
  11   10         NESTED LOOPS (Cost=9 Card=1 Bytes=325)
  12   11           NESTED LOOPS (Cost=4 Card=1 Bytes=229)
  13   12             NESTED LOOPS (Cost=3 Card=1 Bytes=224)
  14   13               NESTED LOOPS (Cost=2 Card=1 Bytes=104)
  15   14                 INDEX (UNIQUE SCAN) OF 'PK_PACKAGE_MARKETS' (UNIQUE) (Cost=1 Card=1 Bytes=12)
  16   14                 TABLE ACCESS (BY INDEX ROWID) OF 'MARKETS' (Cost=1 Card=1 Bytes=92)
  17   16                   INDEX (UNIQUE SCAN) OF 'PK_MARKETS' (UNIQUE)
  18   13               TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGES' (Cost=1 Card=1 Bytes=120)
  19   18                 INDEX (UNIQUE SCAN) OF 'PK_PACKAGES' (UNIQUE)
  20   12             TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGE_CATEGORY' (Cost=1 Card=1 Bytes=5)
  21   20               INDEX (UNIQUE SCAN) OF 'PK_PACKAGE_CATEGORY' (UNIQUE)
  22   11           TABLE ACCESS (BY INDEX ROWID) OF 'PACKAGE_PRODUCTS' (Cost=5 Card=1 Bytes=96)
  23   22             INDEX (RANGE SCAN) OF 'PK_PACKAGE_PRODUCTS' (UNIQUE) (Cost=2 Card=6)
  24   10         TABLE ACCESS (BY INDEX ROWID) OF 'PRODUCTS' (Cost=1 Card=1 Bytes=197


Below are the real cardinalities obtained from SQL_TRACE/TKPROF. we can see the real cardinality coming out of PACKAGE_PRODUCTS table is 2725 and the estimated one is 1 as shown above. This is where the whole execution plan becomes "wrong".

2725         NESTED LOOPS
      1          NESTED LOOPS
      1           NESTED LOOPS
      1            NESTED LOOPS
      1             INDEX UNIQUE SCAN PK_PACKAGE_MARKETS (object id 423961)
      1             TABLE ACCESS BY INDEX ROWID MARKETS
      1              INDEX UNIQUE SCAN PK_MARKETS (object id 423924)
      1            TABLE ACCESS BY INDEX ROWID PACKAGES
      1             INDEX UNIQUE SCAN PK_PACKAGES (object id 423954)
      1           TABLE ACCESS BY INDEX ROWID PACKAGE_CATEGORY
      1            INDEX UNIQUE SCAN PK_PACKAGE_CATEGORY (object id 423956)
   2725          TABLE ACCESS BY INDEX ROWID PACKAGE_PRODUCTS
   2811           INDEX RANGE SCAN PK_PACKAGE_PRODUCTS (object id 423963)

What I did is to add hints to force Oracle choose hash join instead of nested loop join for the remaining 9 tables.

This execersie also arouse my insterest to understand how join cardinality is calculated, thus I re-read Chapter 10 of Jonanth Lewis's Cost-Based Oracle Fundamentals. I found what he says is true ( as always :-)). Below is a test case output for a two table join SQL that involves "transitive closure" - there is a filter on the join column:
SQL>select * from v$version;

BANNER
----------------------------------------------------------------
Oracle9i Enterprise Edition Release 9.2.0.8.0 - 64bit Production
PL/SQL Release 9.2.0.8.0 - Production
CORE 9.2.0.8.0 Production
TNS for Solaris: Version 9.2.0.8.0 - Production
NLSRTL Version 9.2.0.8.0 - Production

SQL>
SQL>-- from the count we know the acutual join cardinality should be 86
SQL>select count(*) from packages where package_id='Y9995';

  COUNT(*)
----------
         1

SQL>select count(*) from package_products where package_id='Y9995';

  COUNT(*)
----------
        86

SQL>
SQL>-- JL mentioned "query_reworte_enabled" has impact on join predicate removal
SQL>-- prior to 10g in hist book "Cost-Based Oracle Fundamentals"
SQL>-- I confirmed it is true in our 9i db. I also tested with rewrite/nowrite hints
SQL>-- instead of session level "query_rewrite_enabled parameter". No impact
SQL>-- from the hints.
SQL>
SQL>
SQL>show parameter query_rewrite_enabled

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
query_rewrite_enabled                string      TRUE
SQL>alter session set query_rewrite_enabled=true;

Session altered.

SQL>
SQL>
SQL>-- query_rewrite_enabled=true
SQL>explain plan for
  2  select  *
  3    from  packages a,   -- NROWS 2656
  4        package_products b   -- NROWS 284247
  5   where a.package_id   -- NDV  2656 PK
  6       = b.package_id   -- NDV  2610 NOT NULL
  7   and a.package_id = 'Y9995'
  8  ;

Explained.

SQL>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------
| Id  | Operation                    |  Name                | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                      |     1 |   240 |     5 |
|   1 |  NESTED LOOPS                |                      |     1 |   240 |     5 |
|   2 |   TABLE ACCESS BY INDEX ROWID| PACKAGES             |     1 |   144 |     2 |
|*  3 |    INDEX UNIQUE SCAN         | PK_PACKAGES          |     1 |       |     1 |
|   4 |   TABLE ACCESS BY INDEX ROWID| PACKAGE_PRODUCTS     |     1 |    96 |     3 |
|*  5 |    INDEX RANGE SCAN          | PK_PACKAGE_PRODUCTS  |     1 |       |     2 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."PACKAGE_ID"='Y9995')
   5 - access("B"."PACKAGE_ID"='Y9995')
       filter("A"."PACKAGE_ID"="B"."PACKAGE_ID")

Note: cpu costing is off

20 rows selected.

SQL>
SQL>alter session set query_rewrite_enabled=false;

Session altered.

SQL>
SQL>-- query_rewrite_enabled=false
SQL>explain plan for
  2  select  *
  3    from  packages a,   -- NROWS 2656
  4        package_products b   -- NROWS 284247
  5   where a.package_id   -- NDV  2656 PK
  6       = b.package_id   -- NDV  2610 NOT NULL
  7   and a.package_id = 'Y9995'
  8  ;

Explained.

SQL>
SQL>select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------

-------------------------------------------------------------------------------------
| Id  | Operation                    |  Name                | Rows  | Bytes | Cost  |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                      |   109 | 26160 |    52 |
|   1 |  NESTED LOOPS                |                      |   109 | 26160 |    52 |
|   2 |   TABLE ACCESS BY INDEX ROWID| PACKAGES             |     1 |   144 |     2 |
|*  3 |    INDEX UNIQUE SCAN         | PK_PACKAGES          |     1 |       |     1 |
|   4 |   TABLE ACCESS BY INDEX ROWID| PACKAGE_PRODUCTS     |   109 | 10464 |    50 |
|*  5 |    INDEX RANGE SCAN          | PK_PACKAGE_PRODUCTS  |   109 |       |     2 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("A"."PACKAGE_ID"='Y9995')
   5 - access("B"."PACKAGE_ID"='Y9995')

Note: cpu costing is off

19 rows selected.


The test case showed two scenarios with same SQL and apparently obtained same execution plan, but
1. with query_rewrite_enabled=true, the estimated cardinality is 1
2. with query_rewrite_enabled=false, the estimated cardinality is 109
3. There are differences in the "Predicate Information" section of the execution plan. With query_rewrite_enabled=false, there is no join predicate

So how those cardinilities are calculated by Oracle? If we have two table join in the following form:

select *
  from t1, t2
 where t1.c1 = t2.c2
   and filter_predicates_of_t1
   and filter_predicates_of_t2;

The basic join cardinality formula is as follows (from "Cost-Based Oracle Fundamentals" p266):

Join Selectivity =
  ((num_rows(t1) - num_nulls(t1.c1)) / num_rows(t1)) *
  ((num_rows(t2) - num_nulls(t2.c2)) / num_rows(t2)) /
  greater(num_distinct(t1.c1), num_distinct(t2.c2))

Join Cardinality =
   join selectivity *
   filtered cardinality(t1) * filtered cardinality(t2)


In our scenario 1:

Join Selectivity = 
  (( 2656 -0) / 2656 ) *
  (( 284247 - 0) / 284247) /
  greater(2656, 2619)
  = 1/2656

Join Cardinaltiy =
   (1/2656) *
   (2656/2656) * ( 284247 /2610) 
   = 0.004 ( rounded to 1)

In our senario 2, notice we don't have join predicate actually, Oracle eliminated the Join Selectivity part:

Join Cardinlity =
   (2656/2656) * ( 284247 /2610)
   = 109


My production 9i database has set query_rewrite_enabled=true. So if I wrote the SQL without the join condition as follows::

select  *
 from  packages a,   -- NROWS 2656
       package_products b   -- NROWS 284247
where
      b.package_id = 'Y9995'
  and a.package_id = 'Y9995'
;


I can get the correct cardinality as in the scenaro 2.

BTW, for example, if a=b, and a=5; then b=5, this is called "transitive closure".

No comments: