Thursday, June 12, 2014

Using analytic function can improve query performance greatly

Let's say we have a table: t (x,y,z), how to output the rows of the table with max value of z based on y?

For example if we have (1,1,2) (1,1,3),(2,1,1), we should get (1,1,3).

Below is a test case that demonstrats two approaches: Approach 1 uses subquery resulting in 38 consistent gets; Approach 2 uses analytic approach resulting in 7 consistent gets
SQL>@test
SQL>set echo on
SQL>drop table t;

Table dropped.

SQL>create table t(x number, y number, z number);

Table created.

SQL>
SQL>insert into t values(1,1,2);

1 row created.

SQL>insert into t values(1,1,3);

1 row created.

SQL>insert into t values(2,1,1);

1 row created.

SQL>
SQL>
SQL>set autotrace on
SQL>-- approach 1
SQL>
SQL>select x, y, z
  2  from t
  3  where
  4   exists ( select 1 from
  5            ( select y, max(z) mz
  6              from t
  7               group by y
  8             ) zz
  9             where t.y=zz.y
 10               and t.z=zz.mz
 11          )
 12  ;

         X          Y          Z
---------- ---------- ----------
         1          1          3

1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 3359768323

-----------------------------------------------------------------------------
| Id  | Operation            | Name | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |      |     3 |   117 |     5  (20)| 00:00:01 |
|*  1 |  FILTER              |      |       |       |            |          |
|   2 |   TABLE ACCESS FULL  | T    |     3 |   117 |     2   (0)| 00:00:01 |
|*  3 |   FILTER             |      |       |       |            |          |
|   4 |    HASH GROUP BY     |      |     1 |    26 |     3  (34)| 00:00:01 |
|*  5 |     TABLE ACCESS FULL| T    |     1 |    26 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT /*+ */ 0 FROM "T" "T" WHERE "Y"=:B1 GROUP
              BY "Y",:B2 HAVING MAX("Z")=:B3))
   3 - filter(MAX("Z")=:B1)
   5 - filter("Y"=:B1)

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
         78  recursive calls
          0  db block gets
         38  consistent gets
          0  physical reads
          0  redo size
        270  bytes sent via SQL*Net to client
        247  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL>
SQL>-- approach 2
SQL>
SQL>
SQL>select x,y,z from
  2  (
  3     select x,y, z, max(z) over (partition by y) mz
  4     from t
  5  )
  6  where z=mz
  7  ;

         X          Y          Z
---------- ---------- ----------
         1          1          3

1 row selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 2206009079

----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |     3 |   156 |     3  (34)| 00:00:01 |
|*  1 |  VIEW               |      |     3 |   156 |     3  (34)| 00:00:01 |
|   2 |   WINDOW SORT       |      |     3 |   117 |     3  (34)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| T    |     3 |   117 |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("Z"="MZ")

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------------------------------------------
          4  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        272  bytes sent via SQL*Net to client
        247  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

Approach 1 is essetially used in a query in one of our production databases that causes performance issue. I suggested we re-write the query using approach 2. My test results of the production-like query show that bad query needs 2M consistent gets to get 6382 rows out while the re-written query needs only about 6k.

Monday, June 02, 2014

Bitmap plans can estimate wrong cardinality

There is a problem query involving 3-table join in one of our prodcutin databases. The difference between the good plan and bad plan is the join order in a hash join execution plan. The reason for Oracle to pick up the bad join order is that Oracle CBO chooses the bitmap plan and this plan gives very wrong cardinality. What I did is to set "_b_tree_bitmap_plans"=false at system level, which allowing the CBO to generate the good plan. Then I created a sql plan baseline for that query from the good plan ( using sqlt/utl/coe_load_sql_baseline.sql script). After that, considering that there may be some queries that benfit from the bitmap access path, I changed back "_b_tree_bitmap_plans" to be true.

Below tests demonstrated how wrong the cardinaltiy estimeated by the bitmap plan could be.

  • in the bitmap plan,  the number of rows estimated is 578K
SQL> select * from BOD_OTHR_SERIAL_ACTIVITY_I8400 I8400
  2  where
  3  (I8400.DVTQ_ACCT_NUM = 'A145779917' OR (
  4      I8400.UJTJPM_CUSTOMER_ID = '' AND I8400.UJTJPM_ACCOUNT_ID = ''
  5  ))
  6  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 972526864

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name                           | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |                                |   578K|   170M|   161K (11)| 00:05:17 |
|   1 |  TABLE ACCESS BY INDEX ROWID     | BOD_OTHR_SERIAL_ACTIVITY_I8400 |   578K|   170M|   161K (11)| 00:05:17 |
|   2 |   BITMAP CONVERSION TO ROWIDS    |                                |       |       |            |          |
|   3 |    BITMAP OR                     |                                |       |       |            |          |
|   4 |     BITMAP CONVERSION FROM ROWIDS|                                |       |       |            |          |
|*  5 |      INDEX RANGE SCAN            | IX_I8400_04                    |       |       |     1   (0)| 00:00:01 |
|   6 |     BITMAP CONVERSION FROM ROWIDS|                                |       |       |            |          |
|*  7 |      INDEX RANGE SCAN            | IX_I8400_07                    |       |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("I8400"."UJTJPM_CUSTOMER_ID"='' AND "I8400"."UJTJPM_ACCOUNT_ID"='')
   7 - access("I8400"."DVTQ_ACCT_NUM"='A145779917')


  • -- disable the bitmap plan
SQL> alter session set "_b_tree_bitmap_plans"=false;

Session altered.


  • -- Now the estimated rows is 2
SQL> select * from BOD_OTHR_SERIAL_ACTIVITY_I8400 I8400
  2  where
  3  (I8400.DVTQ_ACCT_NUM = 'A145779917' OR (
  4      I8400.UJTJPM_CUSTOMER_ID = '' AND I8400.UJTJPM_ACCOUNT_ID = ''
  5  ))
  6  ;

Execution Plan
----------------------------------------------------------
Plan hash value: 7978316

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                           | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                                |     2 |   618 |     9   (0)| 00:00:01 |
|   1 |  CONCATENATION               |                                |       |       |            |       |
|   2 |   TABLE ACCESS BY INDEX ROWID| BOD_OTHR_SERIAL_ACTIVITY_I8400 |     1 |   309 |     6   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | IX_I8400_07                    |     1 |       |     4  (25)| 00:00:01 |
|*  4 |   TABLE ACCESS BY INDEX ROWID| BOD_OTHR_SERIAL_ACTIVITY_I8400 |     1 |   309 |     3   (0)| 00:00:01 |
|*  5 |    INDEX RANGE SCAN          | IX_I8400_04                    |     1 |       |     2  (50)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("I8400"."DVTQ_ACCT_NUM"='A145779917')
   4 - filter(LNNVL("I8400"."DVTQ_ACCT_NUM"='A145779917'))
   5 - access("I8400"."UJTJPM_CUSTOMER_ID"='' AND "I8400"."UJTJPM_ACCOUNT_ID"='')


  •  the acutall number of row is 1:
SQL> select count(*)  from BOD_OTHR_SERIAL_ACTIVITY_I8400 I8400
  2      where
  3      (I8400.CUST_ACCT_NUM = 'A145779917' OR (
  4          I8400.UJTJPM_CUSTOMER_ID = '' AND I8400.UJTJPM_ACCOUNT_ID = ''
  5      ))
  6      ;

  COUNT(*)
----------
         1

1 row selected.