For example if we have (1,1,2) (1,1,3),(2,1,1), we should get (1,1,3).
Below is a test case that demonstrats two approaches: Approach 1 uses subquery resulting in 38 consistent gets; Approach 2 uses analytic approach resulting in 7 consistent gets
SQL>@test SQL>set echo on SQL>drop table t; Table dropped. SQL>create table t(x number, y number, z number); Table created. SQL> SQL>insert into t values(1,1,2); 1 row created. SQL>insert into t values(1,1,3); 1 row created. SQL>insert into t values(2,1,1); 1 row created. SQL> SQL> SQL>set autotrace on SQL>-- approach 1 SQL> SQL>select x, y, z 2 from t 3 where 4 exists ( select 1 from 5 ( select y, max(z) mz 6 from t 7 group by y 8 ) zz 9 where t.y=zz.y 10 and t.z=zz.mz 11 ) 12 ; X Y Z ---------- ---------- ---------- 1 1 3 1 row selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3359768323 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 117 | 5 (20)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | T | 3 | 117 | 2 (0)| 00:00:01 | |* 3 | FILTER | | | | | | | 4 | HASH GROUP BY | | 1 | 26 | 3 (34)| 00:00:01 | |* 5 | TABLE ACCESS FULL| T | 1 | 26 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter( EXISTS (SELECT /*+ */ 0 FROM "T" "T" WHERE "Y"=:B1 GROUP BY "Y",:B2 HAVING MAX("Z")=:B3)) 3 - filter(MAX("Z")=:B1) 5 - filter("Y"=:B1) Note ----- - dynamic sampling used for this statement Statistics ---------------------------------------------------------- 78 recursive calls 0 db block gets 38 consistent gets 0 physical reads 0 redo size 270 bytes sent via SQL*Net to client 247 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed SQL> SQL>-- approach 2 SQL> SQL> SQL>select x,y,z from 2 ( 3 select x,y, z, max(z) over (partition by y) mz 4 from t 5 ) 6 where z=mz 7 ; X Y Z ---------- ---------- ---------- 1 1 3 1 row selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2206009079 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 156 | 3 (34)| 00:00:01 | |* 1 | VIEW | | 3 | 156 | 3 (34)| 00:00:01 | | 2 | WINDOW SORT | | 3 | 117 | 3 (34)| 00:00:01 | | 3 | TABLE ACCESS FULL| T | 3 | 117 | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("Z"="MZ") Note ----- - dynamic sampling used for this statement Statistics ---------------------------------------------------------- 4 recursive calls 0 db block gets 7 consistent gets 0 physical reads 0 redo size 272 bytes sent via SQL*Net to client 247 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 1 rows processedApproach 1 is essetially used in a query in one of our production databases that causes performance issue. I suggested we re-write the query using approach 2. My test results of the production-like query show that bad query needs 2M consistent gets to get 6382 rows out while the re-written query needs only about 6k.
No comments:
Post a Comment