Yesterday I was requested to check a "scrub" job that application team was running in a test environment, which would go into production at night. This "scrub" job was essentially to update a column of three tables using the value from the driver table. The developer provided the actual code he used as follows: ( The table names and colunm names were modified)
DECLARE counter NUMBER(10):= 0; CURSOR upd_tabs IS SELECT x, y FROM D; BEGIN FOR rec IN upd_tabs LOOP UPDATE T1 SET y= rec.y WHERE x=rec.x; UPDATE T2 SET y= rec.y WHERE z=rec.x; UPDATE T3 SET y= rec.y WHERE z=rec.x; counter:=counter+1; IF MOD(counter,1000) = 0 )THEN INSERT INTO update_log (COMMIT_TIME, COMMIT_COUNTER) VALUES (SYSDATE, counter); COMMIT; END IF; IF upd_tabs%NOTFOUND THEN EXIT; END IF; END LOOP; COMMIT; END;
Maybe several days ago , I came across a thread on AskTom forum, which talks about update the first table with second table. At that time I did not study throughly. I re-visited that thread, and I think Tom' s approach - update a join, can be used in our case. I checked the table involed on production database, found that:
T1 (x, y ... ) - x is primary key 3M
T2 (z, y ... ) - There is an index on z 14M
T3 (z, y ... ) - Primary key on z and another column, z is the leading column 9M
D (x, y ... ) - no primary key constraint, x is not null, but most likly x is unique also
If we can enfore the primary key constraint on D (x), we can write the following sqls:
update ( select T1.y T1_y, D.y D_y from T1, D where T1.x = D.x ) set T1_y = D_y /
As a rule of thumb, if problem can be solved using only SQL statment, it is always better than using PL/SQL. If we take a close check on the alogrithm of the PL/SQL code above, we can find that it is no difference than the execution plan shown in the following test case: ( This test case is just repeating what Tom Kyte did in the AskTom forum thread )
denis@DB10G> denis@DB10G> create table t1 2 ( x int constraint t1_pk primary key, 3 y int ); Table created. denis@DB10G> denis@DB10G> create table t2 2 ( x int constraint t2_pk primary key, 3 y int ); Table created. denis@DB10G> denis@DB10G> insert into t1 values (1,1); 1 row created. denis@DB10G> insert into t1 values (2,1); 1 row created. denis@DB10G> insert into t1 values (3,1); 1 row created. denis@DB10G> denis@DB10G> insert into t2 values (2,2); 1 row created. denis@DB10G> insert into t2 values (3,2); 1 row created. denis@DB10G> denis@DB10G> denis@DB10G> -- update a join denis@DB10G> set autotrace on explain denis@DB10G> update 2 ( select /*+ USE_NL(t1) INDEX (t1 t1_pk ) */ 3 t1.y t1_y, t2.y t2_y 4 from t1, t2 5 where t1.x = t2.x ) 6 set t1_y = t2_y 7 / 2 rows updated. Execution Plan ---------------------------------------------------------- Plan hash value: 2040170521 --------------------------------------------------------------------------------------- Id Operation Name Rows Bytes Cost (%CPU) Time --------------------------------------------------------------------------------------- 0 UPDATE STATEMENT 2 104 5 (0) 00:00:01 1 UPDATE T1 2 NESTED LOOPS 2 104 5 (0) 00:00:01 3 TABLE ACCESS FULL T2 2 52 3 (0) 00:00:01 4 TABLE ACCESS BY INDEX ROWID T1 1 26 1 (0) 00:00:01 * 5 INDEX UNIQUE SCAN T1_PK 1 0 (0) 00:00:01 --------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("T1"."X"="T2"."X") Note ----- - dynamic sampling used for this statement denis@DB10G> denis@DB10G> set autotrace off; denis@DB10G> denis@DB10G>> select * from t1; X Y ---------- ---------- 1 1 2 2 3 2 denis@DB10G> denis@DB10G> -- for larger tables the hint may not be necessary, Oracle can denis@DB10G> -- pick up the right plan denis@DB10G> denis@DB10G> insert into t1 select rownum + 5 , 1 2 from all_objects 3 / 49841 rows created. denis@DB10G> denis@DB10G> analyze table t1 compute statistics; Table analyzed. denis@DB10G> denis@DB10G> analyze table t2 compute statistics; Table analyzed. denis@DB10G> denis@DB10G> set autotrace on explain denis@DB10G> update 2 ( select t1.y t1_y, t2.y t2_y 3 from t1, t2 4 where t1.x = t2.x ) 5 set t1_y = t2_y 6 / 2 rows updated. Execution Plan ---------------------------------------------------------- Plan hash value: 2040170521 --------------------------------------------------------------------------------------- Id Operation Name Rows Bytes Cost (%CPU) Time --------------------------------------------------------------------------------------- 0 UPDATE STATEMENT 2 20 5 (0) 00:00:01 1 UPDATE T1 2 NESTED LOOPS 2 20 5 (0) 00:00:01 3 TABLE ACCESS FULL T2 2 8 3 (0) 00:00:01 4 TABLE ACCESS BY INDEX ROWID T1 1 6 1 (0) 00:00:01 * 5 INDEX UNIQUE SCAN T1_PK 1 0 (0) 00:00:01 --------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("T1"."X"="T2"."X") denis@DB10G> set autotrace off denis@DB10G>
Per Tom:
It should be NOTED that in order for the update of a join to work, T2 in the above example must have a primary or unique key on it. If T2 does not, you'll not be able to update the join. The optimizer needs to know that T1 will be joined to AT MOST 1 row in T2 and the way it does this is by seeing that a primary key or unique constraint exists on the join conditions