Sunday, August 16, 2009

LIKE does not like bind variable

Today I learned this the hard way. After the release, a query was timing out frequently, and I spent quite a lot of time figuring out what was going on. When checking the execution plans in memory, there were always two: a good one with a nested loop join and a bad one with a hash join. I tried adding a hint to force the nested loop join, but it did not work.

The problem is that the query has a predicate such as A.ORDER_ID LIKE 'ICN000104775%', but it is fired from the application server using a bind variable.

When checking why there were different versions of the execution plan, I found the following in v$sql_shared_cursor on one occasion.

ADDRESS          KGLHDPAR         U S O O S L S E B P I S T A B D L T R I I R L I O S M U T N F L
---------------- ---------------- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
000000054EF09208 0000000516791558 N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N
0000000532A02020 0000000516791558 N N N N N N N N N N N N N N N N N N N N N N N N N N N Y N N N N
00000005327D3808 0000000516791558 N N N N N N N N N N N N N N N N N N N N N N N N N N N Y N N N N




The column showing Y is called USER_BIND_PEEK_MISMATCH.

The following test demonstrates that the LIKE operator does not get along with bind variables:

1. In the case of a literal, the execution plan looks perfect:
SQL> SELECT
2     a.ACCOUNT_ID
3       ,a.IO_ID
4       ,a.ORDER_ID
5       ,a.IS_MASTER
6       ,a.IS_PQ
7     FROM
8        TAB_A a,
9        TAB_B b
10     WHERE
11        a.ORDER_ID LIKE  'ISPT143730263%'
12     AND a.io_id = b.io_id
13     AND (a.IS_VALID='Y' OR (a.IS_VALID='N' AND b.ORDER_STATUS_ID=6));


Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=7 Card=1 Bytes=45)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'TAB_B' (Cost=3 Card=1 Bytes=10)
2    1     NESTED LOOPS (Cost=7 Card=1 Bytes=45)
3    2       TABLE ACCESS (BY INDEX ROWID) OF 'TAB_A' (Cost=4 Card=1 Bytes=35)
4    3         INDEX (RANGE SCAN) OF 'TAB_A_IX5' (UNIQUE) (Cost=3 Card=1)
5    2       INDEX (RANGE SCAN) OF 'TAB_B_IX1' (NON-UNIQUE) (Cost=2 Card=1)




2. In the case of a bind variable, the cardinality is totally wrong:
SQL>
SQL>
SQL> @s1
SQL> variable v varchar2(16);
SQL> exec :v :='ISPT143730263%';

PL/SQL procedure successfully completed.

SQL>
SQL> SELECT
2     a.ACCOUNT_ID
3       ,a.IO_ID
4       ,a.ORDER_ID
5       ,a.IS_MASTER
6       ,a.IS_PQ
7     FROM
8        TAB_A a,
9        TAB_B b
10     WHERE
11        a.ORDER_ID LIKE  :v
12       AND a.io_id = b.io_id
13       AND (a.IS_VALID='Y' OR (a.IS_VALID='N' AND b.ORDER_STATUS_ID=6));


Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=CHOOSE (Cost=227422 Card=833902 Bytes=37525590)
1    0   HASH JOIN (Cost=227422 Card=833902 Bytes=37525590)
2    1     TABLE ACCESS (BY INDEX ROWID) OF 'TAB_A' (Cost=45900 Card=1516186 Bytes=53066510)
3    2       INDEX (RANGE SCAN) OF 'TAB_A_IX5' (UNIQUE) (Cost=1064 Card=272913)
4    1     TABLE ACCESS (FULL) OF 'TAB_B' (Cost=170305 Card=30317333 Bytes=303173330)




We have a function-based index on substr(order_id, 1, 13), so the fix is to use substr(order_id, 1, 13) = :v as the predicate.

Ed. Sep 23, 2009 -

I came across the paper "A LOOK UNDER THE HOOD OF CBO: THE 10053 EVENT" by Wolfgang Breitling and learned that the filter factor (selectivity) of the LIKE operator is as follows:

predicate               filter factor
-------------           -------------
c1 like  value          c1.density
c1 like :b1             0.05 OR c1.density

In the case of LIKE with a bind variable, the filter factor depends on the hidden parameter _LIKE_WITH_BIND_AS_EQUALITY. When it is false (the default), 0.05 is used; otherwise c1.density is used. This is why I saw the unrealistically high cardinality in my case above.
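To make that concrete, here is a small back-of-the-envelope sketch (in Python, with made-up row counts and density, not the real tables from this post) of how the two filter factors translate into cardinality estimates:

```python
# Sketch of the CBO cardinality estimate for a LIKE predicate,
# following Breitling's filter-factor table above.
# num_rows and density below are illustration values only.

def like_cardinality(num_rows, density, bind=False,
                     like_with_bind_as_equality=False):
    """Estimated rows returned by `col LIKE <something>`.

    With a literal, the CBO can use the column density; with a bind
    variable it falls back to the fixed 0.05 guess unless the hidden
    parameter _LIKE_WITH_BIND_AS_EQUALITY is true.
    """
    if bind and not like_with_bind_as_equality:
        selectivity = 0.05
    else:
        selectivity = density
    return round(num_rows * selectivity)

num_rows = 30_000_000      # hypothetical table size
density = 1 / 15_000_000   # hypothetical: order_id prefix nearly unique

print(like_cardinality(num_rows, density, bind=False))  # literal: 2 rows
print(like_cardinality(num_rows, density, bind=True))   # bind: 1500000 rows
```

The three orders of magnitude between the two estimates is exactly the kind of gap that flips a nested loop plan into a hash join.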

Update: Feb 20, 2011

Today I was caught by the same issue again in the same database (only now we are on 10g, not 9i). Last night during a code release, a query was changed:

--------------
before release
--------------

SELECT 
a.PSEFS_ID
, b.PSEFS_STATUS_ID
, TO_CHAR(b.PSEFS_CRFBTF_DATE, 'MM/DD/YYYY HH24:MI:SS') AS "PSEFS_CRFBTF_DATE"
, b.PRESALE_ACTIVITY_ID
, b.PSEFS_TYPE_ID
, b.PSEFS_SUB_TYPE_ID
, b.WORKWITH
, b.WORKWITH_GROUP_ID 
FROM PSEFS_MBSTFR a
   , PSEFS_DFTBJLS b 
WHERE substr(a.PSEFS_ID,1,13) = substr(:b,1,13) 
AND a.INT_SUPP_PSEFS_ID = 0 
AND a.INT_PSEFS_ID = b.INT_PSEFS_ID;

---------------
After release
--------------

SELECT 
  a.PSEFS_ID
, b.PSEFS_STATUS_ID
, TO_CHAR(b.PSEFS_CRFBTF_DATE, 'MM/DD/YYYY HH24:MI:SS') AS "PSEFS_CRFBTF_DATE"
, b.PRESALE_ACTIVITY_ID
, b.PSEFS_TYPE_ID
, b.PSEFS_SUB_TYPE_ID
, b.WORKWITH
, b.WORKWITH_GROUP_ID  
FROM PSEFS_MBSTFR a
   , PSEFS_DFTBJLS b  
WHERE a.PSEFS_ID like substr(:b,1,13)||'%' 
 AND a.INT_SUPP_PSEFS_ID = 0
 AND a.INT_PSEFS_ID = b.INT_PSEFS_ID
 ORDER BY a.INT_BASE_PSEFS_ID,a.IS_PQ;


So the predicate changed from substr(a.PSEFS_ID,1,13) = substr(:b,1,13) to a.PSEFS_ID like substr(:b,1,13)||'%'. Guess what? The new query used a hash join and timed out. The 0.05 filter factor makes the CBO think the predicate (a.PSEFS_ID like substr(:b,1,13)||'%') will return about 580K rows, so a hash join looks like the best choice :-(. Fortunately, we are on 10g now, so I had the option of using a SQL profile to freeze the good plan without a code change. The predicate change makes no sense in this case anyway, so I have asked the dev team to revert it in the next available maintenance window.
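Working the 0.05 guess backwards gives a feel for the numbers (the 11.6M table size below is inferred for illustration only; the post does not state the actual size of PSEFS_MBSTFR):

```python
# If the CBO's estimate for `a.PSEFS_ID LIKE substr(:b,1,13)||'%'`
# is ~580K rows, and LIKE with a bind variable always gets a 0.05
# filter factor, the table size implied by that estimate is:

FILTER_FACTOR = 0.05          # fixed guess for LIKE with a bind variable

estimated_rows = 580_000
implied_table_rows = estimated_rows / FILTER_FACTOR
print(f"implied table size: {implied_table_rows:,.0f} rows")
```

No wonder the optimizer reached for a hash join: against an estimate of hundreds of thousands of rows, an indexed nested loop looks hopeless.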

Friday, August 14, 2009

Misconception about direct-path insert

I had a misconception about direct-path insert: I thought it would generate minimal redo by design. Actually, direct-path insert can be performed in both logging (the default) and nologging mode. The Oracle documentation is very clear about this.

Now it comes to how to perform a direct-path insert with nologging. I thought it would work by just adding 'nologging' after the table name in the SQL statement, similar to CTAS. That turned out not to be true.

Here was my test: insert 1 million rows into a table and measure the redo generated in each case.

Case 1: Conventional insert shows a redo size of about 70M


DB9i> @@mystat "redo size"
DB9i> set echo off

NAME                                VALUE
------------------------------ ----------
redo size                           63004

Elapsed: 00:00:03.06
DB9i>
DB9i> insert into t select * from big_table a where rownum <= 1000000;

1000000 rows created.

Elapsed: 00:00:57.00
DB9i> commit;

Commit complete.

Elapsed: 00:00:05.43
DB9i>
DB9i> @@mystat2
DB9i> set echo off

NAME                                    V               DIFF
------------------------------ ---------- ------------------
redo size                        71015272         70,952,268



 


Case 2: Direct-path insert generates the same amount of redo



DB9i> @@mystat "redo size"
DB9i> set echo off

NAME                                VALUE
------------------------------ ----------
redo size                       142548540

Elapsed: 00:00:06.04
DB9i>
DB9i> insert /*+append */ into t select * from big_table a where rownum <= 1000000;

1000000 rows created.

Elapsed: 00:00:17.10
DB9i> commit;

Commit complete.

Elapsed: 00:00:06.37
DB9i>
DB9i> @@mystat2
DB9i> set echo off

NAME                                    V               DIFF
------------------------------ ---------- ------------------
redo size                       213379400         70,830,860





Note: Adding the nologging keyword as follows changes nothing; the same amount of redo is generated:

insert /*+append */ into t nologging select * from big_table a where rownum <= 1000000;


Case 3: To enable direct-path insert with nologging, we have to specify nologging at the table level:


DB9i> truncate table t;

Table truncated.

Elapsed: 00:00:08.40
DB9i>
DB9i> alter table t nologging;

Table altered.

Elapsed: 00:00:06.29
DB9i>
DB9i> @@mystat "redo size"
DB9i> set echo off

NAME                                VALUE
------------------------------ ----------
redo size                       284863656

Elapsed: 00:00:07.37
DB9i>
DB9i> insert /*+append */ into t select * from big_table a where rownum <= 1000000;

1000000 rows created.

Elapsed: 00:00:08.17
DB9i> commit;

Commit complete.

Elapsed: 00:00:08.48
DB9i>
DB9i> @@mystat2
DB9i> set echo off

NAME                                    V               DIFF
------------------------------ ---------- ------------------
redo size                       285145076            281,420



Note: Bear in mind the recovery scenario; this operation is not recoverable. After performing a nologging operation, we should generally take a backup.

I should mention that the above results were obtained in a 9i database running in ARCHIVELOG mode. In a NOARCHIVELOG database, conventional inserts still generate the same amount of redo; however, just adding the 'append' hint drops the redo size to around 280K. After all, we do not expect a complete recovery of a NOARCHIVELOG database. So be careful about which kind of database such tests are run in.
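The behavior observed in the three cases, plus the NOARCHIVELOG note, can be summarized as a small decision sketch. This covers only the table-data redo seen in these tests; it ignores index maintenance and a database-level FORCE LOGGING setting, which would override table-level nologging:

```python
def generates_full_redo(direct_path, table_logging, archivelog):
    """Sketch of when an INSERT generates full redo for table data.

    Conventional inserts always log. Direct-path inserts skip redo
    only when the table is NOLOGGING, or when the database itself is
    in NOARCHIVELOG mode. (FORCE LOGGING and index redo are ignored.)
    """
    if not direct_path:
        return True                      # conventional path always logs
    return table_logging and archivelog  # /*+ append */ path

# The three cases from the 9i ARCHIVELOG test above:
print(generates_full_redo(False, True, True))   # case 1: ~70M redo
print(generates_full_redo(True,  True, True))   # case 2: still ~70M redo
print(generates_full_redo(True,  False, True))  # case 3: ~280K redo

# The NOARCHIVELOG note: the append hint alone is enough there
print(generates_full_redo(True, True, False))
```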

Thursday, August 13, 2009

An Experiment of Partitioning

I was experimenting with a special partitioning task involving a 16 million row table. The goal was to create two partitions based on its primary key. A driver table containing about 10-20% of the primary key values determines which rows should go to the smaller partition.

First of all, I needed to add a new column as the partition key.

SQL> alter table user_a.target_tab add (s char(1));

Secondly, I created a partitioned destination table structure with exactly the same indexes and constraints as the target table.

create table user_a.target_tab_p
partition by list (s)
(
partition p1 values('Y'),
partition p2 values(default)
)
as
select * from user_a.target_tab
where 1=0;



Then I performed the partition exchange (it took about 30 min) with the following command:

alter table user_a.target_tab_p
exchange partition p2
with table user_a.target_tab
without validation
update global indexes
parallel 8 ;



Next, I dropped the original table and renamed the partitioned table, indexes, and constraints:

SQL> drop table user_a.target_tab;
SQL> alter table user_a.target_tab_p rename to target_tab;

To be able to update the partition key, I had to enable row movement:

SQL> alter table user_a.target_tab enable row movement;

I used the following statement to update the partition key; it took about 20 min:

update user_a.target_tab set s='Y'
where id in (
SELECT id FROM driver_tab
);
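The routing logic of that update can be sketched as follows (toy ids here, not the real 16 million rows or the real driver table):

```python
# Sketch of how rows land in the two list partitions after the UPDATE:
# rows whose id is in the driver table get s='Y' (partition p1); the
# rest keep the default partition key and stay in p2.

driver_ids = {3, 6, 9}       # stand-in for SELECT id FROM driver_tab
table_ids = range(1, 11)     # stand-in for the ids in target_tab

partitions = {"p1": [], "p2": []}
for row_id in table_ids:
    s = "Y" if row_id in driver_ids else None   # the UPDATE sets s='Y'
    partitions["p1" if s == "Y" else "p2"].append(row_id)

print(len(partitions["p1"]), len(partitions["p2"]))  # 3 7
```

Row movement is required because changing s relocates the row from p2 to p1.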

After gathering statistics, the following query verified that the partitioning succeeded:

SELECT table_owner, table_name, partition_name, num_rows
FROM dba_tab_partitions
WHERE table_name = 'TARGET_TAB'
and table_owner='USER_A'
/

TABLE_OWNER        TABLE_NAME        PARTITION_NAME         NUM_ROWS
------------------ ----------------- -------------------- ----------
USER_A             TARGET_TAB        P1                      1983143
USER_A             TARGET_TAB        P2                     14124980


The whole process took less than 1 hour.


Reference: Partitioning an Existing Table using EXCHANGE PARTITION

Thursday, August 06, 2009

A note about VCS in our environment

I was asked by a fellow DBA:
1. How to find out whether a server is in a cluster
2. If so, how to find out which servers are in the cluster

His question was general; however, in our environment I know we use VCS.

I guess we can at least run ps -ef | grep had; if we see the had daemon running, we have good reason to believe the server is in a cluster.

e.g.

$ ps -ef | grep had

root 23555 1 0 Jul 12 console 126:03 /opt/VRTSvcs/bin/had
root 23557 1 0 Jul 12 console 0:00 /opt/VRTSvcs/bin/hashadow

There is a VCS configuration file, /etc/VRTSvcs/conf/config/main.cf, which contains the information that defines the cluster and its systems. Unfortunately, DBAs do not have permission to read it. There is also a command, hastatus -summary, which should give us summary information about the cluster. Again, we don't have permission to run it. However, I found a log file under the VCS log directory that is readable by all; its contents are as follows:



ourservpd1:/var/VRTSvcs/log [ourdb] $ cat hastatus.summary

-- SYSTEM STATE
-- System               State                Frozen

A  ourservpd1           RUNNING              0
A  ourservpd2           RUNNING              0

-- GROUP STATE
-- Group            System      Probed     AutoDisabled    State

B  online_failover  ourservpd1  Y          N               ONLINE
B  online_failover  ourservpd2  Y          N               OFFLINE
B  report_static    ourservpd2  Y          N               ONLINE





It is clear from this log which servers are involved in the cluster.

Tuesday, August 04, 2009

NetApp DataONTAP Simulator

I am attending a 5-day training class this week on a NetApp course: DataONTAP 7.3 Fundamentals. I have never had hands-on experience with any storage system before. The first good thing I have learned so far is that there is a Simulator which allows us to practice many things about the storage system through NetApp DataONTAP.

You can download the Simulator from the NetApp NOW site after login. To install it, you need Linux and the Simulator package itself. There is a step-by-step guide from NetApp which describes how to install the Simulator on VMware and Ubuntu Linux. However, as is often the case with such guides, it is slightly outdated (dated Jan 2007, so not a surprise), but I still followed it and got my simulator set up with a few intelligent guesses.

My simulation environment is Windows XP, VMware Server 2.0, and Ubuntu Linux 9.04 Desktop edition. Here is the output of the sysconfig -r command:


mysim> sysconfig -r
Aggregate aggr1 (online, raid_dp) (zoned checksums)
Plex /aggr1/plex0 (online, normal, active)
RAID group /aggr1/plex0/rg0 (normal)

RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity v4.19 v4 1 3 FC:B - FCAL N/A 70/144384 77/158848
parity v4.20 v4 1 4 FC:B - FCAL N/A 70/144384 77/158848
data v4.21 v4 1 5 FC:B - FCAL N/A 70/144384 77/158848

Aggregate aggr0 (online, raid0) (zoned checksums)
Plex /aggr0/plex0 (online, normal, active)
RAID group /aggr0/plex0/rg0 (normal)

RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
data v4.16 v4 1 0 FC:B - FCAL N/A 120/246784 127/261248
data v4.17 v4 1 1 FC:B - FCAL N/A 120/246784 127/261248
data v4.18 v4 1 2 FC:B - FCAL N/A 120/246784 127/261248


Spare disks

RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for zoned checksum traditional volumes or aggregates only
spare v4.22 v4 1 6 FC:B - FCAL N/A 70/144384 77/158848
spare v4.24 v4 1 8 FC:B - FCAL N/A 70/144384 77/158848
spare v4.25 v4 1 9 FC:B - FCAL N/A 70/144384 77/158848
spare v4.26 v4 1 10 FC:B - FCAL N/A 70/144384 77/158848



 

I will see if I can get an Oracle database to use the storage in the Simulator.

Saturday, August 01, 2009

DELETE from a view

Sometimes we need to "scrub" a big table, for example, delete (or update) some rows based on a condition involving a second, small table. We can perform this task through "delete from a view" if the second table has a primary key constraint. The following test demonstrates the importance of table order when deleting from a view: we need to put the target big table first in the FROM list, otherwise we may end up deleting from the small table.
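In plain terms, the scrub we want is: delete the rows of the big table whose key appears in the small table. A sketch with toy data, mirroring the t1/t2 tables used in the test that follows:

```python
# The intended "scrub": delete rows of the big table t1 whose id
# appears in the small table t2 -- which is what the delete from the
# view with t1 listed first accomplishes. Toy data, same shape as
# the t1/t2 test tables: 100 rows in t1, ids 3, 6, ..., 30 in t2.

t1 = {row_id: "*" * 50 for row_id in range(1, 101)}   # big table
t2 = {row_id * 3 for row_id in range(1, 11)}          # driver table

for row_id in t2:
    t1.pop(row_id, None)      # remove matching rows from the BIG table

print(len(t1))  # 90 rows remain; reversing the table order in the view
                # would instead have emptied t2 and left t1 untouched
```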


SQL> create table t1
2 as
3 select rownum id,
4 rpad('*', 50) pad
5 from all_objects
6 where rownum <=100;

Table created.

SQL>
SQL>
SQL> create table t2
2 as
3 select rownum*3 id
4 from all_objects
5 where rownum <=10;

Table created.

SQL>
SQL> alter table t1 add constraint t1_pk primary key(id);

Table altered.

SQL> alter table t2 add constraint t2_pk primary key(id);

Table altered.

SQL>
SQL> execute dbms_stats.gather_table_stats(user,'t1');

PL/SQL procedure successfully completed.

SQL> execute dbms_stats.gather_table_stats(user,'t2');

PL/SQL procedure successfully completed.

SQL>
SQL> -- Order: t1, t2
SQL> delete from plan_table;

3 rows deleted.

SQL> explain plan for
2 delete from
3 ( select t1.id
4 from t1,
5 t2
6 where t2.id = t1.id
7 );

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
------------------------------------------
Plan hash value: 2755785190

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | DELETE STATEMENT     |       |    10 |    60 |     1   (0)| 00:00:01 |
|   1 |  DELETE              | T1    |       |       |            |          |
|   2 |   NESTED LOOPS       |       |    10 |    60 |     1   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN   | T2_PK |    10 |    30 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN | T1_PK |     1 |     3 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T2"."ID"="T1"."ID")

16 rows selected.

SQL>
SQL> -- Order: t2, t1
SQL>
SQL> delete from plan_table;

5 rows deleted.

SQL> explain plan for
2 delete from
3 ( select t1.id
4 from t2,
5 t1
6 where t2.id = t1.id
7 );

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
-----------------------------------------------------------------------
Plan hash value: 525412351

------------------------------------------------------------------------------
| Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | DELETE STATEMENT     |       |    10 |    60 |     1   (0)| 00:00:01 |
|   1 |  DELETE              | T2    |       |       |            |          |
|   2 |   NESTED LOOPS       |       |    10 |    60 |     1   (0)| 00:00:01 |
|   3 |    INDEX FULL SCAN   | T2_PK |    10 |    30 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN | T1_PK |     1 |     3 |     0   (0)| 00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("T2"."ID"="T1"."ID")

16 rows selected.

SQL> spool off