By design, RAC involves multiple instances accessing the same database. By analogy, a database can be thought of as an island, which is accessible by multiple brigdes (i.e. instances). If there were two brigdes and 10 incoming cars that are being driven to the island, and if the 10 cars are directed randomly toward one of the bridges, each bridge will be added 5 cars roughly, this is so-called
Client-side connect-time load balancing . That is to say regardless of how busy a brigde is, how many cars already on that brigde, and how much longer to travel through that bridge etc, each brige is treated equally. We can easily imagine that if we have a mechanizm to monitor the condition of each brigde and direct the incoming cars based on the current traffic conditions, we will be much better in providing the service. Such a mechanizm in Oracle RAC is called
Load balancing Advisory, which can be used in the Server-side connect-time load balancing. Both client-side and server-side load balancings are applied to the time when a physical connection to a database is initially opened. Since opening a connection is an expensive operation, ideally application should be able to open a connection once, then reuse it for as many times as needed. There is a function called
connection pooling at the application server layer, which establishs physical connections once and never close them. With respect to connection pools, there is a feature called "
Runtime Connection Load Balancing", esstially it allocates logical connections among the physical connections in the connection pool at the transaction level instead of load balancing at the time of the initial database connection.
In this post, I will describe a test which is to demonstrate server-side connect-time load balancing with load balancing advisory feature. The idea used in this test was inspired from
Jim Czuprynski's article , as well as in his Oracle University RAC class, which I attended about two and half years ago.
First of all, a "job" in this test is defined as a workload that involves:
(1) connect to the database as user "DENIS"
(2) do an insert and an delete to a table 1000 times
(3) disconnect
(4) repeat (1) to (3) during predefined time period, e.g 20 min.
So if I submit 100 such jobs, I would expect 100 sessions count by username "DENIS"
Secondly, I created a service called 'SLBA', which is used for the job to connect to the database. The details about how to create a LBA-enabled service can be seen in Jim's article or Oracle documentation.
The steps are as follows:
1. Query GV$SERVICEMEATIC for instance goodness before starting any jobs
12:39:59 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:40:00 12:40:05 2 100 100
12:39:58 12:40:01 1 100 100
12:38:58 12:39:58 1 100 100
12:38:53 12:39:53 2 100 100
Note: In Oracle 10.2 and above, the GOODNESS indicates how attractive the instance is with respect to processing the workload presented to the database service. Lower numbers are preferred. The umber is internally calculated based on any goal (long or short) that was specified for that database service. In Oracle 10.2 and above, the DELTA column predicts how much GOODNESS will increase for every
additional session that connects to the instance.
2. Start 10 jobs using SERVICE SLBA at RAC2
[oracle@rac2 ~]$ startSLBAload.sh 20
12:41:18 system@DEVDB> @ct
count # of session by username DENIS
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:41:27 1 3
2010-04-16 12:41:27 2 7
12:42:30 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:42:39 12:42:45 2 416 9998
12:42:33 12:42:38 1 384 9844
12:40:58 12:41:58 1 384 9844
12:40:53 12:41:54 2 416 9998
3. Start CPU LOAD at RAC2
[oracle@rac2 ~]$ cpu_load.sh &
[1] 8882
[oracle@rac2 ~]$ ps -ef | grep zip
oracle 8883 8882 16 12:43 pts/1 00:00:01 gzip video1.mpg
oracle 8884 8882 8 12:43 pts/1 00:00:00 gzip video2.mpg
Note: I created two gzip/guzip jobs at RAC2 to increase CPU utilization at RAC2.
4. Query GV$SERVICEMEATIC for instance goodness
12:44:46 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:44:48 12:44:54 2 1666 9998
12:44:43 12:44:49 1 500 9994
12:43:54 12:44:54 2 1666 9998
12:42:58 12:43:58 1 500 9994
Note: it is expected to see "Goodness" increasing in RAC2, compared to RAC1, RAC2 should be less attractive to the incomming connections.
5. Start 10 jobs using SERVICE SLBA at RAC2 - total 20 jobs
12:45:55 system@DEVDB> @ct
count # of session by username DENIS
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:46:11 1 7
2010-04-16 12:46:11 2 13
12:46:03 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:46:10 12:46:14 2 12500 9998
12:46:03 12:46:07 1 588 9998
12:44:57 12:45:58 1 588 9998
12:44:54 12:45:54 2 12500 9998
6. Start 10 jobs using SERVICE SLBA at RAC2 - total 30 jobs
12:47:52 system@DEVDB> @ct
count # of session by username DENIS
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:48:06 1 15
2010-04-16 12:48:06 2 15
12:47:58 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:48:05 12:48:10 2 12500 9998
12:47:43 12:47:48 1 750 9998
12:46:54 12:47:55 2 12500 9998
12:45:58 12:46:58 1 750 9998
7. Start 10 jobs using SERVICE SLBA at RAC2 - total 40 jobs
12:50:14 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:50:19 12:50:22 1 1428 9998
12:50:19 12:50:23 2 12500 9998
12:48:58 12:49:58 1 1428 9998
12:48:55 12:49:54 2 12500 9998
12:50:21 system@DEVDB> @ct
count # of session by username DENIS
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:50:39 1 15
2010-04-16 12:50:39 2 25
12:51:23 system@DEVDB> @ct
count # of session by username DENIS
8. Start 10 jobs using SERVICE SLBA at RAC2 - total 50 jobs
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:51:37 1 26
2010-04-16 12:51:37 2 24
12:51:30 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:51:40 12:51:43 2 12500 9998
12:51:28 12:51:33 1 2500 9998
12:49:58 12:50:58 1 2500 9998
12:49:54 12:50:55 2 12500 9998
Note: We started to see # of session in RAC1 greater than that in RAC2
8. Start 10 jobs using SERVICE SLBA at RAC2 - total 60 jobs
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:53:15 1 34
2010-04-16 12:53:15 2 26
12:53:09 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:53:15 12:53:20 2 12500 9998
12:53:08 12:53:13 1 1700 9998
12:51:58 12:52:58 1 1700 9998
12:51:55 12:52:55 2 12500 9998
9. Start 10 jobs using SERVICE SLBA at RAC2 - total 70 jobs
count # of session by username DENIS
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:54:23 1 42
2010-04-16 12:54:23 2 28
12:54:15 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:54:19 12:54:24 2 12500 9998
12:54:13 12:54:18 1 2100 9998
12:52:58 12:53:58 1 2100 9998
12:52:55 12:53:54 2 12500 9998
9. Start 10 jobs using SERVICE SLBA at RAC2 - total 80 jobs
12:55:37 system@DEVDB> @ct
count # of session by username DENIS
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:55:55 1 53
2010-04-16 12:55:55 2 27
12:55:50 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:55:55 12:55:59 2 12500 9998
12:55:48 12:55:52 1 2650 9998
12:54:52 12:55:55 2 12500 9998
12:53:58 12:54:58 1 2650 9998
9. Start 10 jobs using SERVICE SLBA at RAC2 - total 90 jobs
12:56:48 system@DEVDB> @ct
count # of session by username DENIS
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:56:59 1 59
2010-04-16 12:56:59 2 31
12:56:52 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:56:55 12:57:00 2 12500 9998
12:56:48 12:56:53 1 5000 9998
12:55:55 12:56:55 2 12500 9998
12:54:58 12:55:57 1 5000 9998
10. Start 10 jobs using SERVICE SLBA at RAC2 - total 100 jobs
12:58:55 system@DEVDB> @ct
count # of session by username DENIS
Inst
SYSDATE ID COUNT(*)
------------------- ----- ----------
2010-04-16 12:59:08 1 65
2010-04-16 12:59:08 2 35
12:59:01 system@DEVDB> @servicemetric.sql SLBA
Service Inst Good
Name Start Time End Time ID ness DELTA
-------- ---------- ---------- ----- ------- -------
SLBA 12:59:05 12:59:10 2 12500 9998
12:58:54 12:58:57 1 12500 9998
12:57:58 12:58:57 1 12500 9998
12:57:52 12:58:52 2 12500 9998
Note: At total jobs 100, the GOODNESS of two instances were evened out. The change of GOODNESS inidicated the LBA was working and somehow was used to direct the incomming connections to the less-loaded instance.
Scripts used in this test can be found
here.