 |
|
Load Balancing
Disks with Oracle Databases
Oracle Tips by Burleson |
With terabyte-sized and Web-enabled Oracle
databases becoming more commonplace, the task of disk load balancing
has never been more critical. These huge databases are too massive to
be cached in an Oracle data buffer, yet these databases often serve
thousands of users who expect instant response times. The most
important thing that the DBA can do to minimize disk I/O is to balance
the load on the disks.
By placing datafiles strategically on the
physical disks, you can minimize the likelihood of any one disk
becoming stalled while handling simultaneous I/O requests. This
section provides a strategy for collecting I/O information into Oracle
tables and generating reports to deliver maximum guidance in the
load-balancing process for multiple physical disk systems. The purpose
of collecting I/O statistics is to provide data for load balancing.
Load balancing involves moving datafiles on the physical disks such
that no single disk becomes stalled waiting for simultaneous I/O
requests. The best way to start a file placement strategy is to create
a transparent disk architecture whereby the DBA can correlate the
datafile name to a specific Oracle table or index. Let's explore how
we create this type of architecture.
Configuring Oracle Tablespaces
and Datafiles
Since different
application processes have different I/O patterns, hot disks may
appear on different disks at different times during each day. The goal
of disk load balancing is to eliminate disk I/O bottlenecks, but it is
important to remember that these bottlenecks are transient in nature.
Since Oracle transactions happen very quickly, a disk may experience
an I/O bottleneck for a very short period, and this short-duration
bottleneck may repeat itself thousands of times each day. However,
many Oracle administrators make the mistake of summarizing I/O by the
hour, and the disk will appear not to have bottlenecks since the I/O
spikes will have disappeared in the hourly average, as shown in Figure
8-10.
Figure 8-48: Short I/O spikes can be lost with
long measurement periods
To get the most accurate results, you should
collect I/O statistics at frequent intervals—preferably no more than
ten minutes between samples—over a representative time period, such as
a week. Because individual application processes have different I/O
patterns, bottlenecks may appear on different disks at various times
during each day. And because Oracle transactions happen very quickly,
a disk may experience an I/O bottleneck for a very short period—but a
short-duration bottleneck may nonetheless repeat itself thousands of
times each day. If you make the mistake of summarizing I/O by the
hour, as many DBAs do, you won't see these bottlenecks because the I/O
spikes will not be evident in the hourly average.
The point is simple: in order to accurately
identify and correct disk I/O bottlenecks, you must measure in minor
duration, preferably no more than 10 minutes between samples. We will
discuss this technique in a following section where we will show how
to extend STATSPACK to capture disk iostat information.
The goal of load balancing is to distribute the
files across disks so as to achieve a single static optimal I/O
throughput. Moving Oracle datafiles to other disks is not a trivial
operation, and the datafile must be taken offline before the file can
be moved to another disk. However, the good news is that once the I/O
subsystem is balanced, the files will not need to be moved unless new
processes change the I/O pattern for the disks.
The goal is to find the optimal file placement
where overall load balance is achieved for all of the many variations
of disk access. Load balancing is essentially the identification of
hot disks, and the movement of datafiles to less-used cool disks. As
such, disk load balancing is an iterative process since it is possible
that relocating a datafile may relieve contention for one process,
only to cause I/O contention for an unrelated process. Also, for
databases placed on a small number of disks, it is possible that I/O
contention cannot be avoided. Consider a 30GB database spread across
two disks with 20 competing processes for data. On average, ten
processes would be queued waiting for I/O from each of the two disks.
Clearly, these types of systems will always experience I/O contention.
Within Oracle
in any UNIX environment, we have a hierarchical relationship between
entities. Each physical disk has many UNIX mount points, each mount
point has many Oracle datafiles, and each datafile may have many
Oracle tables, as shown in Figure 8-11.
Figure 8-49: The hierarchy of Oracle file
structures on UNIX
After using data collected by iostat to
identify a hot disk, you would use data collected by the Oracle
utilities to identify which mount point and file contain the table
causing the excessive I/O activity.
Identifying the hot disk is only the beginning
of the quest. We must then see what mount point on the disk is causing
the problem, which datafile on the mount point, and finally, what
Oracle table is causing the excessive I/O. Only with this approach can
the Oracle administrator fully understand how to perform disk load
balancing. With that in mind, let's look at the first method for
collecting Oracle I/O statistics. We will then move on to look at
collecting UNIX I/O statistics.
This is an excerpt from "Oracle
High Performance tuning with STATSPACK" by Oracle Press.
 |
If you like Oracle tuning, you may enjoy the new book "Oracle
Tuning: The Definitive Reference", over 900 pages of BC's favorite tuning
tips & scripts.
You can buy it direct from the publisher for 30%-off and get instant access to
the code depot of Oracle tuning scripts. |
|