 |
|
Tuning the OPS
Environment
Oracle Tips by Burleson
|
In a sense, we can think of the shared-nothing
configuration as having numerous independent Oracle instances, and we
can expect to tune each instance as if it were an independent entity.
However, we must always bear in mind that each Oracle instance is
competing for the same data block resources. This competition is
directly measured by the IDLM.
Oracle Parallel Server only achieves a high
degree of parallelism when careful planning has partitioned the tasks
onto each instance in such a way that no two instances are constantly
competing for data resources. If we find evidence that two Oracle
instances are frequently accessing the same data blocks, the first
remedy is to move common tasks into the same instance, where they can
share the same buffer cache, and eliminate calls to the IDLM.
Indeed, tuning of the Parallel Server is all
about IDLM lock contention. Our goal should be to independently tune
each Oracle instance and to keep a careful eye on how these instances
interact with each other to manage internode locks. As IDLM lock
contention is identified, we have numerous options, including
repartitioning the application to move tasks to other instances,
adding multiple freelists to frequently accessed blocks, or using
table replication techniques to alleviate I/O contention.
Here is a very simple monitoring approach to
the tuning of an OPS database:
-
Monitor for multiple tasks that modify rows
on the same block When multiple tasks contend for the updating
of rows on the same data block, adding freelists or freelist groups
may relieve the bottleneck.
-
Monitor the IDLM for lock conversions If
the maximum lock convert rate for your IDLM has been reached, you
will need to repartition the application to balance “alike”
transactions into common instances.
-
Monitor statistics independently for each
Oracle instance The goal should be to minimize physical I/O by
tuning the buffer cache and providing input into an overall load
plan. For example, if we discover one instance is heavily loaded
when compared to other instances, we can take a look at the
partitioning of tasks and rebalance the load by moving tasks onto
other instances.
-
Monitor each instance's buffer cache,
looking for common data blocks If the same data blocks show up
in multiple buffer caches, move one of the tasks into a common
Oracle instance. Remember, the idea of tuning Parallel Server is to
segregate common tasks into common instances.
-
Monitor the TEMP tablespace usage within
each instance In OPS, each instance performs its own sorting,
but all instances share a common TEMP tablespace. In other words,
memory sorts are performed within each instance's sort_area_size
in RAM, but all disk sorts are performed in the common TEMP
tablespace.
It should be apparent that the inherent
complexity of parallel processing makes it very difficult to come up
with generic tuning techniques. Every parallel system is unique, and
the Oracle professional must analyze each system carefully,
considering its unique structure and behavior.
As we continue to discuss OPS tuning, let's
begin our discussion of tools for monitoring the behavior of the OPS
database.
STATSPACK Tables for Monitoring OPS
The STATSPACK utility provides several tables
for tracking OPS activity. These tables can be used to provide an
instance-wide reporting architecture for Oracle Parallel Server. These
are the main STATSPACK tables for OPS. Let's begin by looking at their
contents before we look at the reports:
-
stats$sysstat This
table contains numerous global parameters that relate to the
behavior of OPS.
-
stats$rowcache_summary This table shows
IDLM activity detail for each instance.
-
stats$sgaxs This table provides summary
information for each instance in the OPS architecture.
The stats$rowcache_summary Table
In addition to information about row cache
information within an instance, the stats$rowcache_summary table
contains several columns that relate directly to the OPS IDLM:
L 13-5
SQL> desc stats$rowcache_summary;
Name Null? Type
----------------------------------------- --------
----------------------
SNAP_ID NOT NULL NUMBER(6)
DBID NOT NULL NUMBER
INSTANCE_NUMBER NOT NULL NUMBER
PARAMETER NOT NULL VARCHAR2(32)
TOTAL_USAGE NUMBER
USAGE NUMBER
GETS NUMBER
GETMISSES NUMBER
SCANS NUMBER
SCANMISSES NUMBER
SCANCOMPLETES NUMBER
MODIFICATIONS NUMBER
FLUSHES NUMBER
DLM_REQUESTS NUMBER
DLM_CONFLICTS NUMBER
DLM_RELEASES NUMBER
This table contains useful information about the
internal operations of the DLM. Especially useful is the column
dlm_conflicts, which is related to internal bottlenecks within the
IDLM.
The stats$sgaxs Table
The stats$sgaxs table contains basic
information about each instance in an OPS environment. This is largely
a summary table and it does not contain detailed OPS information.
L 13-6
SQL> desc stats$sgaxs;
Name Null? Type
----------------------------------------- --------
----------------------
SNAP_ID NOT NULL NUMBER(6)
DBID NOT NULL NUMBER
INSTANCE_NUMBER NOT NULL NUMBER
STARTUP_TIME NOT NULL DATE
PARALLEL NOT NULL VARCHAR2(3)
NAME NOT NULL VARCHAR2(64)
VERSION VARCHAR2(17)
VALUE NUMBER
The stats$sysstat Table
The stats$sysstat also contains columns that
are of interest when tuning OPS. Just like its cousin the v$sysstat
table, there are global statistics that relate to the behavior of OPS.
L 13-7
SQL> Select distinct name from stats$sysstat
where name like 'global%';
global cache blocks corrupt
global cache convert time
global cache convert timeouts
global cache converts
global cache cr block log flush time
global cache cr block log flushes
global cache cr block receive time
global cache cr block send time
global cache cr block serve time
global cache cr blocks received
global cache cr blocks served
global cache cr requests blocked
global cache cr timeouts
global cache defers
global cache freelist waits
global cache get time
global cache gets
global cache prepare failures
global lock async converts
global lock async gets
global lock convert time
global lock get time
global lock releases
global lock sync converts
global lock sync gets
The stats$sysstat STATSPACK table can be used
to determine whether lock converts are being performed too often.
Excessive lock convert rates usually mean there is contention for a
common resource within the database. This resource may be a commonly
updated table. For example, inventory management systems often utilize
one-of-a-kind (OOAK) rows. An OOAK row may be used to keep the order
number of the last order, and all application tasks must increment
this row when a new order is placed. This type of architecture forces
each parallel instance to single-thread all requests for this
resource. But how do we identify these types of database resources?
Just as the buffer hit ratio measures
contention for data blocks, the lock hit ratio can be used to identify
excessive lock conversion by the IDLM. The lock hit ratio should
generally be above 90 percent, and if it falls below 90 percent, you
should look for sources of data contention. Here is the SQL to
determine the lock hit ratio for Oracle Parallel Server:
rpt_idlm.sql
L 13-8
column mydate heading 'Yr. Mo Dy
Hr.' format a16
column idlm_lock_hit_ratio format
999,999,999
select
to_char(snap_time,'yyyy-mm-dd HH24') mydate,
(a.value - b.value)/(a.value) idlm_lock_hit_ratio
from
stats$sysstat a,
stats$sysstat b,
stats$snapshot sn
where
a.name = 'consistent gets'
and
b.name = 'global lock converts (async)'
and
a.snap_id = sn.snap_id
and
b.snap_id = sn.snap_id
order by
to_char(snap_time,'yyyy-mm-dd HH24')
;
Now that we see the regular STATSPACK tables
that capture performance information, let's look at an easy way to
extend STATSPACK by including a table to track IDLM behavior.
This is an excerpt from "Oracle
High Performance tuning with STATSPACK" by Oracle Press.
 |
If you like Oracle tuning, you may enjoy the new book "Oracle
Tuning: The Definitive Reference", over 900 pages of BC's favorite tuning
tips & scripts.
You can buy it direct from the publisher for 30%-off and get instant access to
the code depot of Oracle tuning scripts. |
|