 |
|
Tuning Aggregation
Queries with Temporary Tables
Oracle Tips by
Burleson
|
In addition to data dictionary queries,
temporary table can dramatically improve the performance of certain
SQL self-join queries that summarize data values.
For example, consider a query that examines the
stats$tab_stats STATSPACK extension table. Rows for the
stats$tab_stats table are collected weekly, and one row in this
table exists for each table in the schema. This STATSPACK extension
table is loaded by collecting the table_name and the bytes
consumed by the table (from the dba_segments view), and we want
to use this table-level detail data to summarize our overall size
change per week.
Here is the description for this STATSPACK
extension table. For details on using this table, refer to Oracle
High-Performance Tuning with STATSPACK, by Don Burleson
(McGraw-Hill Professional Publishing, 2001).
SQL> desc
stats$tab_stats;
Name
Null? Type
----------------------------------------- -------- -----------------
SNAP_TIME
DATE
SERVER_NAME
VARCHAR2(20)
DB_NAME VARCHAR2(9)
TABLESPACE_NAME
VARCHAR2(40)
OWNER
VARCHAR2(40)
TABLE_NAME
VARCHAR2(40)
NUM_ROWS
NUMBER
AVG_ROW_LEN
NUMBER
NEXT_EXTENT
NUMBER
EXTENTS
NUMBER
BYTES
NUMBER
Since each row
of this table contains a date (snap_time), a table name (table_name),
and the number of bytes, we need a query that sums up the total size
for all tables for one week, and then compares that value to the
overall size for the following week (Figure 18-1).
Figure 1: Using
temporary tables to preaggregate values from a time-based table
Essentially, we could formulate this comparison
of summaries as a single query that summarizes each range or rows.
select
distinct
to_char(old_size.snap_time,'yyyy-mm-dd'), -- The old snapshot date
sum(old_size.bytes),
sum(new_size.bytes),
sum(new_size.bytes) - sum(old_size.bytes)
from
stats$tab_stats old_size,
stats$tab_stats new_size
where
-- This is the highest date in the table
new_size.snap_time = (select max(snap_time) from stats$tab_stats)
and
-- This is the prior weeks snapshot
old_size.snap_time = (select min(snap_time)-7 from stats$tab_stats)
group by
to_char(old_size.snap_time,'yyyy-mm-dd')
;
Here is the execution plan for this query.
Because we are summing and comparing ranges of values within the same
table, we see the dreaded MERGE JOIN CARTESIAN access method. As you
know, a Cartesian merge join can run for hours because the Cartesian
products of the tables must be derived.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
5
SORT
GROUP BY
1
MERGE JOIN
CARTESIAN
1
TABLE ACCESS
BY INDEX ROWID
STATS$TAB_STATS
1
INDEX
RANGE SCAN
TAB_STAT_DATE_IDX
1
SORT
AGGREGATE
1
INDEX
FULL SCAN (MIN/MAX)
TAB_STAT_DATE_IDX
1
SORT
JOIN
2
TABLE ACCESS
BY INDEX ROWID
STATS$TAB_STATS
1
INDEX
RANGE SCAN
TAB_STAT_DATE_IDX
1
SORT
AGGREGATE
1
INDEX
FULL SCAN (MIN/MAX)
TAB_STAT_DATE_IDX
1
Take a close look at the execution plan for
this query, and carefully review the steps:
1.
We begin with an index range scan to get the maximum date in
the stats$tab_stats table.
2.
Next, we perform an index range scan to get the sysdate–7
value.
3.
Once we have the target data values, we must perform a
Cartesian merge join to access the rows and resolve the query.
This query gets tricky where we must compare a
range of common date with another range of common dates. As you can
see, these range comparison queries are hard to formulate and hard for
Oracle to optimize.
To see an alternative query, let’s look at an
equivalent query that utilizes temporary tables. This report uses the
STATSPACK extension tables for objects to prepare weekly growth
reports. The rpt_object_stats.sql script is a very useful
STATSPACK report that approximates the overall growth of the database
over the past week. The DBA can quickly compare table and index
counts, and see the total growth for table and indexes over the past
week. This report is often e-mailed to MIS managers.
Let’s take a closer look at each section of the
report and then see how to formulate the query.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|