 |
|
Evaluate the Legitimacy
of the Full-Table Scan
Oracle Tips by
Burleson
|
As you should know by now, a full-table scan
may be faster than an index range scan if a minority of the table
blocks are retrieved by the query. We say "table blocks" instead of
the number of rows, because it is the number of data blocks, and not
the number of table rows, that most influences the decision to perform
a full-table scan over an index range scan. For an un-sequenced table
with small row lengths and a large block size, a random retrieval of
10 percent of the rows can access well over 50 percent of the table
blocks in the table.
Since the number of rows returned is heavily
dependent on the row length and the block size, there is no firm
percentage that can be used to determine the threshold for a
full-table scan. If the query retrieves fewer data blocks using an
index range scan than the full-table scan, then the index access will
generally be a faster alternative.
The goal is to compare the number of blocks
returned by the full-table scan with the number of data blocks
returned by the index range scan. Computing the number of blocks
accessed by a full-table scan can be done by going to the Oracle data
dictionary and summing the number of blocks from the dba_segments
view:
select
sum(bytes)
from
dba_segments
where
table_name = ‘CUSTOMER’;
SUM(BYTES)
----------------------
152392
Now, we need to compare the number of blocks
retrieved by the full-table scan with the blocks retrieved from an
index range scan. Since the number of blocks retrieved by the query is
the most important, the following equation can be used to determine
the approximate number of blocks retrieved by the query. To use this
equation, you need the following information:
-
The number of rows returned by the query. This is determined by
executing the query
-
The database blocksize (the db_block_size initialization
parameter)
-
The average row length (the avg_row_len in the dba_tables
view)
-
The clustering_factor in the dba_indexes view for the
index you plan to use
Let’s look at two methods for comparing an
indexed SQL query against the block I/O for a full-table scan.
Estimating Blocks Retrieved by an Indexed Table Query
This is an equation to compute the number of
blocks retrieved by a row-sequenced table. We discussed the method for
using CTAS to resequence rows. You can tell if your table is
row-sequenced by checking the clustering_factor column of the
dba_indexes view for the index that you are using to retrieve
your rows.
-
If clustering_factor is close to the
number of data blocks in your table (the blocks column of
dba_segments), then your table is row-sequenced.
-
If clustering_factor is close to the
number of rows (the num_rows column of the dba_tables
view), then your table rows are not in index sequence. Hence, we
must perform lots of additional block accesses to retrieve our rows.
Next, let’s look at a way to estimate the
number of data blocks retrieved by a query.
If your clustering_factor is close to
the number of blocks in the table, then you can use this method to
estimate the number of blocks retrieved by the query.
Here is a great way to estimate the number of
blocks retrieved by your query:
number of rows retrieved
Number of blocks retrieved
=---------------------------------------------------
db_block_size dba.segments.blocks
-------------- * -----------------------
avg_row_len
dba_indexes.clustering_factor
To see how this works, assume that you have
collected the following information about the table and index in your
query.
-
number of rows retrieved = 600
-
db_block_size = 16K (rounded to 16,000 for estimation
purposes)
-
dba_tables.avg_row_len = 80
-
dba_segments.blocks = 1000
-
dba_indexes.clustering_factor = 20,000
When we plug numbers into the equation, we see
that this index range scan causes about 60 data blocks to be retrieved
by Oracle. Of course, a block request does not always result in a disk
I/O, because of the buffering in the DEFAULT, KEEP, and RECYCLE pools.
600
600
------------ = ---------- = 60 data blocks retrieved
200*.05 10
Note that as the clustering_factor
increases from the number of blocks in the table to the number of rows
in the table, the number of disk accesses will increase. For example,
when we change the clustering factor from 1000 to 2000, we see that
Oracle must retrieve twice the number of blocks.
NOTE: Be sure to completely reanalyze the
table and the index to get accurate dictionary statistics for this
equation.
Now that you know the number of data blocks
retrieved by the query, we can accurately compare the number of blocks
retrieved as a percentage of the number of blocks in the table.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|