 |
|
Tuning with Rule-Based
Optimization
Oracle Tips by
Burleson
|
As we noted, the rule-based optimizer is the
oldest and most stable of the optimizers. The rule-based optimizer is
very simple and uses basic information in the data dictionary to make
decisions about how to generate an optimal execution plan. Also,
unlike the cost-based optimizer, the order of tables in the from
clause and the order of Booleans in the where clause effect the
execution plan for the query.
If you are using any release of Oracle prior to
Oracle8i, you may want to consider trying the RBO as your default
optimizer. Before Oracle8i, the CBO had an erroneous propensity to
invoke a full-table scan instead of using available indexes. Again,
while the RBO is quite old, it can sometimes result in faster SQL
execution. When tuning an SQL statement, the rule hint should
be the first thing to try.
Since the RBO is still widely used, let’s take
a look at the rules for formulating a query.
Changing the rule-based driving table
In Oracle’s rule-based optimizer, the ordering
of the table names in the from clause determines the driving
table. The driving table is important because it is retrieved first,
and the rows from the second table are then merged into the result set
from the first table. Therefore, it is essential that the second table
return the least amount of rows based on the where clause.
Note: The driving table is not always the table
with the least amount of rows. The Boolean conditions in the SQL
where clause must be evaluated, and the driving table should be
the table that returns the smallest number of rows.
With the rule-based optimizer, the table names
are read from right to left. Hence, the LAST table in the from
clause should be the table that returns the smallest amount of rows.
For setting the driving table for the rule-based optimizer, consider
the following query where the order table has 100,000 rows and the
customer table has 50,000 rows.
Select
customer_name,
customer_phone
from
customer,
order
where
customer_region = ‘EAST’
and
order_status = ‘BACKORDER’;
In this query, we see that the last table in the from
clause is the order table, and the order table will be the driving
table. This might make sense since we know that this table has half
the rows of the customer table. However, we must first evaluate the
where clause, to see what table return the smallest number of
rows.
Let’s assume that there are 10,000 customer in the WEST
region and 30,000 backordered status columns. Given this
information, we know that the customer table should be last in the
from clause because it returns less rows.
When the rule-base optimizer fails to use the correct index
The rule-based optimizer’s greatest shortcoming
is its failure to use the best index. There are cases where the
rule-based optimizer fails to choose the best index to service a query
because it is not aware about the number of distinct values in an
index (the selectivity of the index). This is especially a problem
when values within an index are highly skewed.
For example, let’s assume in this example that
there are 100,000 retired employees, 20,000 employees in the personnel
department, and 500 who are both retired and belong to the personnel
department. Let’s also assume that we have a non-unique index on both
the status and the department columns of our employee table.
We would expect that the most efficient way to
service this query would be to scan the most selective index, in this
case the department index, scanning the 20,000 retired employees to
get the 500 in the personnel department. It would be far less
efficient to scan the status index, reading through 100,000 retired
employees to find those who work in the personnel department.
select
count(*)
from
employee
where
department = ‘PERSONNEL’
and
status = ‘RETIRED’;
With the rule-based optimizer, we see the following
execution plan:
SELECT STATEMENT
SORT AGGREGATE
SELECT BY ROWID EMPLOYEE
NON-UNIQUE INDEX NON-SELECTIVE RANGE SCAN status_ix(status)
Even reversing the order of the items in the where
clause does not change the fact that the rule-based optimizer is
choosing to scan through all 100,000 retired employees looking for the
500 that belong to the Personnel department. With cost-based optimizer
we see that the selectivity of the indexes is known and that the most
efficient index is used to service the request:
SELECT STATEMENT
SORT AGGREGATE
SELECT BY ROWID EMPLOYEE
NON-UNIQUE INDEX NON-SELECTIVE RANGE SCAN
dept_ix(department)
In sum, we need to pay careful attention to the
indexes that are chosen by the rule-based optimizer, and either
disable the indexes that we do not want to be used in the query or
force the use of the index that we want. If we want to use the RBO for
this type of query, the best indexes can be explicitly specified with
an index hint, or unwanted indexes can be disabled by mixing
data type on the index (i.e. WHERE numeric_column_value = 123||’ ‘).
Now let’s move on and take a close look at
tuning with the cost-based optimizer.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|