 |
|
When the Rule-Based
Optimizer Fails
Oracle Tips by
Burleson
|
There are cases in which the rule-based
optimizer fails to choose an optimal execution plan. In Oracle8i,
this is most often due to the inability of the RBO to use advanced
index structures and parallel query. New features aside, we also see
that the RBO will often choose a suboptimal index to service a query.
This happens because the rule-based optimizer is not aware of the
number of distinct values in tables and indexes but relies on simple
heuristics to find an acceptable access path to the data.
Remember, in rule-based optimization, all
indexes have an equal ranking. When items have an equal order, the row
cache order is used to select the first index. When the RBO detects
equally ranked objects, it chooses the first object that it comes to
in the row cache. Be aware that row cache order cannot be externally
controlled, as it is determined by a internal unpublished algorithm.
It is possible that row cache order can be changed by modifying shared
pool parameters, or by dropping and recreating the objects, but this
option is seldom as feasible as simply adding a hint and running the
query again as a cost-based query.
When the Rule-Based Optimizer Is Best
As I have repeatedly noted, there are many
times when the RBO will achieve a faster execution plan than
cost-based optimization, especially in the earlier releases of
Oracle8. To illustrate these differences, let’s use the Oracle demo
database for our example. For the following query, assume that we have
built a nonunique B-tree index on both the deptno and mgr
columns.
select /*+
rule */
count(*)
from
emp
where
mgr = 7902
and
deptno = 10
;
Now, let’s take a look at the distribution of
distinct values within the deptno and mgr columns.
SQL> select
distinct deptno from emp;
DEPTNO
----------
10
20
30
SQL> select distinct mgr from emp;
MGR
----------
7566
7698
7782
7788
7839
7902
Here you see that the mgr column has more than
double the selectivity of the deptno column because it has far
more unique values. We would expect that the most efficient way to
service this query would be to use the AND_EQUAL access method,
starting with the most selective index. Here is the rule-based
execution plan for this query:
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
SORT
AGGREGATE
1
AND-EQUAL
1
INDEX
RANGE SCAN
EMP_MGR
1
INDEX
RANGE SCAN
EMP_DEPTNO
2
The rule-based optimization uses the AND-EQUAL access
method to combine the ROWID lists from each index range scan to get
only those ROWIDs that meet the selection criteria.
To see the difference between this method and
cost-based optimization, here is the same query using the
first_rows hint.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
1
SORT
AGGREGATE
1
TABLE ACCESS
BY INDEX ROWID
EMP
1
INDEX
RANGE SCAN
EMP_DEPTNO
1
Unlike under rule-based optimization, the cost-based
optimizer wants to perform an index range scan on the deptno
index and then filter for the rows belonging to the specified mgr
column. This could take a long time, especially if the tables are
large.
NOTE: Remember that the rule-based optimizer does
not recognize bitmap or function-based indexes. Adding a rule hint
will cause full-table scans in cases where cost-based indexes are used
to access the target table.
In sum, you need to pay careful attention to
the indexes that are chosen by the rule-based optimizer, and either
disable the indexes that you do not want to be used in the query or
force the use of the index that you want. To review, indexes can be
explicitly specified with the index hint, or unwanted indexes
can be disabled by mixing data type on the index (i.e., where
numeric_column_value = 123||’ ‘).
Any perceived index change (whether it really
affects the value or not) will defeat the index, so simply adding zero
to a number or a null to a character should always disable index
access.
Conclusion
The rule-based optimizer remains quite popular
in Oracle8i despite the attempts by Oracle to improve
cost-based optimization to the point where it always outperforms the
RBO. The major points in this chapter include these:
-
The RBO is very stable and predictable, while
the CBO is more intelligent but often unpredictable.
-
The number of databases using the RBO falls
with each new release of the CBO, but there are still a significant
number of databases using a rule-based optimizer default.
-
Some shops use a cost-based optimizer default
and override selected queries with the rule hint.
-
Some shops use a rule-based default and tune
SQL queries by adding cost-based hints and analyzing selected tables
and indexes
-
In the RBO, the order of tables in the from
clause determines the table join order. The last table in the
from clause is the driving table, which should be the smallest
table.
-
The order of the Boolean predicates in the
where clause can also affect the behavior of the RBO.
-
Using optimizer_mode=choose can be very
dangerous to performance if selected tables or indexes are analyzed.
-
The most common error with the RBO is its
inability to know the selectivity of each index on a table. Hence,
the RBO sometimes uses a nonselective index to access a table.
Next, let’s take a look at the single most
complex of all SQL tuning operations, the optimization of table joins.
If you can master the table join, you are well on your way to being a
SQL tuning guru.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|