 |
|
The Driving Table
Location Problem
Oracle Tips by
Burleson
|
It is a real problem in Oracle SQL that the
driving table is reversed between the rule-based optimizer and the
cost-based optimizer. In the RBO, the driving table is the last
table in the from clause, while in the CBO the driving table is
always determined by the CBO. In cases where the ordered hint
is specified, Oracle will use the driving table as the first
table in the from clause. As you will recall, the ordered
hint is very useful for reducing the parse time of large n-way table
joins by specifying the table join order.
This makes it very challenging for the SQL
tuning professional who adds hints to SQL statements to specify the
appropriate driving table. Some DBAs use the num_rows column of
DBA_TABLES to get an idea of which table will be the best driving
table, but the best way to determine the driving table is to look at
the Boolean predicates in the where clause.
select /*+
rule */
emp.ename,
emp.deptno,
bonus.comm
from
emp,
bonus
where
emp.ename = bonus.ename
;
Note that the bonus table is the driving
table because it appears last in the from clause. Let’s explain
this statement and observe the execution plan.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
NESTED LOOPS
1
TABLE ACCESS
FULL
BONUS
1
TABLE ACCESS
BY INDEX ROWID
EMP
2
INDEX
RANGE SCAN
EMP_ENAME
1
Now we reverse the table order, placing the bonus
table first in the from clause.
select /*+
rule */
emp.ename,
emp.deptno,
bonus.comm
from
bonus,
emp
where
emp.ename = bonus.ename
;
Now when we reexecute the explain plan utility,
we should see that the driving table has been reversed.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
NESTED LOOPS
1
TABLE ACCESS
FULL
EMP
1
TABLE ACCESS
BY INDEX ROWID
BONUS
2
INDEX
RANGE SCAN
BONUS_ENAME
1
Here you see that the driving table has changed, and
this is clear evidence that the only factor influencing the driving
table with the RBO is the position of the table name in the where
clause.
The Driving Table and Table Cardinality
Remember, the driving table should be the table
that returns the smallest number of rows, and this is not
always the table with the smallest number of rows. Hence, you should
evaluate each table independently and factor any filtering constraints
into the where clause.
For example, assume we have a customer
table with 100,000 rows and an order table with 500,000 rows.
select
customer_name
from
customer,
order,
where
customer.cust_nbr = order.cust_nbr
and
order_status = ‘backordered’;
At first blush, it might appear that the
customer table should be the driving table. However, if there are
only 50,000 order rows that meet the order_status=’backordered’
criterion, then the order table should be made the driving
table.
The driving table is important because it is
retrieved first, and the rows from the second table are then merged
into the result set from the first table. Therefore, it is essential
that the second table return the least number of rows in terms of the
where clause. Note that the driving table should be the table
that returns the least number of rows, not always the table with the
smallest value for num_rows in the DBA_TABLES view.
With the rule-based optimizer, the indexing of
tables and order of table name and Boolean expressions within the SQL
statement control the execution plan for the SQL. For the rule-based
optimizer, consider the following query:
select /*+
rule */
dname,
sum(bonus.comm)
from
emp,
bonus,
dept
where
dept.deptno = emp.deptno
and
emp.ename = bonus.ename
group by
dname;
Here we expect that a single row will be returned for
each department, but hundreds of emp and bonus rows may
be scanned in order to compute the sum of all bonuses by department.
Hence, we want the dept table to be the driving table for this
query, and we have placed it last in the from clause.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
SORT
GROUP BY
1
NESTED LOOPS
1
NESTED LOOPS
1
TABLE ACCESS
FULL
DEPT
1
TABLE ACCESS
BY INDEX ROWID
EMP
2
INDEX
RANGE SCAN
EMP_DEPTNO
1
TABLE ACCESS
BY INDEX ROWID
BONUS
2
INDEX
RANGE SCAN
BONUS_ENAME
1
As you can see from this example, the placement of the
table names in the from clause is critical to the execution
plan and the speed of the query.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|