 |
|
Column Uniqueness and
Subquery Transformation
Oracle Tips by
Burleson
|
Whenever an Oracle SQL tuning professional sees
a subquery in a SQL statement, his or her first inclination is to see
if the query can be rewritten as a standard join. However, this can be
very dangerous unless you know whether the subquery is querying on
unique values.
When the subquery has multiple tables in the
from clause, explicit uniqueness can only occur when either of the
following are true:
-
The columns defining a unique key in the lowest
table of the hierarchy are on the select list of the subquery.
-
At least one of the columns defining the unique
key is on the select list, and the other columns defining the unique
key have an equality criterion specified, directly or indirectly.
If a subquery is rewritten to specify the
subquery table in the from clause, the result set had better
return only a single row, or otherwise the transformed query will
return the wrong answer. Let’s illustrate this fact with a simple
example. Assume that we want a quick list of all employees who have
spent at least one year in the Accounting department. Since the
department name is in the dept table, the following query could
be used:
select /*+
first_rows */
ename,
hiredate
from
emp
where
hiredate < sysdate-365
and
deptno IN
(
select deptno from dept
where dname = ‘ACCOUNTING’
);
Assuming that we have indexes on dname
in the dept table, Oracle will perform a nested loop scan.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
7
NESTED LOOPS
1
VIEW
VW_NSO_1
1
SORT
UNIQUE
1
TABLE ACCESS
FULL
DEPT
1
TABLE ACCESS
BY INDEX ROWID
EMP
2
INDEX
RANGE SCAN
DEPT_EMP
1
Now, it might be tempting to rewrite this query to
replace the subquery with a join:
select /*+
first_rows */
ename,
hiredate
from
emp,
dept
where
emp.deptno = dept.deptno
and
hiredate < sysdate-365
and
dname = 'ACCOUNTING'
;
Can you see the problem with this query?
Remember, the department name is not unique. Here is the execution
plan:
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
5
NESTED LOOPS
1
TABLE ACCESS
FULL
DEPT
1
TABLE ACCESS
BY INDEX ROWID
EMP
2
INDEX
RANGE SCAN
DEPT_EMP
1
Here we see that the execution plan has
changed, but we have a real problem here, because this query might not
always return the same result as the subquery. The reason is that the
dname column is not unique.
Why is the unique key required to transform a
subquery into a join? The simple reason is that without the uniqueness
guarantee, it is possible for the transformed query to produce a
different result set. This is because when uniqueness is not
guaranteed, multiple rows may be joined to the row in the surrounding
query, thus producing a Cartesian product effect.
Next, let’s examine the rare case where a
correlated subquery uses the IN clause.
Correlated Subqueries with the IN Clause
This type of query form is redundant because
the IN clause is doing exactly the same thing as the correlation in
the subquery. To illustrate, consider the following query:
select /*+
first_rows */
ename,
hiredate
from
emp
where
deptno IN
(
select deptno from dept
where emp.deptno = dept.deptno
);
Here we see that this query contains redundant
Boolean operators. The where deptno IN clause performs the same
check as the where emp.deptno = dept.deptno clause.
Non-correlated Subqueries with the EXISTS Clause
As I already noted, it is never appropriate to
specify a non-correlated subquery with the EXISTS clause. This is
because a Cartesian product results from the execution.
For example, the following query might be used
to attempt to count those employees who have bad credit.
select
ename
from
emp e
where EXISTS
(select
null
from
bad_credit b
)
;
However, since each employee may have many
bad_credit rows, and we do not specify a join condition, Oracle
will perform a Cartesian join.
Next, let’s look at performing correlated
subqueries with the EXISTS clause.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|