 |
|
Automatic SQL
Transformation of Subqueries
Oracle Tips by
Burleson
|
Oracle has always been transforming SQL into
alternative SQL syntax. The simplest example is the conversion of the
nested select to a join. However, there are some types of poor SQL
syntax that the Oracle SQL optimizer can't transform into a join, and
there are also cases where Oracle doesn't transform the SQL as it
should. Oracle will transform several types of subqueries into
standard joins. These include:
However, you must always remember that there
are many other factors dictating when the SQL optimizer performs an
automatic query transformation of a subquery to a join. Foremost, you
generally need to have a unique index on both table join columns. In
sum, it is never a good idea to trust Oracle to rewrite malformed SQL
statements. You should always rewrite subqueries whenever possible.
Let’s take a look at correlated and
non-correlated subqueries and understand the issues surrounding the
use of the IN and EXISTS clauses.
Tuning Subqueries with the IN and EXISTS Clauses
There are many cases where we need to compare
the values in a outer table with another table using the EXISTS or IN
clause. In some cases, the IN and EXISTS clauses can be used
interchangeably.
Non-correlated Subqueries using the IN Clause
To begin, what does it mean to perform a nested
loop join instead of a non-correlated subquery? The nested loop join
extracts one row from one table and then extracts rows from the second
table where the join columns evaluate to true. Regarding the
non-correlated subquery, we know that the subquery is completely
executed before any comparison is made with the outer query.
Let’s illustrate the query rewrite
functionality with a simple example. In the query that follows, Oracle
must return the complete result set for the subquery of the
bad_credit table.
Select /*+
rule */
ename
from
emp
where
empno IN
(select
empno
from
bad_credit
where
bad_credit_date > sysdate-365
)
;
Because the Oracle SQL optimizer will transform
this query, we will simulate the untransformed behavior by removing
the empno indexes from both tables.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ----------------------------
----------
SELECT STATEMENT
5
MERGE JOIN
1
VIEW
VW_NSO_1
1
SORT
UNIQUE
1
TABLE ACCESS
FULL BAD_CREDIT
1
SORT
JOIN
2
TABLE ACCESS
FULL
EMP
1
When the VIEW execution method exists, the
operation represents a nested select. Since no unique key exists on
the empno column in the emp table, Oracle performs the
SORT UNIQUE operation and then joins the result set to the
bad_credit rows. If we had the empno indexes in place for
these tables, look how the execution plan has changed. Note that the
automatic SQL transformation is only appropriate when the columns in
the select list of the nested select form a unique key, and the
comparison operator is the IN operator.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ----------------------------
----------
SELECT STATEMENT
NESTED LOOPS
1
TABLE ACCESS
FULL
BAD_CREDIT
1
TABLE ACCESS
BY INDEX ROWID
EMP
2
INDEX
RANGE SCAN
EMP_EMPNO
1
Why is the unique key required to transform to
a join? The reason is that without a uniqueness guarantee (via the
index), it is possible for the transformed query join to produce a
different result set. Why? Because when uniqueness isn't guaranteed,
multiple rows might be joined to the row in the surrounding query,
thus producing a Cartesian product effect. That is why we add the
select distinct clause to eliminate duplicate rows.
But after the transformation, Oracle now has
the choice of whether to drive the query with the emp table or
the bad_credit table. The query transformation is only
appropriate when the columns in the select list of the noncorrelated
subquery make up a unique key, and the comparison operator is the IN
operator. For example, the preceding query returns all employees who
have had a bad credit rating in the past year.
Here is the transformed query. Note the
standard join and the use of select distinct to remove any
duplicate rows.
Select
distinct /*+ rule */
ename
from
emp,
bad_credit
where
emp.empno = bad_credit.empno
and
bad_credit_date > sysdate-365
;
Here you see the execution plan for this query
with a unique index on empno in both tables. We also use the
rule hint to ensure that the index is used.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ----------------------------
----------
SELECT STATEMENT
6
SORT
UNIQUE
1
HASH JOIN
1
TABLE ACCESS
FULL
BAD_CREDIT
1
TABLE ACCESS
FULL
EMP
2
Oracle will always make the transformation to a
join when the subquery is non-correlated and the query uses the IN
operator. We also see a great benefit in transforming this type of
subquery into a join.
However, you should never completely rely on
Oracle to transform and optimize the subquery. As we have noted in our
example, the Oracle transformation used a NESTED LOOPS join, while our
rewritten query uses a faster HASH JOIN. Also, Oracle does not always
determine the proper driving table for a transformed query, and you
can get even faster performance by using the ordered hint to
manually specify the driving table for the query.
TIP: Oracle will automatically transform only
non-correlated subqueries that use the IN clause. However, it is
always a good idea to rewrite the query manually to take advantage of
the use_hash and ordered hints.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|