 |
|
Speed of Correlated
Subqueries with the IN Clause
Oracle Tips by
Burleson
|
This is the output for a correlated subquery
using the IN clause. Here you see the counts for the query, and the
three timings for the query with the set timing on SQL*Plus
command. Note that the database was stopped and re-started between
timings.
COUNT(*)
----------
7000
Elapsed: 00:00:02.82
Elapsed: 00:00:04.72
Elapsed: 00:00:04.83
Here you see roughly twice the execution time
as with the non-correlated subquery. As I have discussed, the
correlated subquery is executed once for each row that is returned
from the parent query. Hence, if we return 14,000 rows from the parent
query, the inner query would be executed one for each row. As you see
from the execution plan, a full-table scan is invoked for
outer_table. As each row is fetched, the key column from
outer_table is used in an index-only merge scan to return the
counts from our query.
Timing Subqueries with the EXISTS Clause
The EXISTS clause can sometimes be used instead
of the IN clause for subqueries, but there are some important
differences in the behavior of the queries. When using EXISTS with a
subquery, the Boolean operator in the parent statement becomes true if
any rows are returned by the subquery, and this may not give
the desired result. Consequently, the following query would fail to
count the 7,000 matching rows:
select
count(*)
from
outer_table
where
exists
(select key from inner_table)
;
Speed of Non-correlated Subqueries with the EXISTS Clause
Here are the output timings for the query as a
non-correlated subquery using the EXISTS clause. Note that we get the
wrong answer for our count!
COUNT(*)
----------
14000
Elapsed: 00:00:00.12
Elapsed: 00:00:00.11
Elapsed: 00:00:00.13
Here we note that using the EXISTS clause in a
non-correlated subquery gives us the wrong number of rows. In this
non-correlated subquery, using the EXISTS clause results in returning
the number of rows from the parent table (outer_table in this
case), and the subquery seems to be disregarded.
Speed of Correlated Subqueries with the EXISTS Clause
But what about using the EXISTS clause as a
part of a correlated subquery? Since the correlated subquery is
executed once for each row in the parent table, we would expect that
this approach would work in identifying our 7,000 rows:
select
count(*)
from
outer_table
where
EXISTS
(select key from inner_table where outer_table.key =
inner_table.key)
;
This is the execution timing for our correlated
subquery with the EXISTS clause:
COUNT(*)
----------
7000
Elapsed: 00:00:03.73
Elapsed: 00:00:03.36
Elapsed: 00:00:03.36
Basic Characteristics of Subquery Execution
The immediate conclusion about subqueries is
that each SQL subquery must be evaluated on an individual basis, with
an emphasis on the number of expected rows that will be returned from
both the inner table and the outer table. Let’s review the conclusions
about subqueries:
When using a correlated subquery, the execution
plans are often identical for subqueries that use the IN clause or the
EXISTS clause.
-
The EXISTS clause is always inappropriate for a
non-correlated subquery.
-
When the outer query returns a relatively small
number of rows, the correlated subquery will perform faster than the
non-correlated subquery.
-
When the subquery returns more than one row,
the query cannot be automatically transformed into a join.
-
With a small number of rows in the inner query,
a non-correlated subquery will run faster than a correlated
subquery.
Now that we have reviewed the basics of
subqueries, let’s take a closer look and examine the execution plans
for each type of subquery.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|