 |
|
The Sort Merge Join
Oracle Tips by
Burleson
|
The sort
merge join is among the simplest and oldest implementations of
table joins. In a sort merge join, indexes are not used, and both
tables are accessed via a full-table scan. Following the full-table
scan, the result sets from each scan are independently sorted, and the
sorted result sets are then merged to arrive at the query result set
(Figure 16-9).
Figure 9: A sort merge
join
In general, a sort
merge join is invoked only under the following circumstances:
-
When no useful indexes exist to join the
table columns
-
When the query returns the majority of data
blocks from both tables
-
When the CBO determines that a full-table
scan is faster than an index access
As a general rule, the sort merge table access
method is useful only in rare cases, such as large batch reports that
return all of the rows in both tables. Otherwise, a nested loop or
hash join is almost always more efficient.
Here is a rule-based query where no indexes
exist:
select /*+
first_rows */
ename,
dname
from
dept,
emp
where
emp.deptno = dept.deptno
and
emp.deptno = 10
;
Now, we see the standard merge join with a
full-table scan against both tables.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
MERGE JOIN
1
SORT
JOIN
1
TABLE ACCESS
FULL
EMP
1
SORT
JOIN
2
TABLE ACCESS
FULL
DEPT
1
The Cartesian Merge Join
In a special case of the sort merge join, we
see another dialect of the merge join that can use indexes to avoid
the full-table scan. To illustrate, let’s return to our earlier query
and use the first_rows hint with no indexes on the deptno
column.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
3
MERGE JOIN
CARTESIAN
1
TABLE ACCESS
FULL DEPT
1
SORT
JOIN
2
TABLE ACCESS
FULL EMP
1
Here we see the sort merge join performing a
full-table scan on both tables, and this is a requirement of the
Cartesian merge join because it is normally invoked in cases where no
join columns exist for the tables.
In the following example, we re-explain the
query after adding indexes on deptno in both tables. We also
have a small number of rows in both tables. Because both tables have a
small number of rows, the CBO detects the low cardinality and invokes
a sort merge join; it also uses both deptno indexes instead of
a full-table scan.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME
POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
3
MERGE JOIN
CARTESIAN
1
TABLE ACCESS
BY INDEX ROWID
DEPT
1
INDEX
RANGE SCAN
DEPT_DEPTNO
1
SORT
JOIN
2
TABLE ACCESS
BY INDEX ROWID
EMP
1
INDEX
RANGE SCAN
EMP_DEPTNO
1
Next, let’s look at how we can force a sort
merge join with Oracle hints.
The use_merge Hint
The use_merge hint forces a sort merge
operation. The sort merge operation is often used in conjunction with
parallel query because a sort merge join always performs full-table
scans against the tables. Sort merge joins are generally best for
queries that produce very large result sets such as daily reports and
table detail summary queries, or tables that do not possess indexes on
the join keys. Here we see a simple query that has been formed to
perform a sort merge using parallel query against both tables:
select /*+
use_merge(e,b) parallel(e, 4) parallel(b, 4) */
e.ename,
hiredate,
b.comm
from
emp e,
bonus b
where
e.ename = b.ename
;
Here is the output of the execution plan for this query.
Note the full-table scans and the sort merge operation. Even though a
join equality was specified in the where clause (where
e.ename = b.ename), the use of the parallel hint told the
Oracle optimizer to bypass indexes and invoke parallel full-table
scans against both tables.
OPERATION
----------------------------------------------------------------------
OPTIONS
OBJECT_NAME POSITION
------------------------------ ---------------------------- ----------
SELECT STATEMENT
5
MERGE JOIN
1
PARALLEL_TO_SERIAL
SORT
JOIN
1
PARALLEL_COMBINED_WITH_PARENT
TABLE ACCESS
FULL
EMP
1
PARALLEL_TO_PARALLEL
SORT
JOIN
2
PARALLEL_COMBINED_WITH_PARENT
TABLE ACCESS
FULL
BONUS
1
PARALLEL_TO_PARALLEL
It is important to note that a sort merge join does not
use indexes to join the tables. In most cases, index access is faster,
but a sort merge join may be appropriate for a large table join
without a where clause, or in queries that no do have indexes
to join the tables.
TIP: Remember, there are limitations on the
speed of Oracle parallel query. The benefit from a parallel query is
heavily dependent on the number of CPUs on the database server and on
the distribution of the target data files across multiple disks.
Now let’s examine a special type of join that
is used in Oracle data warehouse queries.
This is an excerpt from "Oracle High-Performance SQL Tuning" by Donald
K. Burleson, published by Oracle Press.
|