 |
|
Using Oracle Global
Temporary Tables
Oracle Tips by Burleson
|
The capability of temporary tables was a new
feature of Oracle8i. A GLOBAL TEMPORARY table is one whose definition
is visible to all sessions, but whose data is visible only to the user
who is currently using the definition. In addition, a temporary table
can have session-specific or transaction-specific data, depending on
how the ON COMMIT clause is used in the table’s definition. The
temporary table doesn’t go away when the session or sessions are
finished with it; however, the data in the table is removed. GLOBAL
TEMPORARY tables are created in the user’s TEMPORARY tablespace by
default. Listing 4.4 is an example of the creation of both a preserved
and deleted temporary table.
LISTING 4.4 Example of the creation of a
temporary table.
SQL>
CREATE GLOBAL TEMPORARY TABLE test6 (
2 starttestdate DATE,
3 endtestdate DATE,
4 results NUMBER)
5* ON COMMIT PRESERVE ROWS
SQL> /
Table
created.
SQL> desc
test6
Name
Null? Type
------------------------------ ------- ------
STARTTESTDATE
DATE
ENDTESTDATE
DATE
RESULTS
NUMBER
SQL> CREATE
GLOBAL TEMPORARY TABLE test7 (
2 starttestdate DATE,
3 endtestdate DATE,
4 results NUMBER)
5 ON COMMIT DELETE ROWS
6 /
Table
created.
SQL> desc
test7
Name
Null? Type
------------------------------ ------- ------
STARTTESTDATE
DATE
ENDTESTDATE
DATE
RESULTS
NUMBER
SQL> insert
into test6 values (sysdate, sysdate+1, 100);
1 row
created.
SQL> commit;
Commit
complete.
SQL> insert
into test7 values (sysdate, sysdate+1, 100);
1 row
created.
SQL> select
* from test7;
STARTTEST
ENDTESTDA RESULTS
----------------------------------------------
29-MAR-99 30-MAR-99 100
SQL> commit;
Commit complete.
SQL> select
* from test6;
STARTTEST
ENDTESTDA RESULTS
----------------------------------------------
29-MAR-99 30-MAR-99 100
SQL> select
* from test7;
no rows
selected
SQL>
Pay particular attention here to these specific items in
Listing 4.4: First, notice that with the PRESERVE option, the data is
kept after a commit; with the DELETE option, the data is removed from
the table when a COMMIT occurs. Even with the GLOBAL option set and
select permission granted to public on the temporary table, I couldn’t
see the data in the table from another session; however, I could
perform a describe of the table and insert my own values into it,
which then the owner of the temporary table couldn’t see.
Creation of an Index-Only Table
Index-only tables have been around since
Oracle8. If neither the HASH or INDEX ORGANIZED options are used with
the CREATE TABLE command, then a table is created as a standard hash
table. If the INDEX ORGANIZED option is specified, the table is
created as a B-tree-organized table identical to a standard Oracle
index created on similar columns. Index-organized tables do not have
rowids.
Index-organized tables have the option of
allowing overflow storage of values that exceed optimal index row
size, as well as allowing compression to be used to reduce storage
requirements. Overflow parameters can include columns and the percent
threshold value to begin overflow. An index-organized table must have
a primary key. Index-organized tables are best suited for use with
queries based on primary key values. Index-organized tables can be
partitioned in Oracle8i and in Oracle as long as they do not contain
LOB or nested table types. The PCTHRESHOLD value specifies the amount
of space reserved in an index block for row data. If the row data
length exceeds this value, then the row(s) are stored in the area
specified by the OVERFLOW clause. If no overflow clause is specified,
rows that are too long are rejected. The INCLUDING COLUMN clause
allows you to specify at which column to break the record if an
overflow occurs. For example:
CREATE TABLE
test8
( doc_code CHAR(5),
doc_type INTEGER,
doc_desc VARCHAR(512),
CONSTRAINT pk_docindex PRIMARY KEY (doc_code,doc_type)
)
ORGANIZATION INDEX TABLESPACE data_tbs1
PCTTHRESHOLD 20 INCLUDING doc_type
OVERFLOW TABLESPACE data_tbs2
/
In this example, the IOT test8 has three
columns: the first two make up the key value; the third column in
test8 is a description column containing variable-length text. The
PCTHRESHOLD is set at 20, and if the threshold is reached, the
overflow goes into an overflow storage in the data_tbs2 tablespace
with any values of doc_desc that won’t fit in the index block. Note
that you will obtain the best performance from IOTs when the complete
value is stored in the IOT structure; otherwise, you end up with an
index and table lookup as you would with a standard index-table setup.
Using External Tables in Oracle
Oracle has added the capability to use
external tables as sources for data in an Oracle database. The data
will still reside in the external tables but will be accessible to
users from inside the database. The general format for accomplishing
this connection between an external data table and the database is:
CREATE TABLE
[schema.]table_name
[relational_properties]
ORGANIZATION EXTERNAL external_table_clause;
where relational_properties are:
( column datatype constraint list)
[table constraints]
[logging clauses]
and external_table_clause is:
([TYPE external_driver_type] external_data_properties) [REJECT LIMIT
n|UNLIMITED]
external_data_properties are:
DEFAULT DIRECTORY directory
[ACCESS PARAMETERS (opaque_format_spec)| USING CLOB subquery]
LOCATION ([directory:]’location specifier’)
TYPE. TYPE access_driver-type.
Indicates the access driver of the external table. The access driver
is the API that interprets the external data for the database. If you
do not specify TYPE, Oracle uses the default access driver,
ORACLE_LOADER.
DEFAULT DIRECTORY. Lets you specify one
or more default directory objects corresponding to directories on the
filesystem where the external data sources may reside. Default
directories can also be used by the access driver to store auxiliary
files such as error logs. Multiple default directories are permitted
to facilitate load balancing on multiple disk drives.
ACCESS PARAMETERS. (Optional) Lets you
assign values to the parameters of the specific access driver for this
external table:
opaque_format_spec. Lets you list the
parameters and their values.
USING CLOB subquery. Lets you derive
the parameters and their values through a subquery. The subquery
cannot contain any set operators or an ORDER BY clause. It must return
one row containing a single item of datatype CLOB. Whether you specify
the parameters in an opaque_format_spec or derive them using a
subquery, Oracle does not interpret anything in this clause. It is up
to the access driver to interpret this information in the context of
the external data.
LOCATION. Lets you specify one external
locator for each external data source. Usually the location_identifier
is a file, but it need not be. Oracle does not interpret this clause.
It is up to the access driver to interpret this information in the
context of the external data.
REJECT LIMIT. Lets you specify how many
conversion errors can occur during a query of the external data before
an Oracle error is returned and the query is aborted. The default
value is 0.
Use the external table clause to create an
external table, which is a read-only table whose metadata is stored in
the database but whose data in stored outside database. Among other
capabilities, external tables let you query data without first loading
it into the database.
Because external tables have no data in the
database, you define them with a small subset of the clauses normally
available when creating tables.
* Within the relational_properties clause, you
can specify only column datatype and column_constraint. Further, the
only constraints valid for an external table are NULL, NOT NULL, and
CHECK constraints.
* Within the table_properties clause, you can
specify only the parallel_clause and the enable-disable clause:
* The parallel_clause lets you parallelize
subsequent queries on the external data.
* The enable-disable clause lets you either
enable or disable a NULL, NOT NULL, or CHECK constraint. You can
specify only ENABLE or DISABLE, and CONSTRAINT constraint_name. No
other parts of this clause are permitted.
As for list partitioning, there are
restrictions associated with using external tables:
* No other clauses are permitted in the same
CREATE TABLE statement if you specify the external_table_clause.
* An external table cannot be a temporary
table.
* An external table cannot be indexed
* An external table cannot be analyzed
Creation of an External Table: An Example
I have a listing of all of the SQL scripts I
use to manage Oracle databases.
'-rw-r--r--';1;oracle;dba;626;Apr
17 18:25;accept.sql;
'-rw-r--r--';1;oracle;dba;11103;Apr 17 18:25;access.sql;
'-rw-r--r--';1;oracle;dba;3295;Apr 18 01:19;act_size8.sql;
'-rw-r--r--';1;oracle;dba;918;Apr 17 18:25;active_cursors.sql;
'-rw-r--r--';1;oracle;dba;63;Aug 21 12:35;afiedt.buf;
'-rw-r--r--';1;oracle;dba;273;Apr 17 18:25;alter_resource.sql;
'-rw-r--r--';1;oracle;dba;5265;Apr 17 18:25;alter_views.sql;
'-rw-r--r--';1;oracle;dba;401;Apr 17 18:25;anal_tab.sql;
'-rw-r--r--';1;oracle;dba;374;Apr 17 18:25;analyze_all.sql;
'-rw-r--r--';1;oracle;dba;244;Apr 17 18:25;analz_sch.sql;
'-rw-r--r--';1;oracle;dba;989;Apr 17 19:25;auto_chn.sql;
'-rw-r--r--';1;oracle;dba;1861;Apr 17 18:25;auto_defrag.sql;
'-rw-r--r--';1;oracle;dba;167;Apr 17 18:25;awt.sql;
'-rw-r--r--';1;oracle;dba;481;Apr 18 01:20;backup.sql;
'-rw-r--r--';1;oracle;dba;405;Apr 18 01:20;block_usage.sql;
'-rw-r--r--';1;oracle;dba;960;Apr 18 01:21;blockers.sql;
'-rw-r--r--';1;oracle;dba;940;Apr 17 18:25;blockers2.sql;
'-rw-r--r--';1;oracle;dba;1002;Apr 18 01:21;bound2.sql;
'-rw-r--r--';1;oracle;dba;1299;Apr 18 01:22;bound_ob.sql;
'-rw-r--r--';1;oracle;dba;1742;Apr 17 18:25;brown.sql;
To match this external file, I created a
CREATE TABLE command that matches up the columns in the internal
representation with the external file using standard SQLLOADER
controlfile syntax:
CREATE
DIRECTORY sql_dir as ‘/home/oracle/sql_scripts’;
CREATE TABLE sql_scripts (permissions VARCHAR2(20),
filetype NUMBER(3),owner VARCHAR2(20),
group_name VARCHAR2(20), size_in_bytes NUMBER,
date_edited DATE , script_name VARCHAR2(64))
ORGANIZATION EXTERNAL
(TYPE ORACLE_LOADER
DEFAULT DIRECTORY sql_dir
ACCESS PARAMETERS
(FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY "'"
(permissions, filetype,owner,group_name,size_in_bytes,
date_edited DATE(19) "Mon dd 2001 hh24:mi",
script_name))
LOCATION ('file.dat'))
/
Here is what it actually looked like during
the creation:
SQL> get
external_table
1 CREATE TABLE sql_scripts (permissions VARCHAR2(20),
2 filetype NUMBER(3),owner VARCHAR2(20),
3 group_name varchar2(20), size_in_bytes number,
4 date_edited date , script_name VARCHAR2(64))
5 ORGANIZATION EXTERNAL
6 (TYPE ORACLE_LOADER
7 DEFAULT DIRECTORY sql_dir
8 ACCESS PARAMETERS
9 (FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY "'"
10 (permissions, filetype,owner,group_name,size_in_bytes,
11 date_edited DATE(19) "Mon dd 2001 hh24:mi",
12 script_name))
13* LOCATION ('file.dat'))
SQL> /
Table
created.
Elapsed:
00:00:00.37
SQL> DESC
sql_scripts
Name
Null? Type
-------------------------------------- -------- ---------------------
PERMISSIONS
VARCHAR2(20)
FILETYPE
NUMBER(3)
OWNER VARCHAR2(20)
GROUP_NAME
VARCHAR2(20)
SIZE_IN_BYTES
NUMBER
DATE_EDITED
DATE
SCRIPT_NAME
VARCHAR2(64)
SQL> SET
AUTOTRACE ON EXPLAIN
SQL> SET
TIMING ON
SQL> SELECT
COUNT(*) FROM sql_scripts;
COUNT(*)
----------
441
Elapsed:
00:00:00.38
Execution
Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 EXTERNAL TABLE ACCESS
(FULL) OF 'SQL_SCRIPTS'
I then used multiple Linux copy (cp) commands
to make three additional copies of the file.dat file, and then used
the Linux concatenation (cat) command to combine them with the
original to make the file four times larger. Then I renamed the larger
file using the Linux move (mv) command to the name of the original
file. Without changing a thing inside Oracle, I was able to reselect
from the new external table:
SQL> /
COUNT(*)
----------
1764
Elapsed:
00:00:00.37
Execution
Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 EXTERNAL TABLE ACCESS
(FULL) OF 'SQL_SCRIPTS'
To get an idea of the performance of these external tables I
kept quadrupling the size of the external table until I reached
1,806,336 rows. The chart in Figure 4.1 shows the results from the
full table scans of the external table compared against a full table
scan of an identical internal table created from a SELECT * FROM
sql_tables command of the largest external table. The actual data is
shown in Table 4.1.
You must remember that external tables cannot
be analyzed nor indexed, so the performance will always be for a full
table scan. In addition, if you have a syntax error in your SQLLOADER
command file section, it will not show itself until an attempt is made
to access that row. In my initial specification, I had an error in the
date conversion; though I was able to select columns without the
dates, and do row counts and value counts, as soon as I did anything
that attempted to access the date_edited column, I would get an OCI
error. This indicates that you should always verify that you can
access all rows before turning over an external table for general use.
Also note that, by default, unless you specify no log be generated, a
log identical to a SQLPLOADER log will be generated and inserted into
each time a user accesses the external table.
Note: If you have a multiple file
external file, you can use parallel access to speed performance. There
is an excellent example of this type of external table access in the
Oracle SQL manual, in the CREATE TABLE section.
External tables can also be used to export
data out of Oracle. The following example uses the EMP and DEPT tables
in the SCOTT schema to show how all employees working in the MARKETING
and ENGINEERING departments can be exported in parallel using the
default degree of parallelism.
#
# Export
employees working in the departments 'MARKETING' and
#
'ENGINEERING'
#
CREATE TABLE
emp_ext
(empno NUMBER, ename VARCHAR2(100), ...)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_INTERNAL
DEFAULT DIRECTORY private2
LOCATION (export_dir:'/emp_data/emp1.exp',
export_dir:'/emp_data/emp2.exp')
...)
PARALLEL
AS SELECT e.*, d.dname
FROM emp e, dept d
WHERE e.deptno = d.deptno
AND d.dname IN ('MARKETING', 'ENGINEERING');
At this point, you will have an external table
that contains all of the records corresponding to the MARKETING and
ENGINEERING departments and that can be used by other programs if
desired. Once changes are made to the data, it can be reimported to
the external table specification and then moved easily from there back
into its parent tables. Notice that the TYPE becomes ORACLE_INTERNAL,
rather than ORACLE_LOADER ,and that no conversion specification is
required.
#
# Re-import
the employees that are working in the 'ENGINEERING' department
#
CREATE TABLE
emp_import
PARALLEL
AS SELECT *
FROM emp_ext
WHERE dname = 'ENGINEERING';
This example illustrates how the external
table feature can help to export (and import) a selective set of
records.
Sizing an Oracle Nonclustered Table
The procedures in this section describe how to
estimate the total number of data blocks necessary to hold data
inserted to a nonclustered table. No allowance is made for changes to
PCTFREE or PCTUSED due to insert, delete, or update activity. On the
Wiley Web site I have provided Oracle relational table and
index-sizing spreadsheets.
TIP: This sizing example is a
best-case scenario, when users insert rows without performing deletes
or updates.
Typically, the space required to store a set
of rows that undergo updates, deletes, and inserts will exceed this
calculated value. The actual space required for complex workloads is
best determined by analysis of an existing table and then scaled by
the projected number of future rows in the production table. In
general, increasing amounts of concurrent activity on the same data
block result in additional overhead (for transaction records), so it
is important to take into account such activity when scaling empirical
results.
Calculating space required by nonclustered
tables is a five-step process:
1.
Calculate the total block header size.
2.
Calculate the available data space per data block.
3.
Calculate the space used per row.
4.
Calculate the total number of rows that will fit in a data block.
5.
With the rows/block data, calculate the total number of data blocks
and convert to kilo- or megabytes.
Let’s
take a more detailed look at the steps.
Step 1: Calculate the Total Block Header
Size
The space required by the data block header is
the result of the following formula:
Space after
headers (hsize) =
DB_BLOCK_SIZE - KCBH - UB4 - KTBBH - ((INITRANS - 1) * KTBIT) -
KDBH
where:
DB_BLOCK_ SIZE. The database blocksize
with which the database was created. It can be viewed in the
V$PARAMETER view by selecting:
SELECT value FROM v$parameter WHERE name = 'db_block_size';
KCBH, UB4, KTBBH, KTBIT, KDBH.
Constants whose sizes you can obtain by selecting from entries in the
V$TYPE_SIZE view.
KCBH. The block common header; on NT,
with a 4-KB blocksize, this is 20.
UB4. Short for “either byte 4”;
on NT with a 4-KB blocksize, this is 4.
KTBBH. The transaction fixed-header
length; on NT with a 4-KB blocksize, this is 48.
KTBIT. The transaction variable header;
on NT with a 4-KB blocksize, this is 24.
KDBH. The data header; on NT with a
4-KB blocksize, this is 14.
INITRANS. The initial number of
transaction entries allocated to the table.
So, for an NT 4.0 platform with a 4-KB
blocksize and an INITRANS value of 5, the calculation would be:
DB_BLOCK_SIZE - KCBH - UB4 - KTBBH - ((INITRANS - 1) * KTBIT) - KDBH
hsize = 4192 - 20 - 4 - 48 - ((5 - 1)*24) - 14 =
4192 - 182 = 4010 bytes
Step 2: Calculate the Available Data Space per
Data Block
The space reserved in each data block for
data, as specified by PCTFREE, is calculated as follows:
available
data space (availspace) =
CEIL(hsize * (1 - PCTFREE/100 )) - KDBT
where:
CEIL. The round fractional result to the next
highest integer.
PCTFREE. The percentage of space reserved for
updates in the table.
KDBT. A constant corresponding to the Table
Directory Entry size, which you can obtain by selecting the entry from
the V$TYPE_SIZE view. For an NT 4.0 platform with a 4-KB blocksize,
this is 4 KB.
TIP: If you are unable to locate
the value of KDBT, use the value of UB4 instead.
So, to carry on our example, assuming a
PCTFREE of 20 for our table:
CEIL(hsize *
(1 - PCTFREE/100 )) - KDBT
CEIL(4010* (1 - 20/100)) - 4 = CEIL((4010*.8) - 4 ) = CEIL(3208 - 4) =
3204
Step 3: Calculate the Space Used per Row
Calculating the amount of space used per row
is a multistep task. First, you must calculate the column size,
including byte lengths:
Column size
including byte length =
column size + (1, if column size < 250, else 3)
I suggest using estimated averages for all
variable-length fields, such as numeric, VARCHAR2, and raw.
Remember that NUMBER datatypes are stored at a two-to-one ratio in the
database (e.g., a NUMBER(30) takes up 15 bytes of storage if each
place is filled). The maximum for a NUMBER is 21 bytes. The size for a
DATE is 7 bytes. Rowid takes 10 bytes for the extended and 6 bytes for
the restricted type of rowid. CHAR always takes its full specified
length; VARCHAR2, RAW, and other variable-length fields will use only
the space they actually take up.
TIP: You can also determine column size
empirically, by selecting avg(vsize(colname)) for each column in the
table.
For example, I have a table TEST with a single
VARCHAR2(50) column that has eight rows of various lengths. The return
from the select SELECT AVG(VSIZE(TEST1)) FROM TEST; is:
AVG(VSIZE(TEST1))
------------------------
29
The table also has a number column, TEST2:
AVG(VSIZE(TEST2))
----------------------
7
Then, calculate the row size:
Rowsize =
row header (3 * UB1) + sum of column sizes including length bytes
UB1 is 'UNSIGNED BYTE 1' and is 1 on NT 4.0
with a 4 KB block size.
Rowsize =
(3*1) + (8 + 30) = 41
Of course, if you have a sample table, the
quickest way to get average row size is to analyze it and then select
average row size from USER_TABLES:
SQL> analyze
table test1 compute statistics;
Table analyzed.
SQL> select avg_row_len from user_tables where table_name='TEST1';
AVG_ROW_LEN|
-----------
41
Finally, you can calculate the space used per
row:
Space used
per row (rowspace) =
MIN(UB1 * 3 + UB4 + SB2, rowsize) + SB2
where:
UB1, UB4, SB2. Constants whose size can be
obtained by selecting entries from the V$TYPE_SIZE view.
UB1. “Unsigned byte 1,” set to 1 for NT
4.0 with a 4-KB blocksize.
UB4. “Unsigned byte 4,” set to 4 for NT
4.0 with a 4-KB blocksize.
SB2. “Signed byte 2,” set to 2 for NT
4.0 with a 4-KB blocksize.
This becomes:
MIN((1*3) +
4 + 2, 41) + 2, or, 41 + 2 = 43.
To determine MIN, take the lesser of either
UBI *3 + UB4 + SB2 or the calculated rowsize value.
If the space per row exceeds the available
space per data block, but is less than the available space per data
block without any space reserved for updates (for example, available
space with PCTFREE=0), each row will be stored in its own block.
When the space per row exceeds the available
space per data block without any space reserved for updates, rows
inserted into the table will be chained into two or more pieces;
hence, the storage overhead will be higher.
Step 4: Calculate the Total Number of Rows
That Will Fit in a Data Block
You can calculate the total number of rows
that will fit into a data block using the following equation:
Number of
rows in block =
FLOOR(availspace / rowspace)
where FLOOR is the fractional result rounded
to the next-lowest integer.
For our example, this becomes:
FLOOR(3204/43) = 74
Step 5: Calculate the Total Blocks Required
The next step is to calculate the total blocks
required, which involves finding the ratio of total rows divided by
the maximum number of rows able to be stored per block. Once we have
the number of total blocks, it is a simple matter to multiply this
times the blocksize and make the appropriate conversions to get to
megabytes or gigabytes as required.
Total blocks
=
(total table rows) / (rows per block)
Total kilobytes = CEIL((total blocks * block size) / 1024)
Total megabytes =
CEIL((total blocks * block size) / 1048576)
— (1024^2)
For our example, we estimate we will have
42,000 rows in this table over the next year. So, the calculation
becomes:
((42000/74)*4192)/1024 = 2324k or 3m (rounding up)
Of course, you can also use the table-sizing
spreadsheet I have provided on the Wiley Web site. A screenshot of
this spreadsheet is shown in Figure 4.2.
Figure 4.2 Screenshot of the table-sizing
spreadsheet.
See
Code Depot for Full Scripts
 |
This is an excerpt
from Mike Ault, bestselling author of "Oracle
10g Grid and Real Application Clusters".
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts. |
|