Call (800) 766-1884 for Oracle support & training
Free Oracle Tips


Oracle Consulting Support
Oracle Upgrades
Use New Oracle Features
Oracle Replication Support
Oracle Training
Remote Oracle DBA
System Documentation
Oracle Tips
Oracle Performance
 

Free Oracle Tips


 

HTML Text

BC Oracle tuning

Oracle training

Oracle support

Remote Oracle

 

Oracle Export Data Subsets

Oracle Tips by Burleson Consulting

Advanced Oracle Utilities: The Definitive Reference by Rampant TechPress is written by top Oracle database experts (Bert Scalzo, Donald Burleson, and Steve Callan).  The following is an excerpt from the book.

This is probably the most powerful and useful aspect of the data export process, and yet it remains highly underutilized. For instance, if one wants to extract data from a table by using a filter upon the rows being returned. That is easily accomplished via a normal SELECT command’s WHERE clause placed in the query parameter passed to the export process. Then one could easily export only those customers who live in Texas as follows:

 

C:\> expdp bert/bert directory=data_pump_dir dumpfile=just_texas.dmp schemas=movies query=movies.customer:\"where state='TX'\"

 

That seems easy enough – but there is a small catch. The QUERY clause is applied to all the tables in the export set, so all the tables better have the columns referenced by that WHERE clause. A common example would be a schema table design where each table contains a last modified date column. So if the DBA wanted to unload just records in that schema which had been modified within the past three months, here is the data pump export command for that:

 

C:\>expdp bert/bert directory=data_pump_dir dumpfile=last_mod_date.dmp schemas=movies query=\"where last_mod_date is not null and last_mod_date > SYSDATE-90\"

 

Yet as easy and powerful as this method is, there is another method that sometimes can be exactly what one is looking for – the subset by random sample method. If one wanted to export 10% of one’s production data for use in development or test environments, then it applies the sample percentage against each object exported.

 

C:\> expdp bert/bert directory=data_pump_dir dumpfile=sample.dmp schemas=movies sample=10

 

However, there is one major drawback to the sample method:  it does not export referentially correct subsets of data. That is because it merely applies a simple algorithm, namely that the percentage represents the probability that a data block of rows will be included in the export’s sampling of the data. This is plainly applied at the table level across all of its data blocks. The sample method does not adhere to any referential integrity constraints or foreign keys defined in the data dictionary. So if an effort to export a 10% sample of the entire schema is made, it will generally end  up with messages like those shown below.

 

Processing object type SCHEMA_EXPORT/TABLE/CONSTRAINT/REF_CONSTRAINT

ORA-39083: Object type REF_CONSTRAINT failed to create with error:

ORA-02298: cannot validate (BERT.MOVIETITLE_FK) - parent keys not found

Failing sql is:

ALTER TABLE "BERT"."MOVIETITLE" ADD CONSTRAINT "MOVIETITLE_FK" FOREIGN KEY ("CATEGORYID") REFERENCES "BERT"."MOVIECATEGORY" ("CATE

GORYID") ENABLE

 

However, there are ways to get around this. One way could be to create a SQL*Plus script to generate a parameter file with a series of query parameters that would sample the data and retain the foreign key relationships. But that would constitute a two-step process: run the script to create parameter files and then run the data pump export with no easy way via the database to look at the intermediate results to verify their accuracy before attempting the actual data load. So instead, the preference is to execute the extract_data_subset.sql SQL*Plus script, shown below, to create the subset of the data in a second schema. Then the data can be readily examined for accuracy, and that schema can finally be exported once it is known to be correct.

 

    extract_data_subset.sql script

 

set linesize 200

set serveroutput on size 100000

 

create or replace package subsetdata

as

  procedure xgo (xsource varchar2, xtarget varchar2, xpercent integer);

end;

/

show error

create or replace package body subsetdata

as

  type tnames is table of varchar2(32);

  done_arr tnames := tnames();

  done_cnt integer := 0;

 

  procedure xsample (xsource varchar2, xtarget varchar2, current_table varchar2, xpercent integer)

  is

    s1   varchar2(256) := 'create table ' || xtarget || '.' || current_table || '

as

select * from ' || xsource || '.' ||current_table;

    s2   varchar2(256) := 'where rownum <= (select ceil(' || to_char(xpercent/100) || ' * count(*))

from ' || xsource || '.' || current_table || ');';

    s3   varchar2(256) := 'create table ' || xtarget || '.' || current_table || ' as

select T0.* from   ' || xsource || '.' || current_table || '  T0,';

    s4   varchar2(256) := 'where';

    cnt1 integer       := 0;

    cnt2 integer       := 0;

    i    integer       := 0;

    j    integer       := 0;

    es3  varchar2(1)   := '';

    es4  varchar2(4)   := '';

  begin

    i := 0;

    es3 := ',';

    select count(*)

    into cnt1

    from dba_constraints fk,

         dba_constraints pk

    where fk.constraint_type = 'R'

      and fk.owner = xsource

      and fk.R_owner = xsource

      and fk.table_name = current_table

      and pk.constraint_type in ('P','U')

      and pk.owner = xsource

      and pk.table_name != current_table

      and fk.r_constraint_name = pk.constraint_name;

    if (cnt1 = 0) then

      dbms_output.put_line(s1);

      dbms_output.put_line(s2);

    else

      for c1 in (select pk.table_name,

                        fk.constraint_name fk_name,

                        pk.constraint_name pk_name

                   from dba_constraints fk,

                        dba_constraints pk

                   where fk.constraint_type = 'R'

                     and fk.owner = xsource

                     and fk.R_owner = xsource

                     and fk.table_name = current_table

                     and pk.constraint_type in ('P','U')

                     and pk.owner = xsource

                     and pk.table_name != current_table

                     and fk.r_constraint_name = pk.constraint_name

                ) loop

          i := i + 1;

          if (i = cnt1) then

            es3 := '';

          end if;

          s3 := s3 || '

 ' || xtarget || '.' || c1.table_name || '  T' || to_char(i) || es3;

          j := 0;

          es4 := ' and';

          select count(*)

          into cnt2

          from dba_cons_columns fk,

               dba_cons_columns pk

          where fk.constraint_name = c1.fk_name

            and fk.owner = xsource

            and fk.table_name = current_table

            and pk.constraint_name = c1.pk_name

            and pk.owner = xsource

            and pk.table_name != current_table

            and fk.position = pk.position;

          for c2 in (select fk.column_name fk_col,

                            pk.column_name pk_col

                     from dba_cons_columns fk,

                          dba_cons_columns pk

                     where fk.constraint_name = c1.fk_name

                       and fk.owner = xsource

                       and fk.table_name = current_table

                       and pk.constraint_name = c1.pk_name

                       and pk.owner = xsource

                       and pk.table_name != current_table

                       and fk.position = pk.position

                       order by fk.position

                    ) loop

            j := j + 1;

            if (i = cnt1) and (j = cnt2) then

              es4 := ';';

            end if;

            s4 := s4 || '

 ' || 'T0.' || c2.fk_col || ' = T' || to_char(i) || '.' || c2.pk_col || es4;

          end loop;

        end loop;

      dbms_output.put_line(s3);

      dbms_output.put_line(s4);

    end if;

    done_arr.extend(1);

    done_cnt := done_cnt+1;

    done_arr(done_cnt) := current_table;

  end;

 

  procedure xprocess (xsource varchar2, xtarget varchar2, current_table varchar2, xpercent integer)

  is

    i    integer := 1;

    flg  integer := 1;

    cnt1 integer := 0;

    cnt2 integer := 0;

  begin

    xsample (xsource, xtarget, current_table, xpercent);

    for c1 in (select fk.table_name

               from dba_constraints fk,

                    dba_constraints pk

               where fk.constraint_type = 'R'

                 and fk.owner = xsource

                 and fk.R_owner = xsource

                 and fk.table_name != current_table

                 and pk.constraint_type in ('P','U')

                 and pk.owner = xsource

                 and pk.table_name = current_table

                 and fk.r_constraint_name = pk.constraint_name

              ) loop

      select count(*)

      into cnt1

      from dba_constraints fk,

           dba_constraints pk

      where fk.constraint_type = 'R'

        and fk.owner = xsource

        and fk.R_owner = xsource

        and fk.table_name = c1.table_name

        and pk.constraint_type in ('P','U')

        and pk.owner = xsource

        and pk.table_name != current_table

        and fk.r_constraint_name = pk.constraint_name;

      if (cnt1 > 0) then

        cnt2 := 0;

        flg  := 0;

        for c2 in (select pk.table_name

                   from dba_constraints fk,

                        dba_constraints pk

                   where fk.constraint_type = 'R'

                     and fk.owner = xsource

                     and fk.R_owner = xsource

                     and fk.table_name = c1.table_name

                     and pk.constraint_type in ('P','U')

                     and pk.owner = xsource

                     and pk.table_name != current_table

                     and fk.r_constraint_name = pk.constraint_name

                  ) loop

          i := 1;

          while (i <= done_cnt) loop

            if (c2.table_name = done_arr(i)) then

              cnt2 := cnt2 + 1;

            end if;

            i := i + 1;

          end loop;

        end loop;

        if (cnt1 = cnt2) then

          flg := 1;

        end if;

      end if;

      if (flg = 1) then

        xprocess (xsource, xtarget, c1.table_name, xpercent);

      end if;

    end loop;

  end;

 

  procedure xgo (xsource varchar2, xtarget varchar2, xpercent integer)

  is

  begin

    for c1 in (select table_name

               from dba_tables tab

               where tab.owner = xsource

                 and NOT EXISTS (select 1

                                 from dba_constraints fk

                                 where fk.constraint_type = 'R'

                                   and fk.owner = xsource

                                   and fk.R_owner = xsource

                                   and fk.table_name = tab.table_name

                                )

              ) loop

      xprocess (xsource, xtarget, c1.table_name, xpercent);

    end loop;

  end;

end;

/

show error

 

prompt

###

###  subsetdata.xgo(SOURCE_SCHEMA,TARGET_SCHEMA,PERCENTAGE)

###

exec subsetdata.xgo('MOVIES','BERT',10)

 

 

Oracle Consulting

  
 

Oracle performance tuning software 
 
 
 
 

Oracle performance tuning book

 

 
 
 
Oracle performance Tuning 10g reference poster
 
 
 
Oracle training in Linux commands
 
Oracle training Excel
 
Oracle training & performance tuning books
 
   

Copyright © 1996 -  2011 by Burleson Enterprises. All rights reserved.

Oracle® is the registered trademark of Oracle Corporation.