SQL*Loader

From Oracle FAQ
Jump to: navigation, search

SQL*Loader is a tool used by DBAs and developers to populate Oracle tables with data from flat files. The SQL*Loader gives a lot of flexibility to selectively load certain columns but not others, or to exclude certain records entirely. SQL*Loader has some advantages over programming languages that allow embedded SQL statements, as well. SQL*Loader consist of understanding its elements. The first is the data to be loaded, which is stored in the datafile (this is basically a flat/text file, and should not be confused with Oracle Server datafiles, that make up the database). The second is the control file (this is basically a text file, acting as a directive to the Loader, it should not be confused with Oracle Servers control file, which holds database-related information). SQL*Loader accepts special parameters that can affect how the load occurs, called command-line parameters. These parameters which include the ID (username/password or commonly know as the schema) to use while loading data, the name of the data file, and the name of the control file.

SQL*Loader in action consist of several additional items. If, in the course of performing data loads, SQL*Loader encounters records it cannot load, the record is rejected and the SQL*Loader puts it in special file called bad file. Additionally, SQL*Loader gives the user options to reject data based on special criteria. These criteria are defined in the control file as a part of the when clause. Is the SQL*Loader encounters a record that fails a specified when clause, the record is placed in a special file called discard file.

Contents

[edit] Modes of operation

SQL*Loader operates in two modes:

[edit] Conventional loads

In a conventional load, SQL*Loader reads multiple data records from the input file (flat/text file) into a bind array. When the array fills, SQL*Loader passes the data to the Oracle SQL processing mechanism for insertion into the database tables. Oracle SQL processing mechanism will in-turn generates equivalent INSERT statements. All records have to pass through database buffer cache, and only DBWR writes the records to the physical datafiles. Since all data passes through SGA, in case of instance failure recovery is possible. The conventional path load is nondisruptive and work on the same principles that a normal database insert (DML) works, only much faster. Some of its characteristics are outlined below:

  • When loading data across network (client/server), it is better to user conventional load.
  • When loading data in clustered tables, only conventional load can be used.
  • Records loaded will update the associated indexes, enforce any database integrity rules defined (Primary Key, Foreign Key, Check constraints), as the records are loaded.
  • No exclusive locks are acquired when the conventional path loads are performed.

[edit] Direct loads

During direct loads, SQL*Loader reads records from the datafile, converts those records directly into Oracle data blocks, and writes them directly to the disk. Since direct path loading bypasses SGA (database buffer cache), in case of instance failure recovery is not possible. Direct loads are typically used to load large amount of data in a short time. The direct path load works much faster compared to conventional loads. Some of its characteristics of Direct Load are outlined below:

  • At the beginning of the direct load, SQL*Loader makes a call to Oracle to put a lock on the tables which are being loaded, at the end it makes another call to release the lock. It may call Oracle intermediately to get extent information. So a direct path load makes very few calls to Oracle.
  • Direct path loader checks the integrity constraints only in the end (i.e. after loading entire data).
  • The constraints are disabled by the direct load process, before starting loading. Once done the constraints are enabled.
  • If enabling of constraints fails due to data errors, then the state of the constraints are left disabled, which is undesirable. So it is always better to check the state of your constraints after direct path loads.
  • Direct load process, will not update indexes as the data is loaded, it does this operation after all the data is loaded. In fact it rebuilds the indexes associated with the table.
  • Due to reasons like: not finding more space to load data (datafile being full), or instance failure or duplicate values in primary key columns. If the load fails, then the indexes are left in direct Load State and are unusable, a DBA or developers should note this and remove culprit records and enable his constraints and rebuild his indexes.
  • Insert triggers are disabled at the beginning of direct Load State. For example you have an insert trigger which fires after each row is inserted to update the time and userid field of a table, the insert trigger will not be functional when you are loading the records. You may have to write an update trigger or stored procedures to handle these records after direct loading.
  • Any Referential integrity constraint defined on a table, is not enforced during direct loads.
  • If an column datatype is varchar2(100),direct path load will allow to store more than 100 characters in that column.

[edit] Command Line parameters

The following parameters are accepted by SQL * Loader on the command line. I.e. at the "$ "or "dos" prompt.

USERID - Oracle userid and password i.e. schema 
CONTROL - Control filename 
LOG - Log filename 
BAD - Bad filename 
DATA - Data filename 
DISCARD - Discard filename 
DISCARDS - Number of discards to terminate the load 
SKIP - Number of logical records to skip (default: 0) 
LOAD - Number of logical records to load 
ROWS - Number of rows in the conventional path bind array or between direct path saves (conventional path : 64, direct path : all) 
BINDSIZE - Size of conventional path bind array in bytes 
SILENT - Suppress messages between run 
DIRECT - Use direct path load (default :FALSE) 
PARFILE - Parameter filename 
PARALLEL - Perform parallel load (default :FALSE) 
FILE - Datafile to allocate extents 

[edit] Control files

The control file provides information to the SQL*Loader for loading the data from flat files. It provides information regarding : datafile/flat file names and format, character set used, data types of the fields, how each field is delimited, which tables and columns to load. The following parameters can be included in the control file as directives:

  • -- - Any comments to be placed
  • option - Command-line parameters to be placed in the control file as options
  • unrecoverable - Specifies whether to create redo log entries for loaded data. Unrecoverable can be used on direct loads only
  • recoverable - Specifies whether to create redo log entries for loaded data.
  • load - Load or continue_load must be specified
  • continue_load - Load or continue_load must be specified
  • data - Provided for readability
  • characterset - Specifies the character set of the data file
  • preserve blanks - Retains leading white spaces from the datafile in cases where enclosure delimiter are not present
  • begin data - Keyword denoting the beginning of data records to be loaded
  • infile [name] - Keyword to specify name(s) of input files. An asterisk (*) following this keyword indicates data records are in the control file.
  • badfile [name] - Keyword to specify the name of bad file. They are interchangable
  • discardfile [name] - Keyword to specify name of discard file. They are interchangable
  • discards [x] - Allows X discards before opening the next datafile.
  • discardmax [x] - Allows X discards before terminating the load
  • insert - Puts rows into an empty table
  • append - Appends to an existing table
  • truncate - Deletes current rows in the table, and loads new data
  • replace - Deletes current rows in the table, and loads new data
  • sorted indexes - For direct path loads, indicates that data has already been sorted in specified indexes
  • singlerow - For use when appending rows or loading a small amount of data into a large table (rowcount ration of 1:20 or less)

[edit] Data files

Datafiles, or text file, which is used to load data intoDatabase can be in two formats:

  • fixed-length fields -
Field 1 : Column 1-7 Field 2 : Column 8-15
  • variable-length fields - variable-length records are terminated by special character or enclosed by special characters
eg. |Deepak|Chebbi|

[edit] Return codes

When executing sqlldr within UNIX, the return code values have changed. In 7.x the sqlldr utility returned a 0 if successful and a 1 if not successful. If there were records that were rejected, then the sqlldr utility would return a successful return code of 0.

In Oracle 8 and above, it has four return code values. They are:

0 - successful 
1 - failed 
2 - warn 
3 - fatal 

Here are the conditions and how the return code work:

All rows loaded successfully - 0 
All/some rows discarded - 2 
All/some row rejected - 2 
Discontinued load - 2 
Command line/syntax errors - 1 
Errors Fatal to Sql*Loader - 1 
OS related errors - 3

[edit] Examples

[edit] Combined data and control file

The example below is a combined data and control file (i.e. data part of the file is present in the control file itself). The control file holds key words and data as a directive to SQL * Loader. The table under consideration is "empmast" and has fields "emp_no number(6), emp_lname varchar2(24)". The sequence of steps are listed below:

  • Create a control file, to hold the directives. You can use your favorite editor to create the file. I advice you to follow a naming standard to identify the file. In this example I have named the file "case1.ctl". The contents of the control file are:
--This control file hold data to be loaded into empmast table 
-- * is use only is the data is contained in the control file 
LOAD DATA 
INFILE * 
APPEND INTO TABLE empmast 
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY ' " ' 
(emp_no, emp_lname) 
BEGINDATA 
100, "Chebbi" 
200, Grant 
  • Invoke SQl * Loader
sqlldr / control=case1.ctl

[edit] Fixed width data, one data file - one table

The example below is a control file for fixed width data, the datafile and control fiel are separate (i.e. data part of the file is not present in the control file). The table under consideration is "empmast" and has fields "emp_no number(6), emp_lname varchar2(24)". The data file is named "xyz.dat" and the control file "case2.ctl". The sequence of steps are listed below:

  • The data file contents is as shown
100000Chebbi 
200000Grant 
300000Zinky 
  • Create a control file, to hold the directives. You can use your favorite editor to create the file.
--case2.ctl 
LOAD DATA 
INFILE 'xyz.dat' 
BADFILE 'xyz.bad' 
LOG xyz.log 
INSERT INTO TABLE empmast 
(emp_no POSITION(1:6) INTEGER, 
emp_name POSITION(7:31) CHAR) 
  • Invoke SQl * Loader
sqlload / control=case2.ctl 

[edit] Fixed width data, one data file - two tables

The example below is a control file for fixed width data, the datafile and control file are separate (i.e. data part of the file is not present in the control file). The tables under consideration is "empmast" having fields "emp_no number(6), emp_lname varchar2(24)" and empsal having fields "emp_no number(6), salary number(6)". A single datafile is being loaded into two tables. The sequence of steps are listed below:

The data file is named "xyz.dat" and the control file "case3.ctl"

  • The data file contents is listed below
100000Chebbi 
200000GrantBBBBBBBBBBBBBBBBBBB3000 
300000Zinky 4000 
  • Create a control file, to hold the directives. You can use your favorite editor to create the file.
--case3.ctl 
LOAD DATA 
INFILE 'xyz.dat' 
BADFILE 'xyz.bad' 
LOG xyz.log 
INSERT INTO TABLE empmast 
(emp_no POSITION(1:6) INTEGER, 
emp_name POSITION(7:31) CHAR) 
INTO TABLE emp_sal 
(emp_no POSITION(1:6) INTEGER, 
empsal POSITION(32:37) INTEGER) 
  • Invoke SQl * Loader
sqlload / control=case3.ctl 

[edit] Selective Load

The example illustrates the use of when clause, in this case if the first field is blank the record is not loaded. The tables under consideration is "empmast" having fields "emp_no number(6), emp_lname varchar2(24)". The data file is named "xyz.dat" and the control file "case4.ctl". The sequence of steps are listed below:

  • The data file contents is listed below
Chebbi 
2000 Grant 
3000 CarpinoBBBBBBBBBBBBBBBBB 
  • Create a control file, to hold the directives. You can use your favorite editor to create the file.
--case4.ctl 
LOAD DATA 
INFILE 'xyz.dat' 
BADFILE 'xyz.bad' 
LOG xyz.log 
DISCARDFILE 'xyz.dsc' 
INSERT INTO TABLE empmast 
WHEN emp_no != ' ' 
(emp_no POSITION(1:6) INTEGER, 
emp_name POSITION(7:31) CHAR) 
  • Invoke SQl * Loader
sqlload / control=case4.ctl 

[edit] Use Of Functions

The example illustrates the use of function initcap, in this case no matter the case of the second filed it is converted as upper case and stored in the database. The table under consideration is "empmast" having fields "emp_no number(6), emp_lname varchar2(24)". The data file is named "xyz.dat" and the control file "case5.ctl". The sequence of steps are listed below:

  • The data file contents is listed below
100000chebbi 
200000grantBBBBBBBBBBBBBBBBBBB 
300000zinky 
  • Create a control file, to hold the directives. You can use your favorite editor to create the file.
--case5.ctl 
LOAD DATA 
INFILE 'xyz.dat' 
BADFILE 'xyz.bad' 
LOG xyz.log 
INSERT INTO TABLE empmast 
(emp_no POSITION(1:6) INTEGER, 
emp_name POSITION(7:31) CHAR "initcap(:emp_name)") 
  • Invoke SQl * Loader
sqlload / control=case5.ctl 

[edit] Assigning Constants

The example illustrates the use of plugging in a constant value to a field. The table under consideration is "empmast" having fields "emp_no number(6), emp_lname varchar2(24), alive char(1)". The data file is named "xyz.dat" and the control file "case6.ctl". The sequence of steps are listed below:

  • The data file contents is listed below
100000chebbi 
200000grantBBBBBBBBBBBBBBBBBBB 
300000zinky 
  • Create a control file, to hold the directives. You can use your favorite editor to create the file.
--case6.ctl 
LOAD DATA 
INFILE 'xyz.dat' 
BADFILE 'xyz.bad' 
LOG xyz.log 
INSERT INTO TABLE empmast 
(emp_no POSITION(1:6) INTEGER, 
emp_name POSITION(7:31) CHAR, 
alive CONSTANT "Y") 
  • Invoke SQl * Loader
sqlload / control=case6.ctl 

[edit] Use Of Sequence

The example illustrates the use of plugging in a sequence number to a field. The table under consideration is "empmast" having fields "emp_no number(6), emp_lname varchar2(24), seq_no NUMBER". The data file is named "xyz.dat" and the control file "case7.ctl". The sequence of steps are listed below:

  • The data file contents is listed below
100000chebbi 
200000grantBBBBBBBBBBBBBBBBBBB 
300000zinky 
  • Create a control file, to hold the directives. You can use your favorite editor to create the file.
--case7.ctl 
LOAD DATA 
INFILE 'xyz.dat' 
BADFILE 'xyz.bad' 
LOG xyz.log 
INSERT INTO TABLE empmast 
(emp_no POSITION(1:6) INTEGER, 
emp_name POSITION(7:31) CHAR, 
seq_no SEQUENCE(MAX,1)) 
  • Invoke SQl * Loader
sqlload / control=case7.ctl 

[edit] Integer Feeds treated as date

The example illustrates the use of date .The table under consideration is "empmast" having fields "emp_no number(6), emp_lname varchar2(24), hire_date DATE". The data file is named "xyz.dat" and the control file "case8.ctl". I have often found in situations where in when a flat file comes from mainframes as feed to be loaded into Oracle database, the date format is in chracter or integer format (010100) . To loads this data here are the steps:

  • The data file contents is listed below
100000chebbiAAAAAAAAAAAAAAAAAA040699 
200000grantBBBBBBBBBBBBBBBBBBB040599 
  • Create a control file, to hold the directives. You can use your favorite editor to create the file.
--case8.ctl 
LOAD DATA 
INFILE 'xyz.dat' 
BADFILE 'xyz.bad' 
LOG xyz.log 
INSERT INTO TABLE empmast 
(emp_no POSITION(1:6) INTEGER, 
emp_name POSITION(7:31) CHAR, 
hire_date POSITION(32:37) DATE 'RRMMDD' nullif (hire_date = '000000') 
  • Invoke SQl * Loader
sqlload / control=case8.ctl 

[edit] Conditional Load

In some situation you may want to load the records depending on the value of the first field, for example say you want to load only records having first field value as "100".The table under consideration is "invoice_detail" having fields "inv_no number(6), inv_quantity number(6), line_no number(6)". The data file is named "xyz.dat" and the control file "case9.ctl".

  • The data file contents is listed below
1000002001 
2000003001 
  • Create a control file, to hold the directives. You can use your favorite editor to create the file. In the following control file records with only invoice number "1000000" are loaded.
--case9.ctl 
LOAD DATA 
INFILE 'xyz.dat' 
BADFILE 'xyz.bad' 
LOG xyz.log 
DISCARDFILE 'xyz.dsc' 
INSERT INTO TABLE invoice_detail 
WHEN inv_no = 100000 
(inv_no POSITION(1:6) INTEGER, 
inv_quantity POSITION(7:12) INTEGER, 
line_no POSITION(13:18) INTEGER) 
  • Invoke SQl * Loader
sqlload / control=case9.ctl

[edit] Also see

[edit] External links

The documentation is available at: