Oracle Real Application Clusters
| < Day Day Up > |
|
SQL*Loader is a utility provided by Oracle that helps read data from a flat file and load data based on certain criteria specified in a control file. Oracle provides two methods to load data using SQL*Loader:
-
Conventional path method (default)
-
Direct path (supports parallel load option)
6.7.1 Conventional path
Conventional path load uses the INSERT statement and a bind array buffer to load data into database tables. All Oracle tools and applications use this method.
When SQL*Loader performs a conventional path load, it competes equally with all other processes for buffer resources. This can slow the load significantly. Extra overhead is added as SQL commands are generated, passed to Oracle, and executed. Oracle looks for partially filled blocks and attempts to fill them on each insert.
6.7.2 Direct path
A direct path load eliminates much of the Oracle database overhead by formatting Oracle data blocks and writing the data blocks directly to the database files. However, while providing us with the benefit of loading data directly and avoiding overhead, this method has a disadvantage in constraint validation. A direct path method disables all database triggers and referential integrity constraints during the load and they must be manually enabled after loads have been completed. Indexes are retained.
Instead of filling a bind array buffer and passing it to Oracle with an SQL INSERT command, a direct path load parses the input data according to the description given in a control file, converts the data for each input field to its corresponding Oracle column data type, and builds a column array structure (an array of <length, data> pairs).
Figure 6.5 illustrates the SQL*Loader architecture, showing the various files that are involved when SQL*Loader loads data into the database. SQL*Loader then uses the column array structure to format Oracle data blocks and to build index keys. The newly formatted database blocks are then written directly to the database (multiple blocks per I/O request using asynchronous writes if the host platform supports asynchronous I/O), bypassing most RDBMS processing. Internally, multiple buffers are used for the formatted blocks. While one buffer is being filled, one or more buffers are being written (if asynchronous I/O is available on the host platform).
The direct path option of SQL*Loader supports loading of data in parallel streams that is based on the definitions in the control file. Data can be loaded into a table, or multiple tables, by reading one or more data files simultaneously in multiple streams. Parallel loads take advantage of the high-speed load option by dramatically reducing the elapsed time needed to perform the load operation. Apart from bypassing all the validation layers of Oracle, direct path loads insert the rows above the high water mark (HWM) of the data file.
For example, to load three files simultaneously into the database the DIRECT =TRUE and PARALLEL =TRUE options are enabled:
SQLLOAD USERID =MVALLATH/MJGN CONTROL =LOAD1.CTL DIRECT =TRUE PARALLEL =TRUE SQLLOAD USERID =MVALLATH/MJGN CONTROL =LOAD2.CTL DIRECT =TRUE PARALLEL =TRUE SQLLOAD USERID =MVALLATH/MJGN CONTROL =LOAD3.CTL DIRECT =TRUE PARALLEL =TRUE
Restrictions on parallel direct path loads
The following restrictions are enforced on parallel direct path loads:
-
When loading data into a partitioned table, the load can maintain neither local nor global indexes.
-
Referential integrity and check constraints must be disabled.
-
Triggers must be disabled.
-
Rows can only be appended to the already loaded table. When loading data into the same table using the parallel load option, APPEND is the only mode of data loading allowed. REPLACE, TRUNCATE,and INSERT cannot be used (this is because the individual loads are not coordinated). If a table must truncate before a parallel load, it must be done manually.
| < Day Day Up > |
|