Oracle Real Application Clusters

 < Day Day Up > 


With respect to parallel execution, all the features and functions described in the sections above also apply to RAC. Parallel execution in a RAC environment behaves exactly the same way as a parallel execution process on a single instance, with one difference. This difference, which is advantageous to parallel execution, is that portions of the statement can be executed on multiple nodes that are participating as a member in the cluster. The server subdivides the statement into smaller operations that run against a common database residing on a shared disk.

Several CPUs on multiple nodes can each scan part of the table in parallel and aggregate the results. With the cache-fusion-based architecture in RAC, Oracle uses a function shipping strategy to perform work on remote nodes. Oracle's parallel architecture uses function shipping when the target data is located on the remote node. This function shipping technology eliminates unneeded internode data transfers across the cluster interconnect. Participation of multiple nodes in the parallel execution task depends on the volume of work being handled by each of the servers. If the load on the system is minimal, the work will be spread across as many servers as required by the query definition. If the system is fully loaded, then a few local servers will be used to minimize any additional overhead required to coordinate local processes and to avoid any interinstance overhead.

Figure 6.4 illustrates the parallel execution process on a RAC environment. The QC process is on Node 1 because Node 1 is the query initiator. Notice that there are several PQS processes on both Node 1 and Node 2, however there is none on Node 3. There could be several reasons for this behavior. Either Node 3 could be busy and low on resources or the parallel option has not been enabled on the node. However, Node 2 is only partially busy and hence some of the PQS processes are on Node 2.

Figure 6.4: Parallel proces sing in a three- node RAC environment.

The number of nodes that can participate in the parallel execution depends on the degree of parallelism (DOP) assigned to each table or index.

6.6.1 Degree of parallelism

The parallel execution coordinator may enlist two or more of the instance's parallel execution servers to process an SQL statement. The number of parallel execution servers associated with a single operation is known as the degree of parallelism (DOP).

DOP applies directly to intra-operation parallelism. The total number of parallel execution servers for the statement can be twice the specified DOP. No more than two sets of parallel execution servers can execute simultaneously. Each set of parallel execution servers may process multiple operations. Only two sets of parallel execution servers need to be active to guarantee optimal inter-operation parallelism.

DOP can be applied to the query in several different ways:

The DOP is calculated as

DEGREE * INSTANCES

This means that a query on an object with DEGREE set to 1 and INSTANCES set to 5 will run in parallel with five slaves per set. Pre-8i, the query would run serially.

Parallel operations could also be performed by grouping or pooling instances together to form a parallel operation cluster. For example if a 12-node cluster is utilized for batch and OLTP processing, certain nodes could be isolated for OLTP purposes and others could be isolated for batch-type operations.

In this case a certain number of instances could be grouped together to form an instance group using the parameter INSTANCE_GROUP. In the example above, if instances RAC6, RAC7, RAC8 are used for batch operations they could be grouped together using the parameter, such as:

During execution, the group name is assigned to the process using the parameter PARALLEL_INSTANCE_GROUP = batch. When the batch pro cess is executed from the instance(s) in which this parameter is defined, it uses the group of instances to execute the operation in parallel.

6.6.2 Parallel DDL

All DDL operations, with the exception of tables with object or LOB columns (datatypes), can use the parallel execution process. Examples of DDL statements are:

The PARALLEL option can be set at table or index level by using the CREATE or ALTER operation as shown below:

CREATE TABLE PRODUCT PARALLEL (degree x instances y)

where x is the number of slaves to use per instance per slave set and y is the number of instances to be used to parallelize across. Each slave set gets instance x slaves per instance. If the internode parallel query (IPQ) is in use, there will be xy slaves per slave set. If the query being run is complex enough to require producer and consumer slave sets, the actual requirements are 2x, or, in the IPQ case 2xy.

The attributes can be checked by querying the data dictionary tables DBA_TABLES and DBA_INDEXES:

COL DEGREE FORMAT A10 COL INSTANCES FORMAT A10 SELECT TABLE_NAME, DEGREE, INSTANCES FROM DBA_TABLES WHERE TRIM(DEGREE) ! = '1'; TABLE_NAME DEGREE INSTANCES ----------------- ---------- ---------- PRODUCT 2 1 SELECT INDEX_NAME, DEGREE, INSTANCES FROM DBA_TABLES WHERE TRIM(DEGREE) ! = '1';

Note 

Setting the degree of parallelism for a table or index will force the use of cost-based optimizer (CBO) even in the complete absence of statistics.

6.6.3 Parallel DML

Parallel DML is most often used to speed up DML operations against large objects (e.g., overnight batch updates). The extra statement required to enable PDML is documented (in Oracle documentation) as being too much of an overhead for OLTP transactions. This feature was not intended to be used for small transactions, as the costs involved in calculating how data is to be partitioned amongst parallel execution slaves could actually decrease performance.

All DML (INSERT, UPDATE, and DELETE) operations could use the parallel execution process. Parallel DML uses parallel execution mechanisms to speed up or scale up large DML operations against large database tables and indexes. INSERT INTO...SELECT FROM statements that operate against multiple tables as part of single DML operation can also use parallel operations.

Behavior of locks placed on the tables during DML operations is different between serial and parallel DML operations; therefore, it is required that parallel DML be explicitly enabled by issuing the following statement:

ALTER SESSION ENABLE PARALLEL DML;

Once this statement has been issued, any DML activity against a table with the parallel attribute will occur in parallel if no parallel DML (PDML) restrictions are violated.

This statement must be the first statement of a transaction. If any transaction has not been rolled back or committed before enabling PDML, the following error will be raised:

ORA-12841: Cannot alter the session parallel DML state within a transaction

Once a PDML statement has been executed, a rollback or commit must be issued before the table can be modified again or queried within the same session. If the transaction is not committed or rolled back, subsequent DML or queries against the table will report:

ORA-12838: cannot read/modify an object after modifying it in parallel

Each slave process is considered a separate transaction. Therefore, they cannot see any uncommitted data from other transactions, including the other slave processes currently working on the same parent transaction. This is the reason that enabling PDML must be the first statement in a transaction. The coordinator processes (created when connecting to the database) will also create a separate transaction. These processes will not be able to see the data changed by the slave processes until it is committed, hence the ORA-12838 error is raised when trying to view or modify the table.


 < Day Day Up > 

Категории