High Availability Scenarios With IBM Tivoli Workload Scheduler And IBM Tivoli Framework



2.3 Software configurations

In this section we cover the different ways to implement IBM Tivoli Workload Scheduler in a cluster and also look at some of the currently available software configurations built into IBM Tivoli Workload Scheduler.

2.3.1 Configurations for implementing IBM Tivoli Workload Scheduler in a cluster

Here we describe the different configurations of IBM Tivoli Workload Scheduler workstations, how they are affected in a clustered environment, and why each configuration would be put into a cluster. We will also cover the different types of Extended Agents and how they work in a cluster.

Master Domain Manager

The Master Domain Manager is the most critical of all the IBM Tivoli Workload Scheduler workstation configurations. We strongly recommend configuring it in a cluster, because it manages and controls the scheduling database. From this database it generates and distributes the 24-hour daily scheduling plan, called the Symphony file. It also controls, coordinates, and keeps track of all scheduling dependencies throughout the entire IBM Tivoli Workload Scheduler network.

Keep the following considerations in mind when setting up a Master Domain Manager in a cluster:

- The IBM Tivoli Workload Scheduler database
- The IBM Tivoli Workload Scheduler components file
- The IBM Tivoli Workload Scheduler console

Let's examine these considerations in more detail.

IBM Tivoli Workload Scheduler database

The IBM Tivoli Workload Scheduler database is held in the same file system as the IBM Tivoli Workload Scheduler installation directory. Therefore, provided that it is not mounted from, or linked to, a separate file system, the database will follow the IBM Tivoli Workload Scheduler installation.

If the version of IBM Tivoli Workload Scheduler is prior to Version 8.2, you also have to consider the TWShome/../unison/ directory, because part of the database (workstation and NT user information) is held there, along with the working Security file.

The TWShome/../unison/ directory may not be part of the same file system as the TWShome directory, so in that case it must be added to the cluster package. Because the database is a sequential index link database, there is no requirement to start a database engine before IBM Tivoli Workload Scheduler can read it.

IBM Tivoli Workload Scheduler components file

All versions of IBM Tivoli Workload Scheduler prior to Version 8.2 require a components file. This file contains the location of both the maestro and netman installations. On Windows it is installed in the directory c:\win32app\TWS\Unison\netman, and on UNIX in /usr/unison/; it must be accessible from both sides of the cluster.
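
As an illustration, the components file simply lists the installed components with their versions and installation paths. The entries below are a hypothetical sketch only (the exact field layout may differ by release; check the components file on your own system for the authoritative format):

    MAESTRO  8.1  /opt/tws/maestro
    NETMAN   8.1  /opt/tws/maestro

In a cluster, the key point is that this file must be readable, with the same content, from both nodes.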

IBM Tivoli Workload Scheduler console

The IBM Tivoli Workload Scheduler console (called the Job Scheduling Console) connects to the IBM Tivoli Workload Scheduler engine through the IBM Tivoli Management Framework (hereafter referred to simply as the Framework).

The Framework authenticates the logon user, and communicates to the IBM Tivoli Workload Scheduler engine through two Framework modules (Job Scheduling Services and Job Scheduling Connector). Therefore, you need to consider both the IP address of the Framework and the location of the IBM Tivoli Workload Scheduler engine code.

Domain Manager

The Domain Manager is the second most critical workstation that needs to be protected in an HA cluster, because it controls, coordinates, and keeps track of all scheduling dependencies between the workstations defined in the domain it manages (which may be hundreds or even a thousand workstations).

Keep in mind the following considerations when setting up a Domain Manager in a cluster: the starting of all IBM Tivoli Workload Scheduler processes and services, the coordination of all messages to and from the IBM Tivoli Workload Scheduler network, and the linking of all workstations in its domain.

Fault Tolerant Agent

A Fault Tolerant Agent may be put in a cluster when a critical application needs to run in an HA environment; the Fault Tolerant Agent that schedules and controls all the batch work for that application then needs to be in the same cluster.

Keep in mind the following consideration when setting up a Fault Tolerant Agent in a cluster: the starting of all IBM Tivoli Workload Scheduler processes and services should be taken into account.

Extended Agents

An Extended Agent (xa or x-agent) serves as an interface to an external, non-IBM Tivoli Workload Scheduler system or application. It is defined as an IBM Tivoli Workload Scheduler workstation with an access method and a host. The access method communicates with the external system or application to launch and monitor jobs and to test file dependencies. The host is another IBM Tivoli Workload Scheduler workstation (any type except another x-agent) that resolves dependencies and issues job launch requests via the method.
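
To make this concrete, the following is a sketch of a composer workstation definition for an Extended Agent. The workstation, node, and host names and the access method name are hypothetical, and keyword details can vary between versions, so treat this as an outline rather than a definitive definition:

    cpuname SAMPLE_XA
      description "Extended Agent hosted by a clustered FTA"
      os OTHER
      node app1.example.com
      for maestro
        host CLUSTER_FTA
        access "samplemethod"
        type x-agent
    end

Here, host names the IBM Tivoli Workload Scheduler workstation that launches the method, and access names the method executable in that host's methods directory.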

In this section, we consider the implications of implementing the currently available Extended Agents in an HA cluster. Every Extended Agent is installed partly in the application itself and partly on an IBM Tivoli Workload Scheduler workstation (which can be a Master Domain Manager, a Domain Manager, or a Fault Tolerant Agent), so we also need to consider the requirements of the type of workstation on which the Extended Agent is installed.

We will cover each type of Extended Agent in turn. The types currently supported are SAP R/3, Oracle e-Business Suite, PeopleSoft, z/OS, and local and remote UNIX access. For each Extended Agent, we describe how the access method works in a cluster.

SAP R/3 access method

When you install and configure the SAP Extended Agent and then create a workstation definition for the SAP instance you wish to communicate with, there will be an R3batch method in the methods directory.

This is a C program that communicates with the remote R/3 system. It determines where to run the job by reading the r3batch.opts file and matching the workstation name against the first field of each entry. r3batch then reads all the parameters on the matched workstation line and uses them to communicate with the R/3 system.

The parameter that we are interested in is the second field of the r3batch.opts file: the R/3 application server. This is an IP address or host name. For the Extended Agent to operate correctly, this system must be reachable from wherever IBM Tivoli Workload Scheduler is running. (This applies equally to Microsoft and UNIX clusters.)
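
For example, an r3batch.opts entry for the hypothetical workstation SAP_XA might begin as follows (only the first two fields are shown; the remaining R/3 connection parameters are omitted here, and the exact layout is described in the SAP access method documentation):

    SAP_XA  r3app1.example.com  ...

The point for clustering is the second field: the R/3 application server named there must be reachable from whichever cluster node is currently running the hosting workstation.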

Oracle e-Business Suite access method

The Oracle e-Business Suite Extended Agent is installed and configured on the same system as the Oracle Applications server. When setting this up in a cluster, you must first configure the Fault Tolerant Agent and the Extended Agent to be in the same part of the cluster.

When the Oracle Applications x-agent is started, the IBM Tivoli Workload Scheduler host executes the access method mcmagent. Using the x-agent's workstation name as a key, mcmagent looks up the corresponding entry in the mcmoptions file to determine which instance of Oracle Applications it will connect to. The Oracle Applications x-agent can then launch jobs on that instance of Oracle Applications and monitor the jobs through completion, writing job progress and status information to the job's standard list file.

PeopleSoft access method

The PeopleSoft Extended Agent is installed and configured on the same system as the PeopleSoft client. It also requires an IBM Tivoli Workload Scheduler Fault Tolerant Agent to host the PeopleSoft Extended Agent, which is also installed and configured on the same system as the PeopleSoft client.

When setting this configuration up in a cluster, you must first configure the Fault Tolerant Agent and Extended Agent to be in the same part of the cluster as the PeopleSoft Client.

To launch a PeopleSoft job, IBM Tivoli Workload Scheduler executes the psagent method, passing it information about the job. An options file provides the method with path, executable and other information about the PeopleSoft process scheduler and application server used to launch the job. The Extended Agent can then access the PeopleSoft process request table and make an entry in the table to launch the job. Job progress and status information are written to the job's standard list file.

z/OS access method

The IBM Tivoli Workload Scheduler z/OS access method provides three separate methods (JES, OPC, and CA7), depending on what you want to communicate with on the z/OS system; all three work in the same way. The Extended Agent communicates with the z/OS gateway over TCP/IP, using the HOST parameter in the workstation definition to locate the gateway.

When configuring a z/OS Extended Agent in a cluster, be aware that this Extended Agent is hosted by a Fault Tolerant Agent; the considerations for a Fault Tolerant Agent are described in 2.3.1, "Configurations for implementing IBM Tivoli Workload Scheduler in a cluster" on page 46.

The parameter that we are interested in is HOST in the workstation definition. This is an IP address or host name. For the Extended Agent to operate correctly, this system must be reachable from wherever IBM Tivoli Workload Scheduler is running. (This applies equally to Microsoft and UNIX clusters.)

Figure 2-7 on page 52 shows the architecture of the z/OS access method.

Figure 2-7: z/OS access method

Local UNIX access method

When IBM Tivoli Workload Scheduler sends a job to a local UNIX Extended Agent, the host invokes the access method, unixlocl, to execute the job. The method starts by executing the standard configuration script (jobmanrc) on the host workstation. If the job's logon user is permitted to use a local configuration script and the script exists as $HOME/.jobmanrc, the local configuration script is also executed. The job itself is then executed by either the standard or the local configuration script. If neither configuration script exists, the method starts the job directly.

For the local UNIX Extended Agent to function properly in a cluster, the parameter that we are interested in is host, in the workstation definition. This is an IP address or host name; provided that the system is reachable from wherever IBM Tivoli Workload Scheduler is running, the Extended Agent will continue to operate correctly.

Remote UNIX access method

Note: In this section we explain how this access method works in a cluster; this explanation is not intended as instructions for setting up and configuring the Extended Agent.

When the IBM Tivoli Workload Scheduler sends a job to a remote UNIX Extended Agent, the access method, unixrsh, creates a /tmp/maestro directory on the non-IBM Tivoli Workload Scheduler computer. It then transfers a wrapper script to the directory and executes it. The wrapper then executes the scheduled job. The wrapper is created only once, unless it is deleted, moved, or outdated.

For the remote UNIX Extended Agent to function properly in a cluster, the parameter that we are interested in is host, in the workstation definition. This is an IP address or host name; provided that the system is reachable from wherever IBM Tivoli Workload Scheduler is running, the Extended Agent will continue to operate correctly.

One instance of IBM Tivoli Workload Scheduler

In this section, we discuss the circumstances under which you might install one instance of IBM Tivoli Workload Scheduler in a high availability cluster.

The first consideration is where the product is to be installed: it must be in the shared file system that moves between the two servers in the cluster.

The second consideration is how the IBM Tivoli Workload Scheduler instance is addressed: it must use the IP address that is associated with the cluster.

Why to install only one copy of IBM Tivoli Workload Scheduler

In this configuration there may be three reasons for installing only one copy of IBM Tivoli Workload Scheduler in this cluster:

When to install only one copy of IBM Tivoli Workload Scheduler

You would install the workstation in this cluster in order to provide high availability to an application or to the IBM Tivoli Workload Scheduler network by installing the Master Domain Manager in the cluster.

Where to install only one copy of IBM Tivoli Workload Scheduler

To take advantage of the cluster, install this instance of IBM Tivoli Workload Scheduler on the shared disk system that moves between the two sides of the cluster.

What to install

Depending on why you are installing one instance of IBM Tivoli Workload Scheduler, you may install a Master Domain Manager, Domain Manager or Fault Tolerant Agent in the cluster.

Two instances of IBM Tivoli Workload Scheduler

In this section, we discuss the circumstances under which you might install two instances of IBM Tivoli Workload Scheduler.

The first consideration is where the product is to be installed: each IBM Tivoli Workload Scheduler instance must have a different installation directory, and each directory must be in a shared file system that moves between the two servers in the cluster. Each instance will also have its own installation user.

The second consideration is how each IBM Tivoli Workload Scheduler instance is addressed: it must use the IP address that is associated with the cluster. Each IBM Tivoli Workload Scheduler instance must also have its own port number. If the version of IBM Tivoli Workload Scheduler is older than 8.2, it needs access to the components file from both sides of the cluster in order to run. If the version is 8.2 or higher, the components file is needed only when upgrading IBM Tivoli Workload Scheduler.
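
As a sketch of how the two instances are kept apart, each instance's localopts file can specify its own netman port; the port numbers and user names below are hypothetical:

    # localopts for the first instance (user tws1)
    nm port = 31111

    # localopts for the second instance (user tws2)
    nm port = 31112

The workstation definition for each instance would then use the cluster IP address together with the matching port (the tcpaddr value).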

Why to install two instances of IBM Tivoli Workload Scheduler

In this configuration there may be two reasons for installing two copies of IBM Tivoli Workload Scheduler in this cluster:

When to install two instances of IBM Tivoli Workload Scheduler

You would install both instances of IBM Tivoli Workload Scheduler in this cluster in order to give high availability to an application, or to the IBM Tivoli Workload Scheduler network by installing the Master Domain Manager or a Domain Manager in this cluster.

Where to install two instances of IBM Tivoli Workload Scheduler

To take advantage of the cluster, you would install the two instances of IBM Tivoli Workload Scheduler on the shared disk system that moves between the two sides of the cluster. You would set up the cluster software in such a way that the first instance of IBM Tivoli Workload Scheduler would have a preference of running on server A and the second instance would have a preference of running on server B.

What to install

Depending on why you are installing two instances of IBM Tivoli Workload Scheduler, you may install a combination of a Master Domain Manager, Domain Manager or Fault Tolerant Agent in the cluster.

Three instances of IBM Tivoli Workload Scheduler

In this section, we discuss the circumstances under which you might install three instances of IBM Tivoli Workload Scheduler.

The first consideration is where the product is to be installed. When two instances of IBM Tivoli Workload Scheduler run on the same system, each instance must be installed in a different directory, and one of the instances must be installed in the shared file system that moves between the two servers in the cluster. Each instance will have its own installation user.

The second consideration is how each IBM Tivoli Workload Scheduler instance is addressed. In this case, one instance uses the IP address associated with the cluster, and the other two use the IP address of their respective systems in the cluster. Each IBM Tivoli Workload Scheduler instance must have its own port number. If the version of IBM Tivoli Workload Scheduler is older than 8.2, it needs access to the components file from both sides of the cluster in order to run. If the version is 8.2 or higher, the components file is needed only when upgrading IBM Tivoli Workload Scheduler.

Why to install three instances of IBM Tivoli Workload Scheduler

In this configuration, only one instance is installed in high availability mode; the other two are installed on the local disks shown in Figure 2-8 on page 56. Why would you install IBM Tivoli Workload Scheduler in this configuration? Typically because an application that cannot be configured in a cluster runs on both sides of the cluster, so you need to install an IBM Tivoli Workload Scheduler workstation alongside the application on each node. You may also wish to install the Master Domain Manager in the cluster, or a third application may be cluster-aware and able to move between nodes.

Figure 2-8: Three-instance configuration

When to install three instances of IBM Tivoli Workload Scheduler

You would install one instance of IBM Tivoli Workload Scheduler in this cluster in order to give high availability to an application, or to the IBM Tivoli Workload Scheduler network by installing the Master Domain Manager or a Domain Manager in the cluster, and one instance of IBM Tivoli Workload Scheduler on each local disk. These local instances may schedule batch work for the systems in the cluster, or for an application that runs only on the local disk subsystem.

Where to install three instances of IBM Tivoli Workload Scheduler

Install one instance of IBM Tivoli Workload Scheduler on the shared disk system that moves between the two sides of the cluster, and one instance of IBM Tivoli Workload Scheduler on the local disk allocated to each side of the cluster, as shown in Figure 2-8.

What to install

Depending on why you are installing the instances of IBM Tivoli Workload Scheduler as described above, you may install a Master Domain Manager, Domain Manager, or Fault Tolerant Agent in the cluster. You would install a Fault Tolerant Agent on each side of the cluster.

Multiple instances of IBM Tivoli Workload Scheduler

In this section, we discuss the circumstances under which you might install multiple instances of IBM Tivoli Workload Scheduler.

The first consideration is where the product is to be installed, because each IBM Tivoli Workload Scheduler instance must have a different installation directory. These installation directories must be in the shared file system that moves between the two servers in the cluster. Each instance will also have its own installation user.

The second consideration is how each IBM Tivoli Workload Scheduler instance is addressed: it must use the IP address that is associated with the cluster. Each IBM Tivoli Workload Scheduler instance must also have its own port number. If the version of IBM Tivoli Workload Scheduler is older than 8.2, it needs access to the components file from both sides of the cluster in order to run. If the version is 8.2 or higher, the components file is needed only when upgrading IBM Tivoli Workload Scheduler.

Why to install multiple instances of IBM Tivoli Workload Scheduler

In this configuration there may be many applications running in the cluster, and each application would need its own IBM Tivoli Workload Scheduler workstation. You might also want to install the Master Domain Manager, and even a Domain Manager, in the cluster to make the entire IBM Tivoli Workload Scheduler network more tolerant of failures.

When to install multiple instances of IBM Tivoli Workload Scheduler

You would install multiple instances of IBM Tivoli Workload Scheduler in this cluster to give high availability to an application and to the IBM Tivoli Workload Scheduler network by installing the Master Domain Manager or Domain Manager in this cluster.

Where to install multiple instances of IBM Tivoli Workload Scheduler

All instances of IBM Tivoli Workload Scheduler would be installed on the shared disk system that moves between the two sides of the cluster. Each instance would need its own installation directory, its own installation user, and its own port address.

What to install

Depending on why you are installing multiple instances of IBM Tivoli Workload Scheduler, you may install a combination of a Master Domain Manager, Domain Manager or Fault Tolerant Agent in the cluster.

2.3.2 Software availability within IBM Tivoli Workload Scheduler

In this section we discuss software options currently available with IBM Tivoli Workload Scheduler that will give you a level of high availability if you do not have, or do not want to use, a hardware cluster.

Backup Master Domain Manager

A Backup Master Domain Manager (BMDM) and the Master Domain Manager (MDM) are critical parts of a highly available IBM Tivoli Workload Scheduler environment. If the production Master Domain Manager fails and cannot be immediately recovered, a backup Master Domain Manager will allow production to continue.

The Backup Master Domain Manager must be identified when defining your IBM Tivoli Workload Scheduler network architecture; it must be a member of the same domain as the Master Domain Manager, and the workstation definition must have the Full Status and Resolve Dependencies modes selected.
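
A sketch of a Backup Master Domain Manager workstation definition is shown below. The workstation name, node address, port, and domain name are hypothetical; the settings that matter for the backup role are fullstatus and resolvedep, both set to on:

    cpuname BMDM1
      os UNIX
      node bmdm1.example.com
      tcpaddr 31111
      domain MASTERDM
      for maestro
        type fta
        autolink on
        fullstatus on
        resolvedep on
    end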

It may be necessary to transfer files between the Master Domain Manager and its standby. For this reason, the computers must have compatible operating systems. Do not combine UNIX with Windows NT® computers. Also, do not combine little-endian and big-endian computers.

When a Backup Master Domain Manager is correctly configured, the Master Domain Manager sends any changes and updates to the production file to the BMDM, but changes or updates made to the database are not automatically sent to the BMDM. To keep the BMDM and MDM databases synchronized, you must manually copy the TWShome\mozart and TWShome\..\unison\network directories (the unison directory applies only to versions older than 8.2) on a daily basis, after start-of-day processing. Any changes to security must be replicated to the BMDM, and configuration files such as localopts and globalopts must also be replicated to the BMDM.
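
A minimal sketch of this daily copy on UNIX follows, assuming hypothetical host names and users, a TWShome of /opt/tws/maestro, and that it is run after Jnextday completes:

    #!/bin/sh
    # Replicate the scheduling database and configuration to the BMDM (illustrative only).
    TWSHOME=/opt/tws/maestro
    BMDM=twsuser@bmdm1.example.com
    scp -r $TWSHOME/mozart            $BMDM:$TWSHOME/
    scp -r $TWSHOME/../unison/network $BMDM:$TWSHOME/../unison/   # pre-8.2 only
    scp    $TWSHOME/Security          $BMDM:$TWSHOME/             # plus localopts/globalopts as needed

Remember the operating system compatibility restrictions noted above when copying files between the MDM and the BMDM.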

The main advantages over a hardware HA solution are that this capability already exists in the IBM Tivoli Workload Scheduler product, and that the basic configuration, in which the BMDM takes over the IBM Tivoli Workload Scheduler network for a short-term loss of the MDM, is fairly easy to set up. Also, no extra hardware or software is needed for this solution.

The main disadvantages are that the IBM Tivoli Workload Scheduler database is not automatically synchronized, so keeping both databases in sync is the system administrator's responsibility. Also, for a long-term loss of the MDM, the BMDM has to generate a new production day plan, which requires an operator to submit the Jnextday job on the BMDM. Finally, any jobs or job streams that ran on the MDM will not run on the BMDM, because the workstation names are different.

Backup Domain Manager

The management of a domain can be assumed by any Fault Tolerant Agent that is a member of the same domain. The workstation definition must have the Full Status and Resolve Dependencies modes selected. When the management of a domain is passed to another workstation, all workstation members of the domain are informed of the switch, and the old Domain Manager is converted to a Fault Tolerant Agent in the domain.

The identification of domain managers is carried forward to each new day's symphony file, so that switches remain in effect until a subsequent switchmgr command is executed.

Once a new workstation has taken over the responsibility of the domain, it has the ability to resolve any dependencies for the domain it is managing, and also the ability to process any messages to or from the network.

Switch manager command

The switchmgr command is used to transfer the management of an IBM Tivoli Workload Scheduler domain to another workstation. This command can be used on the Master Domain Manager or on a Domain Manager.

To use the switchmgr command, the workstation that you would like to take over the management of a domain must be a member of the same domain. It must also have the Resolve Dependencies and Full Status options enabled to work correctly. The syntax of the command is switchmgr domain;newmgr.

The command stops a specified workstation and restarts it as the Domain Manager. All domain member workstations are informed of the switch, and the old Domain Manager is converted to a Fault Tolerant Agent in the domain.

The identification of Domain Managers is carried forward to each new day's symphony file, so that switches remain in effect until a subsequent switchmgr command is executed. However, if new day processing (the Jnextday job) is performed on the old domain manager, the domain will act as though another switchmgr command had been executed and the old Domain Manager will automatically resume domain management responsibilities.
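
For example, to pass control of the hypothetical domain MASTERDM to the backup workstation BMDM1, you could issue the command through conman:

    conman "switchmgr MASTERDM;BMDM1"

To switch back after the original Domain Manager is recovered, run the same command naming the original workstation, or simply let Jnextday run on the old Domain Manager, which, as noted above, causes it to resume domain management automatically.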

2.3.3 Load balancing software

Using load balancing software is another way of bringing a form of high availability to IBM Tivoli Workload Scheduler jobs. One way to do this is by integrating IBM Tivoli Workload Scheduler with IBM LoadLeveler, because IBM LoadLeveler detects when a system is unavailable and reschedules the work on one that is available.

IBM LoadLeveler is a job management system that allows users to optimize job execution and performance by matching job processing needs with available resources. IBM LoadLeveler schedules jobs and provides functions for submitting and processing jobs quickly and efficiently in a dynamic environment. This distributed environment consists of a pool of machines or servers, often referred to as a LoadLeveler cluster.

Jobs are allocated to machines in the cluster by the IBM LoadLeveler scheduler. The allocation of the jobs depends on the availability of resources within the cluster and on rules defined by the IBM LoadLeveler administrator. A user submits a job to IBM LoadLeveler and the scheduler attempts to find resources within the cluster to satisfy the requirements of the job.

At the same time, the objective of IBM LoadLeveler is to maximize the efficiency of the cluster. It attempts to do this by maximizing the utilization of resources, while at the same time minimizing the job turnaround time experienced by users.
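
As a brief, hedged illustration of how work is given to LoadLeveler, a job is described in a job command file and submitted with llsubmit; LoadLeveler then chooses an available machine in the cluster. The script path and class name below are hypothetical:

    # nightly_extract.cmd - LoadLeveler job command file (illustrative)
    # @ job_type   = serial
    # @ executable = /opt/batch/bin/nightly_extract.sh
    # @ output     = nightly_extract.$(jobid).out
    # @ error      = nightly_extract.$(jobid).err
    # @ class      = batch
    # @ queue

    llsubmit nightly_extract.cmd

When combined with IBM Tivoli Workload Scheduler, a scheduled job would typically wrap a submission like this, so that if one machine in the LoadLeveler cluster is unavailable the work is dispatched to another.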

2.3.4 Job recovery

In this section we explain how IBM Tivoli Workload Scheduler will treat a job if it has failed; this is covered in three scenarios.

A job abends in a normal job run

Prior to IBM Tivoli Workload Scheduler Version 8.2, if a job finished with a return code other than 0, the job was treated as abended (ABEND). If that return code was in fact correct for the job, the IBM Tivoli Workload Scheduler administrator would run a wrapper script around the job, or change .jobmanrc, to set the job status to successful (SUCC).

In IBM Tivoli Workload Scheduler Version 8.2, however, a new field in the job definition, rccondsucc, allows you to set a Boolean expression for the return code of the job. In this field you can type a Boolean expression that determines the return codes (RC) for which the job is considered successful. For example, you can define a successful job as one that terminates with a return code equal to 3, or with a return code greater than or equal to 5 and less than 10, as follows:

rccondsucc "RC=3 OR (RC>=5 AND RC<10)"
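
In context, this condition appears in the composer job definition. The sketch below uses hypothetical workstation, job, and script names:

    $JOBS
    CLUSTER1#NIGHTLY_LOAD
     scriptname "/opt/batch/bin/nightly_load.sh"
     streamlogon twsuser
     description "Return codes 3 and 5 to 9 are treated as success"
     rccondsucc "RC=3 OR (RC>=5 AND RC<10)"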

Job process is terminated

A job can be terminated in a number of ways, and in this section we look at some of the more common ones. Keep in mind, however, that it is not the responsibility of IBM Tivoli Workload Scheduler to roll back any actions that a job may have done during the time that it was executing. It is the responsibility of the person creating the script or command to allow for a rollback or recovery action.

When a job abends, IBM Tivoli Workload Scheduler can rerun the abended job, stop, or continue with the next job. You can also generate a prompt that must be answered, or launch a recovery job. The full combination of the job flow is shown in Figure 2-9 on page 61.

Figure 2-9: IBM Tivoli Workload Scheduler job flow
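
As a sketch of these recovery options, the following job definition (again with hypothetical names) prompts the operator when the job abends, runs a recovery job, and then reruns the failed job:

    $JOBS
    CLUSTER1#DB_UPDATE
     scriptname "/opt/batch/bin/db_update.sh"
     streamlogon twsuser
     recovery rerun after CLUSTER1#DB_CLEANUP
      abendprompt "DB_UPDATE failed; reply yes to run DB_CLEANUP and rerun"

CLUSTER1#DB_CLEANUP is assumed to be a separately defined recovery job.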

Here are the details of this job flow:

