Node Manager

WebLogic provides a standalone Java tool called the Node Manager, which is responsible for managing the availability of all Managed Servers running on a machine. It runs as a dedicated process on a machine, either as a daemon on a Unix machine or as a service on the Windows platform. It provides a way to automatically restart Managed Servers in the case of failure, and even handles servers that are in a "failed" state. A Node Manager also lets the Administration Server remotely start, kill, and monitor Managed Server instances. A single Node Manager process should run on every machine that hosts Managed Servers. When the Node Manager boots a server, it creates a separate process for that server, just as if you had run the startManagedWebLogic script on that machine.

A Node Manager does not control the starting and stopping of the Administration Server. The machine that hosts the Administration Server doesn't need a Node Manager, unless it also hosts one or more Managed Servers.

Figure 13-5 illustrates the role of Node Managers in a domain.

Figure 13-5. Node Managers act as agents to the Administration Server

In order to control the life cycle of a Managed Server, using either the Administration Server or the weblogic.Admin tool, you must start the servers under the control of a Node Manager. For instance, if you restart a Managed Server remotely using the Administration Console, the Administration Server contacts the appropriate Node Manager to perform the task. Even though you have no explicit, direct control over the Node Managers, they act as agents for the Administration Server.

Even though the Administration Server works closely with the Node Managers running on the different machines that host the Managed Servers in the domain, a Node Manager is still outside the scope of a WebLogic domain. The same Node Manager monitors all Managed Servers on a machine, regardless of the domains to which they belong.

The Node Manager will not perform health monitoring and automatic restarts on servers that were not started using the Node Manager.

It is important to note that when Node Managers are used, each server is started up through the Node Manager running on the machine, using either the Administration Console or a JMX application, and not through your own startup scripts.

Finally, Node Managers use SSL in their communication. The Administration Server talks to the Node Managers using (short-lived) two-way SSL-protected messages, ensuring that only authorized Administration Servers can control the Node Managers. In addition, the Node Manager itself uses an SSL connection with each of the Managed Servers under its control. This connection remains alive for the entire duration that a Managed Server is up, and is used to monitor the server.

Configuring the Node Managers can be a little tricky, but once you have set them up, you can leave them humming away by themselves without any further intervention. The following sections look at how to configure a machine to use a Node Manager, how to configure a server to use a Node Manager, and where to locate the Node Manager logs in case things go wrong. The configuration is a three-part process:

  1. Configure the Node Manager on each physical machine that hosts the Managed Servers of the domain.
  2. Configure each machine in a WebLogic Domain to use a Node Manager.
  3. Assign each Managed Server to a machine, and configure the interaction between the Managed Server and the Node Manager assigned to the machine.

13.7.1 Configuring the Node Manager for a Physical Machine

Every physical machine should have a single Node Manager instance running. Security is the most important aspect of the configuration. WebLogic tries hard to ensure that only authorized users can access the Node Manager; otherwise, it could be used to tamper with your servers. For this reason, WebLogic secures the Node Managers in two ways: through a list of trusted hosts and through SSL.

13.7.1.1 Trusted hosts

In order to configure a list of trusted hosts for a Node Manager, you must create a text file with the addresses of all Administration Servers that are allowed to contact the Node Manager. Each line specifies either the IP address or the DNS host name of an Administration Server. By default, a Node Manager uses the nodemanager.hosts file located under the WL_HOME\common\nodemanager folder.[5] For example, you could have the following entries in the file:

[5] In WebLogic 7.0, it's in the config subdirectory of this folder.

wladmin.oreilly.com
10.0.10.10

The default entries allow access from the local host only. You can create a different trusted hosts file, and modify the Node Manager's startup script so that it specifies the location of this file:

java -Dbea.home=%BEA_HOME% -Dweblogic.nodemanager.javaHome=%JAVA_HOME% -Dweblogic.nodemanager.trustedHosts=nodemanager.myhosts ... -Dweblogic.ListenAddress=10.0.10.10 weblogic.nodemanager.NodeManager

If you specify DNS names, you also must enable a reverse DNS lookup for the Node Manager (by default, it is not enabled). To do this, simply specify an additional system property in the startup script:

-Dweblogic.nodemanager.reverseDnsEnabled=true

The Node Manager then will accept connections only from an Administration Server running on one of the addresses specified in the trusted hosts file.
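The trusted hosts check can be pictured with a small model. The following is only a sketch of the rule described above, not the Node Manager's actual implementation: the function names and the `FAKE_DNS` table (which stands in for a real reverse DNS query) are hypothetical.

```python
def load_trusted_hosts(text):
    """Return the non-empty entries of a trusted hosts file, one per line."""
    entries = set()
    for line in text.splitlines():
        entry = line.strip()
        if entry:
            entries.add(entry.lower())
    return entries


def is_trusted(peer_ip, entries, reverse_dns_enabled=False, reverse_lookup=None):
    """Decide whether a connection from peer_ip should be accepted."""
    if peer_ip in entries:
        return True
    # DNS names in the file only match when reverse DNS lookup is enabled.
    if reverse_dns_enabled and reverse_lookup is not None:
        name = reverse_lookup(peer_ip)
        return name is not None and name.lower() in entries
    return False


HOSTS_FILE = """wladmin.oreilly.com
10.0.10.10
"""

# Hypothetical address-to-name table standing in for a reverse DNS query.
FAKE_DNS = {"10.0.10.20": "wladmin.oreilly.com"}

entries = load_trusted_hosts(HOSTS_FILE)
```

In this model a host listed only by name is rejected until reverse DNS is switched on, which mirrors why the reverseDnsEnabled property matters when the file contains DNS names.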

13.7.1.2 SSL configuration

Because all communication between the Administration Server and the Node Manager uses SSL, both the server and Node Manager must have SSL configured. Refer to Chapter 16 for the necessary SSL background. A Node Manager uses the same public key infrastructure as WebLogic Server itself, and the default installation uses the DemoIdentity.jks and DemoTrust.jks stores. So, if you just want to get everything going, you can use the default configuration and ignore the rest of this setup.

The best way to modify the default setup is to edit the nodemanager.properties file in the WL_HOME\common\nodemanager directory. Alternatively, you can specify any of the system properties from the command line when starting up the Node Manager. The default nodemanager.properties file also provides the syntax for most properties. For example, depending on which keystores you wish to use, the KeyStores property can take any of the following values:

#Possible values for the Keystores property
#KeyStores = [DemoIdentityAndDemoTrust|CustomIdentityAndJavaStandardTrust|CustomIdentityAndCustomTrust]

Here is an example property file:

KeyStores=CustomIdentityAndCustomTrust
CustomIdentityKeyStoreFileName=y:\mystores\myIdentityStore.jks
CustomIdentityKeyStorePassPhrase=mykeystorepass
CustomIdentityKeyStoreType=JKS
CustomIdentityAlias=myalias
CustomIdentityPrivateKeyPassPhrase=mypassword
CustomTrustKeyStoreFileName=y:\server\lib\DemoTrust.jks
#These are commented out as the default trust store doesn't need them
#CustomTrustKeyStorePassPhrase=mypassphrase
#CustomTrustKeyStoreType=JKS
#CustomTrustKeyPassPhrase=mykeypass

This file sets up a custom identity and trust store for the Node Manager, which is typical of most production deployments. It references the demonstration trust store and an example identity store that is described in Chapter 16. After restarting the Node Manager, all of the pass phrases will be encrypted.
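Before restarting the Node Manager, it can help to sanity-check a property file like this one. The sketch below parses key=value lines and reports obviously missing entries. Which properties count as required for each KeyStores mode is an assumption made for illustration, and the file names in `SAMPLE` are hypothetical.

```python
VALID_KEYSTORES = {
    "DemoIdentityAndDemoTrust",
    "CustomIdentityAndJavaStandardTrust",
    "CustomIdentityAndCustomTrust",
}


def parse_properties(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_keystore_settings(props):
    """Return a list of problems; an empty list means nothing obvious is missing."""
    mode = props.get("KeyStores", "DemoIdentityAndDemoTrust")
    if mode not in VALID_KEYSTORES:
        return [f"unknown KeyStores value: {mode}"]
    required = []  # assumed minimum set of keys for each mode
    if mode.startswith("CustomIdentity"):
        required += ["CustomIdentityKeyStoreFileName", "CustomIdentityAlias"]
    if mode.endswith("CustomTrust"):
        required.append("CustomTrustKeyStoreFileName")
    return [f"missing {key}" for key in required if not props.get(key)]


SAMPLE = """\
KeyStores=CustomIdentityAndCustomTrust
CustomIdentityKeyStoreFileName=myIdentityStore.jks
CustomIdentityAlias=myalias
CustomTrustKeyStoreFileName=DemoTrust.jks
#CustomTrustKeyStoreType=JKS
"""

props = parse_properties(SAMPLE)
```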

13.7.1.3 SSL for WebLogic 7.0

SSL configuration for a Node Manager in WebLogic 7.0 is slightly different. You can either use your own key and certificate files, or in a test setup, use the sample key and certificate files that are supplied with WebLogic's installation. The demonstration SSL certificate and key files are located in the WL_HOME\common\nodemanager\config directory, as well as in the root directory of any domain created using the Configuration Wizard.

Once you have the required SSL certificate and key files, you need only to specify additional system properties in the Node Manager's startup script:

weblogic.nodemanager.keyFile

This property identifies the path to the key file.

weblogic.nodemanager.keyPassword

This property specifies the password to use if the key file is encrypted.

weblogic.nodemanager.certificateFile

This property identifies the path to the certificate file.

weblogic.security.SSL.trustedCAKeyStore

This property identifies the path to the keystore that holds the trusted CA certificates.

weblogic.nodemanager.sslHostNameVerificationEnabled

This property causes the Node Manager to perform hostname verification of the Administration Server that is communicating with it.

Chapter 16 provides a more detailed explanation of SSL configuration for WebLogic.

13.7.1.4 Additional configuration properties

Table 13-2 provides a list of additional system properties that you may need to specify. For instance, you may wish to modify the listen address for the Node Manager. All of these properties can simply be placed in the nodemanager.properties file. In WebLogic 7.0, you must specify them from the command line.

Table 13-2. Node Manager properties

| Property name | Description | Default |
| --- | --- | --- |
| JavaHome | This property specifies the Java home that should be used to start the Managed Servers. Otherwise, it uses the Java home defined in the Remote Start tab for the server. If that is not defined, it uses the Java home used to start the Node Manager itself. | None |
| WeblogicHome | This property sets the WebLogic home directory. You also can specify it on a per-server basis on a server's Remote Start tab. | None |
| ListenAddress | This property sets the address on which the Node Manager should listen. | All IP addresses assigned to the machine |
| ListenPort | This property determines the port number on which the Node Manager should listen. | 5555 |
| NativeVersionEnabled | This property defines whether the Node Manager will run in a native mode. | true |
| ReverseDnsEnabled | This property defines whether reverse DNS may be used to resolve addresses in the trusted hosts file. | false |
| SavedLogsDirectory | This property determines where the log files will be written. | ./NodeManagerLogs |
| TrustedHosts | This property determines the file containing the list of all trusted hosts. | ./nodemanager.hosts |
| ScavengerDelaySeconds | When the Node Manager starts a server, it waits this number of seconds for a response from the server. If no response arrives in that time, it considers the task to have failed. | 60 |
| StartTemplate | This property is used by Unix systems to specify the path to a script file that will be used to start Managed Servers. | ./nodemanager.sh |

If you change any of these properties, you must stop and restart the Node Manager for the changes to take effect.
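One way to reason about these properties is to overlay your own values on the documented defaults. The sketch below does that for the defaults listed in Table 13-2. The helper names are hypothetical, and properties with no fixed default (JavaHome, WeblogicHome, ListenAddress) are deliberately left out.

```python
# Defaults taken from Table 13-2; values are kept as strings, as in a property file.
DEFAULTS = {
    "ListenPort": "5555",
    "NativeVersionEnabled": "true",
    "ReverseDnsEnabled": "false",
    "SavedLogsDirectory": "./NodeManagerLogs",
    "TrustedHosts": "./nodemanager.hosts",
    "ScavengerDelaySeconds": "60",
    "StartTemplate": "./nodemanager.sh",
}


def effective_settings(overrides):
    """Return the defaults with any explicitly configured values applied on top."""
    settings = dict(DEFAULTS)
    settings.update(overrides)
    return settings


def changed_from_default(overrides):
    """Return only the overrides that differ from the documented default."""
    return {k: v for k, v in overrides.items() if DEFAULTS.get(k) != v}


settings = effective_settings({"ListenPort": "5556"})
```

A non-empty result from `changed_from_default` is a reminder that the Node Manager must be stopped and restarted before the new values apply.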

13.7.1.5 Starting a Node Manager

In a production environment, it is very important that the Node Manager is running at all times. Without the Node Manager, there is no way to automatically start, restart, or kill Managed Servers. The simplest way to accomplish this is to ensure that it runs as a Unix daemon or Windows Service. The default installation process provides you with an option to install the Node Manager in this way. For the Windows platform, you can use two scripts located in the WL_HOME\server\bin directory to install and uninstall the service:

installNodeMgrSvc.cmd

This script installs the Node Manager as a Windows Service.

uninstallNodeMgrSvc.cmd

This script stops and uninstalls the Node Manager service.

Make sure that you first modify these scripts to include the system properties we described earlier. The WebLogic documentation provides additional information on more advanced configurations of the Node Manager and Windows Services.

In addition, you can start the Node Manager using the startNodeManager script, which is also located in the WL_HOME\server\bin directory. To check on the status of a Node Manager, select a machine node from the left pane of the Administration Console and then choose the Monitoring/Node Manager Status tab.

13.7.2 Configuring a Machine to Use a Node Manager

After installing, configuring, and running a Node Manager on each physical machine, you must configure the machines for the domain and assign server instances to these machines. This information tells WebLogic which Managed Servers run on which physical machines, and hence which servers are under the control of the Node Manager on that machine. This is a two-part process. First you have to define the machines and configure them to use the Node Manager, and then you have to assign Managed Servers to the machines.

Using the Administration Console, select the Machines node in the left pane to view all of the machines in the domain. Each machine entry should encapsulate the settings for a physical machine. Use the righthand pane to create a new machine or modify an existing machine entry. For each machine, select the Node Manager tab and enter the listen address and port used by the Node Manager on that machine.

Finally, you need to assign the machine to the Managed Servers. Use the Servers tab to select those servers that run on the chosen machine. You also can assign a machine to a server from the Configuration/General tab of that server. This assignment is used in other situations as well. For instance, in a clustered environment WebLogic will try to replicate session data onto a server that runs on separate hardware. It does this by treating the different machines in the domain as physically different pieces of hardware. The servers assigned to a machine then determine which servers in the cluster are collocated (and which aren't).
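The effect of machine assignments on replication can be illustrated with a toy selection rule. This is a deliberate simplification and not WebLogic's actual algorithm: it prefers a server on a different machine and falls back to a collocated one only when there is no alternative. ServerB, ServerC, and MachineC are hypothetical names.

```python
def pick_secondary(primary, server_to_machine):
    """Prefer a server on a different machine; fall back to any other server."""
    primary_machine = server_to_machine[primary]
    others = sorted(s for s in server_to_machine if s != primary)
    for server in others:
        if server_to_machine[server] != primary_machine:
            return server
    return others[0] if others else None


assignments = {"ServerA": "MachineB", "ServerB": "MachineB", "ServerC": "MachineC"}
secondary = pick_secondary("ServerA", assignments)
```

Without machine assignments, every server looks equally "remote", so a rule like this has no way to avoid placing both copies of the session data on the same hardware.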

13.7.3 Configuring the Node Manager for a Managed Server

The final task is to configure each Managed Server so that the Node Manager can control it. Because the Node Manager does not rely on external scripts to remotely start and kill a Managed Server, the information found in the startup scripts needs to be configured for each server using the Administration Console. The information is then saved as part of the domain configuration. Select a Managed Server from the left pane, and then choose the Configuration/Remote Start tab to specify the server's startup parameters.

All of these settings mirror the environment variables used in the startWebLogic scripts; we already saw that some of them can take on default values assigned to the Node Manager. Note that the directory paths used in these settings must be valid on the machine that hosts the Managed Server, and not the Administration Server. This data is sent to the Node Manager on that machine, which then starts up the Managed Server in a separate process.

13.7.4 Configuring Node Manager Behavior

By default, the Node Manager automatically restarts a Managed Server that fails, or whose state it cannot determine. Once a Managed Server has failed, the Node Manager will try to restart it no more than twice within the next hour.

Table 13-3 lists the configuration settings available for monitoring the health of a Managed Server. You can modify these settings from the Administration Console. Select a Managed Server from the left pane, then select the Configuration/Health Monitoring tab.

Table 13-3. Configuring server health monitoring

| Setting | Description | Default |
| --- | --- | --- |
| Auto Restart | If you disable this option, the Node Manager will not attempt to restart a failed server. | true |
| Auto Kill if Failed | If this is set to true, the Node Manager may kill the server process if the server's health is in the failed state, or when it cannot query the server for its health state. | false |
| Restart Interval; Max Restarts within Interval | The Node Manager will try to restart the server only within the specified restart interval period. If this time period is exceeded, no further attempts will be made. During the time period, the Node Manager will try no more than Max Restarts to restart the server. By default, the Node Manager makes no more than two attempts within an hour to restart a failed server. | 3600; 2 |
| Health Check Interval | This setting determines the interval (in seconds) at which the Node Manager polls the server for its health state. | 180 |
| Health Check Timeout | This setting determines the number of seconds to wait for a response from a health check. By default, if the timeout is reached, the Node Manager will kill the server process and attempt to restart the server. | 60 |
| Restart Delay Seconds | This setting determines the number of seconds that the Node Manager will wait before trying to restart the server after killing it. This may be needed on some systems where killing the process does not immediately release all resources before the restart. | 0 |

13.7.5 Default Operation of the Node Manager

Once a Node Manager has been installed and configured on a machine and the Managed Servers have been configured, the Node Manager is finally ready for use. You interact with the Node Manager indirectly using the Administration Console or the weblogic.Admin tool. To use the Administration Console, select a Managed Server from the left pane and then choose the Control tab. You then will be able to start, suspend, resume, and shut down a server. We discuss the various shutdown options and the use of the weblogic.Admin tool in a later section.

13.7.5.1 Starting managed servers

Imagine that you try to start a Managed Server remotely. Let's say that you want to start ServerA in Figure 13-5. The Administration Server will receive the instruction and forward it to the Node Manager on the machine that is configured to host ServerA (i.e., MachineB). The Node Manager running on MachineB then will start the server. By default, if the Managed Server doesn't respond within 60 seconds (the Scavenger Delay), the Node Manager will set the server's state to UNKNOWN. If the server does start after this delay, the Node Manager will change this state to RUNNING.
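The startup timeline can be expressed as a small function of elapsed time. This is a sketch of the behavior described above rather than the Node Manager's code. The STARTING label for the period before the Scavenger Delay expires is an assumption, and times are measured in seconds from the start request.

```python
def observed_state(now, responded_at=None, scavenger_delay=60):
    """Return the state recorded for a server `now` seconds after a start request.

    responded_at is the time at which the server first responded, or None
    if it has not responded at all.
    """
    if responded_at is not None and now >= responded_at:
        return "RUNNING"
    if now >= scavenger_delay:
        return "UNKNOWN"
    return "STARTING"  # assumed label for "still within the Scavenger Delay"
```

For a server that responds 75 seconds after the request, this model reports UNKNOWN between 60 and 75 seconds and RUNNING from then on.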

13.7.5.2 Suspending and stopping managed servers

Requests to suspend or stop Managed Servers don't proceed in quite the same fashion. The commands are issued directly to the Managed Servers from the Administration Server. Only if the Administration Server cannot reach a Managed Server does it dispatch the command to the appropriate Node Manager, which then forwards it to the Managed Server. Likewise, if a Managed Server does not respond to a shutdown request, the Node Manager can shut down the process forcibly (it records the process ID for this purpose).
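The fallback order for a stop request can be sketched as a chain of attempts. The three callables below are hypothetical stand-ins for "send the command directly", "send it through the Node Manager", and "kill the recorded process ID". The first two return True when the server acknowledges the request.

```python
def stop_server(send_direct, send_via_node_manager, kill_process):
    """Try each route in turn and return the name of the one that was used."""
    if send_direct():
        return "direct"
    if send_via_node_manager():
        return "node-manager"
    # Neither route produced a response, so fall back to a forced kill.
    kill_process()
    return "killed"


killed = []
route = stop_server(lambda: False, lambda: False, lambda: killed.append("ServerA"))
```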

13.7.5.3 Health monitoring

By default, the Node Manager checks the health status of each Managed Server every 180 seconds. If a Managed Server is in the failed state and its Auto Kill If Failed attribute is set to true (it is false by default), the Node Manager will kill and restart the process. The Node Manager also kills and restarts a server that fails to respond to three consecutive health queries.
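The decision made after each health check can be summarized in a few lines. The sketch below follows the two rules in the preceding paragraph. The function name, the FAILED state label, and the action strings are illustrative assumptions, not Node Manager identifiers.

```python
def health_action(state, missed_checks, auto_kill_if_failed=False, missed_limit=3):
    """Return the action to take after a health check."""
    # A server that has not answered several checks in a row is treated as hung.
    if missed_checks >= missed_limit:
        return "kill-and-restart"
    # A server reporting a failed state is only killed if Auto Kill is enabled.
    if state == "FAILED" and auto_kill_if_failed:
        return "kill-and-restart"
    return "none"
```

With the default settings, a server that reports a failed state is left alone; only an unresponsive server is killed and restarted.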

By default, the Node Manager will not restart a Managed Server more than twice within an hour. The frequency of restarts is governed by the Restart Interval and Max Restarts within Interval attributes.
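The restart limit can be modeled as a counter tied to a time window. The sketch below takes the description literally: the window opens at the first restart attempt, at most Max Restarts attempts are allowed inside it, and no attempts are made once it has passed. The class and method names are hypothetical, and the real Node Manager may account for the interval differently.

```python
class RestartLimiter:
    """Track restart attempts against Restart Interval and Max Restarts."""

    def __init__(self, restart_interval=3600, max_restarts=2):
        self.restart_interval = restart_interval
        self.max_restarts = max_restarts
        self.window_start = None
        self.attempts = 0

    def try_restart(self, now):
        """Record and allow a restart at time `now` (seconds) if the policy permits."""
        if self.window_start is None:
            self.window_start = now  # the first attempt opens the window
        if now - self.window_start >= self.restart_interval:
            return False  # interval exceeded: no further attempts
        if self.attempts >= self.max_restarts:
            return False  # limit reached within the interval
        self.attempts += 1
        return True


limiter = RestartLimiter()
results = [limiter.try_restart(t) for t in (0, 600, 1200, 4000)]
```

With the defaults, the first two failures within the hour lead to restarts and later ones do not.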

It is worth stressing the following points on the use of a Node Manager:

13.7.6 Node Manager Logs

Several sets of logs are associated with the Node Manager. They are useful when you need to debug any problems with the Node Manager or when you need to set up a more comprehensive monitoring environment. A subset of the logs is available from the Administration Console. Choose a Managed Server from the left pane, and then select the Control/Remote Start Output tab.[6]

[6] In WebLogic 7.0, it's the Monitoring/Process Output tab.

Three sets of logs are maintained for each Node Manager:

Node Manager client logs

The Administration Server maintains Node Manager log files in the NodeManagerClientLogs directory of the domain. These logs hold information about the commands directed to the Node Manager via the Administration Console (or the weblogic.Admin tool).

Node Manager logs

The Node Manager itself generates log messages when it starts up or shuts down. These logs are located in the WL_HOME/common/nodemanager/NodeManagerLogs/NodeManagerInternal directory on the particular machine. Use these log files to diagnose why a Node Manager is not starting up properly. These logs essentially correspond to the View Node Manager Output option in the Administration Console.

Managed Server logs

The Node Manager maintains a subdirectory under the NodeManagerLogs directory for each Managed Server that it controls. These log files hold the full output of the server that was started. These logs correspond to the View Server Output option in the Administration Console.

You may need to clean these directories periodically as the number and size of log files continue to grow.

13.7.6.1 Node Manager client logs

The client logs record all actions executed by a Node Manager on behalf of a JMX-based client, such as the Administration Console or the weblogic.Admin tool. A separate directory is created within the domain log directory for each server within the domain. All of the recorded actions are timestamped and usually include a notification of the success or failure of the action. Here's a typical example of the client logs:

<05-Jul-2003 14:07:05 BST> <10-Jul-2003 13:12:38 BST> <10-Jul-2003 13:14:50 BST> <_ _COMMAND_DONE_ _>

These logs contain only actions that were submitted through the Administration Console or any JMX-based client. For instance, if the Node Manager automatically restarts a failed server, this action is not recorded in the logs. Instead, it will be recorded in the machine logs for the Node Manager in charge of that server.

13.7.6.2 Managed Server logs

The server logs also are organized into subfolders, one for each server running on the machine. Each directory contains the following files:

servername_pid

This file contains, in text, the process ID of the Managed Server. If a Managed Server on a machine is using all of the CPU for some reason, you can trace the error to the actual server by grepping through these files. The Node Manager in turn uses this data to kill the process.

servername_output.log

This file records startup messages saved by the Node Manager when it starts a server.

servername_error.log

This file records any error messages that are generated when the Node Manager starts a server.

config.xml

This file contains any configuration information passed to the Node Manager by the Administration Server and can be safely ignored.

Except for the configuration file, all of the Managed Server log files are renamed by appending _prev to the filename whenever a server is restarted.
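The "grepping" through the process ID files mentioned above can be automated. The sketch below assumes the layout described in this section: one subdirectory per server under the logs directory, each holding a servername_pid file that contains the process ID as text. The function name and the sample process IDs are hypothetical, and the demonstration builds a throwaway directory rather than touching a real installation.

```python
import tempfile
from pathlib import Path


def server_for_pid(logs_dir, pid):
    """Return the name of the server whose pid file contains `pid`, or None."""
    for pid_file in sorted(Path(logs_dir).glob("*/*_pid")):
        if pid_file.read_text().strip() == str(pid):
            return pid_file.name[: -len("_pid")]
    return None


# Build a temporary directory that mimics the NodeManagerLogs layout.
with tempfile.TemporaryDirectory() as tmp:
    for name, pid in (("ServerA", 4711), ("ServerB", 4712)):
        server_dir = Path(tmp) / name
        server_dir.mkdir()
        (server_dir / f"{name}_pid").write_text(f"{pid}\n")
    owner = server_for_pid(tmp, 4712)
    nobody = server_for_pid(tmp, 9999)
```

Pointing `server_for_pid` at the real NodeManagerLogs directory with a process ID taken from the operating system's process list would identify which Managed Server is consuming the CPU.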
