Network Programming with Perl


 
Network Programming with Perl

By Lincoln  D.  Stein

Slots : 1

Table of Contents
Chapter  14.   Bulletproofing Servers

    Content

Because network daemons are detached from standard error, warnings and diagnostics have nowhere to go unless the daemon explicitly writes the messages to a log file. However, there are issues involved with this, such as where to write the log file and how to synchronize log messages from the several children of a forked server. Fortunately, the UNIX operating system provides a robust and flexible logging system known as syslog that solves these problems.

Syslog is run by a daemon known as syslogd and configured via a system file called /etc/syslog.conf. Some systems have two logging daemons, one for kernel messages and one for all others.

The syslog system receives messages from two main sources: the operating system kernel itself and from user -space programs such as daemons. Incoming messages are distributed according to rules defined in /etc/syslog.conf to a set of files and/or devices. Messages are typically written to a growing set of files in the directories /var/log or /var/adm or echoed to the system text console. The syslog daemon is also able to send messages remotely via the network to another host for logging and can receive remote messages for logging locally.

A short excerpt from a log file on my laptop machine looks like this:

Aug 18 08:46:51 pesto dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67 Aug 18 08:46:51 pesto dhclient: DHCPACK from 132.239.12.9 Aug 18 08:46:51 pesto dhclient: bound to 132.239.12.42 - renewal in 129600 seconds. Aug 18 11:46:51 pesto cardmgr[32]: executing: './serial start ttyS2' Aug 18 08:51:25 pesto sendmail[11142]: gethostbyaddr() failed for 132.239.12.42 Aug 18 08:51:27 pesto sendmail[11142]: IAA11142: from=lstein, size=667904, class=0 Aug 18 11:51:36 pesto xntpd[207]: synchronized to 64.7.3.44, stratum=4 Aug 18 11:51:30 pesto xntpd[207]: time reset (step) -6.315089 s

Here we see messages from four daemons. Each message consists of a time stamp, the name of the host the daemon is running on ( pesto ), the name and optional PID of the daemon, and a one-line log status message.

The syslog system is a standard part of all common UNIX and Linux distributions. Windows NT/2000 has a similar facility known as the Event Log, but it is less straightforward to use because its log files use a binary format. However, the Win32::EventLog module, available from CPAN, makes it possible for Perl scripts to read and write NT event logs. Alternatively, the free NTsyslog package is a Windows interface to the UNIX syslog service. It is available at http://www.sabernet.net/software/ntsyslog.html

About UNIX Syslog

To send an entry to the syslog system, a daemon must provide syslogd with a message containing three pieces of information: a facility, a priority, and a message text.

A Facility

The "facility" describes the type of program that is sending the message. The facility is used to sort the message into one or more log files or other destinations. Syslog defines these facilities:

  • auth ”user authorization messages

  • authpriv ”privileged user authorization messages

  • cron ”messages from the cron daemon

  • daemon ”messages from miscellaneous system daemons

  • ftp ”messages from the FTP daemon

  • kern ”kernel messages

  • local0-local7 ”facilities reserved for local use

  • lpr ”messages from the printer system

  • mail ”messages from the mail system

  • news ”messages from the news system

  • syslog ”internal syslog messages

  • uucp ”messages from the uucp system

  • user ”messages from miscellaneous user programs

Network daemons generally use the daemon facility or one of the local0 through local7 facilities.

A Priority

Each syslog message is associated with a priority, which indicates its urgency. The syslog daemon can sort messages by priority as well as by facility, with the intent that urgent messages get flagged for immediate attention. The following priorities exist:

  • emerg > ”an emergency; system is unusable

  • alert ”action must be taken immediately

  • crit ”a critical condition

  • err ”an error

  • warning ”a warning

  • notice ”a normal but significant condition

  • info ”normal informational message

  • debug ”debugging message

The interpretation of the various priorities is subjective . The main dividing line is between warning , which indicates something amiss, and notice , which is issued during normal operations.

Message Text

Each message to syslog also carries a human-readable text message describing the problem. For best readability in the log file, messages should not contain embedded newlines, tabs, or control characters (a single newline at the end of the message is OK).

The syslog daemon can accept messages from either of two sources: local programs via a UNIX domain socket (Chapter 22) and remote programs via an Internet domain socket using UDP (Chapter 18). The former strategy is more efficient, but the latter is more flexible, because it allows several hosts on the local area network to log to the same logging host. The syslog daemon may need to be configured explicitly to accept remote connections; remote logging has been a source of security breaches in the past.

Sys::Syslog

You can send messages to the syslog daemon from within Perl using the Sys::Syslog module, a standard part of the Perl distribution. [1] When you use Sys::Syslog, it imports the following four functions:

[1] Prior to Perl 5.6, this module required the syslog.ph header file in order to run, but this file did not come with the distribution and had be generated manually using the h2ph tool (described in Chapter 17 under Implementing sockatmark () ). You should upgrade to 5.6 or higher before trying to use Sys::Syslog.

openlog ($identity, $options, $facility)

openlog() initializes Sys::Syslog and sets options for subsequent messages. You will generally call it near the beginning of the program. The three arguments are the server identity, which will be prepended to each log entry, a set of options, which controls the way the log entry is formatted, and a facility, which is selected from one of the facilities just listed.

The openlog() options consist of a space- or comma-separated list of the following key words:

  • cons ”write directly to the system console if the message can't be sent to the syslogd

  • ndelay ”open connection to syslogd immediately, rather than waiting for the first message to be logged

  • pid ”include the process ID of the program in the log entry

  • nowait ”do not wait for log message to be delivered; return immediately

For example, to log entries under the name of " eliza ," with PIDs printed and a facility of local0 , we would call openlog() this way:

openlog('eliza','pid','local0');

The return value will be true if Sys::Syslog was successfully initialized .

$bytes = syslog ($priority, $format, @args)

After calling openlog() , you will call syslog() to send log messages to the daemon. $priority is one of the log priorities just listed. $format is a sprintf() -style format string, and the remaining arguments, if any, are passed to sprintf() for interpolation into the format.

The syntax of the format string is identical to the format strings used by printf() and sprintf() with the exception that the %m format sequence will be automatically replaced with the value of $! and does not need an argument. The POD documentation for sprintf() explains the syntax of the format string.

For example, this sends a message to the system log using the err priority:

syslog('err',"Couldn't open %s for writing: %m",$file);

This results in a log entry like the following:

Jun 2 17:10:49 pesto eliza[14555]: Couldn't open /var/run/eliza.pid for writing: Permission denied

If successful, syslog() returns the number of bytes written. Otherwise , it returns undef .

closelog()

This function severs the connection to syslogd and tidies up. Call it when you are through sending log messages. It is not strictly necessary to call closelog() before exiting.

setlogsock($socktype)

The setlogsock() function controls whether Sys::Syslog will connect to the syslog daemon via an Internet domain socket or via a local UNIX domain socket. The $socktype argument may be either "inet," the default, or "unix." You may need to call this function with the "unix" argument if your version of syslogd is not configured to allow network messages.

setlogsock() is not imported by default. You must import it along with the default Sys::Syslog functions in this manner:

use Sys::Syslog qw(:DEFAULT setlogsock);

For best results, call setlogsock() before the first call to openlog() or syslog() .

In addition to these four subroutines, there is a fifth one called setlogmask() , which allows you to set a mask on outgoing messages so that only those of a certain priority will be sent to syslogd . Unfortunately, this function requires you to translate priority names into numeric bitmasks , which makes it difficult to use.

There is also an internal variable named $Sys::Syslog::host , which controls the name of the host that the module will log to in "inet" mode. By default, this is set to the name of the local host. If you wish to log to a remote host, you may set this variable manually before calling openlog() . However, because this variable is undocumented, use it at your own risk.

Adding Logging to the Psychotherapist Server

We can now add logging to the psychotherapist server from Chapter 10, Figure 10.6. The various functions for autobackgrounding the server and managing the PID file are beginning to get a little unwieldy in the main script file, so let's put them in a separate module called Daemon, along with some new helper functions for writing to the syslog. As long as we're at it, we might as well create a new init_daemon() function that rolls the autobackgrounding, PID file management, and syslog opening into one convenient package. The code for Daemon is shown in Figure 14.1.

Figure 14.1. The Daemon module

Lines 1 “12: Module initialization We load the Sys::Syslog module, along with POSIX, Carp, IO::File, and File::Basename. The latter will be used to generate the program name used for logging error messages. The rest of this section exports five functions: init_server(), log_debug(), log_notice(), log_warn() , and log_die(). init_server() autobackgrounds the daemon, opens syslog, and does other run-time initialization. log_debug() and its brethren will write log messages to the syslog at the indicated priority.

Lines 13 “15: Define constants We choose a default path for the PID file and a log facility of local0 .

Lines 16 “24: init_server() subroutine The init_server() subroutine performs server initialization. We get a path for the PID file from the subroutine argument list, or if no path is provided, we generate one internally. We then call open_pid_file() to open a new PID file for writing, or abort if the server is already running.

Provided everything is successful so far, we autobackground by calling become_daemon() and write the current PID to the PID file. At this point, we callinit_log() to initialize the syslog system. We then return the current PID to the main program.

Lines 25 “37: become_daemon() subroutine This is almost the same subroutine we looked at in Chapter 10, Figure 10.4. It autobackgrounds the server, closes the three standard filehandles, and detaches from the controlling TTY. The only new feature is that the subroutine now installs the CHLD signal handler for us, rather than relying on the main program to do so. The CHLD handler is a subroutine named reap_child() .

Lines 38 “42 init_log() subroutine This subroutine is responsible for initializing the syslog connection. We begin by setting the connection type to a local UNIX-domain socket; this may be more portable than the default "inet" type of connection. We recover the program's base filename and use it in a call to openlog() .

Lines 43 “55: log_* subroutines Rather than use the syslog() call directly, we define four shortcut functions called log_debug(), log_notice(), log_warn() , and log_die() . Each function takes one or more string arguments in the manner of warn() , reformats them, and calls syslog() to log the message at the appropriate priority. log_die() is slightly different. It logs the message at the crit level and then calls die() to exit the program.

The _msg() subroutine is used internally to format the log messages. It follows the conventions of warn() and die() . The arguments are first concatenated using the current value of the output record separator variable, $ \ , to create the error message. If the message does not end in a newline, we append the phrase " at $filename line $line " to it, where the two variables are the filename and line number of the line of the calling code derived from the built-in caller() function.

Lines 56 “59: getpidfilename() subroutine This subroutine returns a default name for the PID file, where we store the PID of the server while it is running. We invoke basename to remove the directory and " .pl " extension from the script, and concatenate it with the PIDPATH directory.

Lines 60 “71: open_pid_file() subroutine This subroutine is identical to the original version that we developed in Chapter 10, Figure 10.5.

Lines 72 “74: reap_child() subroutine This is the now-familiar CHLD handler that calls waitpid() until all children have been reaped.

Line 75: END{} block The package's END{} block unlinks the PID file automatically when the server exits. Since the server forks, we have to be careful to remove the file only if its current PID matches the PID saved during server initialization.

With the Daemon module done, we can simplify the psychotherapist daemon code and add event logging at the same time Figure 14.2).

Figure 14.2. Psychotherapist daemon with logging

Lines 1 “6: Load modules We load the Chatbot::Eliza and IO::Socket modules, as well as the new Daemon module. We also define the default port to listen on.

Lines 7 “8: Install signal handlers We install a signal handler for the TERM and INT signals, which causes the server to shut down normally. This gives the Daemon module time to unlink the PID file in its END{} block.

Note that we no longer install a CHLD handler, because this is now done in the init_server() subroutine.

Lines 9 “15: Open listening socket and initialize server We open a listening TCP socket on the port indicated on the command line and die on failure. We then call init_server() to initialize logging and autobackground, and store the returned PID into a global variable. Once this subroutine returns, we are in the background and can no longer write to standard error.

Line 16: Log startup message We call log_notice() to write an informational message to the system log.

Lines 17 “28: Accept loop We now enter the server's accept loop. As in previous iterations of this server, we accept an incoming connection and fork a new process to handle it. A new feature, however, is that we log each new incoming connection using this fragment of code:

my $host = $connection->peerhost; log_notice("Accepting a connection from %s\n",$host);

We call the connected IO::Socket object's peerhost() method to return the dotted -quad form of the remote host's IP address and send syslog a message indicating that we've accepted a connection from that host. Later, after the child process finishes processing the connection with interact() , using a similar idiom we log a message indicating that the connection is complete.

The other change from the original version of the server is that we indicate a failure of the fork() call by invoking log_die() to log a critical message and terminate the process.

Lines 29 “42: The interact() and _testquit() subroutines These are identical to the subroutines introduced in Chapter 10.

Lines 43 “45: END{} block At shutdown time, we log an informational message indicating that the server is exiting. As in the earlier versions, we must be careful to check that the process ID matches the parent's. Otherwise, each child process will invoke this code as well and generate confusing log messages. The Daemon module's END{} block takes care of unlinking the PID file.

When we run this program, we see log entries just like the following:

Jun 2 23:12:36 pesto eliza_log.pl [14893]: Server accepting connections on port 12005 Jun 2 23:12:42 pesto eliza_log.pl [14897]: Accepting a connection from 127.0.0.1 Jun 2 23:12:48 pesto eliza_log.pl[14897]: Connection from 127.0.0.1 finished Jun 2 23:12:49 pesto eliza_log.pl[14899]: Accepting a connection from 192.168.3.5 Jun 2 23:13:02 pesto eliza_log.pl[14901]: Accepting a connection from 127.0.0.1 Jun 2 23:13:19 pesto eliza_log.pl[14899]: Connection from 192.168.3.5 finished Jun 2 23:13:26 pesto eliza_log.pl[14801]: Connection from 127.0.0.1 finished Jun 2 23:13:39 pesto eliza_log.pl[14893]: Server exiting normally

Notice that the log messages indicating that the server is starting and stopping are logged with the parent's PID, while the messages about individual connections are logged with various child PIDs.

Logging with warn() and die()

Although we now have a way to log error messages explicitly to the system log, we're still stuck with the fact that error messages issued with warn() and die() vanish without a trace. Fortunately, Perl provides a mechanism to overload warn() and die() with custom functions. This allows us to arrange for warn() to call log_warn() and die() to call log_die() .

Two special keys in the %SIG array give us access to the warn() and die() handlers. If $SIG{__WARN__} and/or $SIG{__DIE__} are set to code references, then that code will be invoked whenever warn() or die() is called instead of the default routines. The change requires just a small addition to Daemon's init_log() subroutine, as follows:

$SIG{__WARN__} = \&log_warn; $SIG{__DIE__} = \&log_die;

With this change in place, we no longer have to remember to invoke log_notice() or log_die() to write messages to the log. Instead, we can use the familiar warn() function to send nonfatal warnings to the system log and die() to log a fatal message and terminate the program.

The log_warn() and log_die() routines do not change. The fact that log_die() itself calls die() does not cause infinite recursion. Perl is smart enough to detect that log_die() is called within a $SIG{__DIE__} handler and to use the built-in version of die() in this context.

An interesting thing happens when I install the $SIG{__WARN__} and $SIG{__DIE__} handlers on the psychotherapist example (see Figure 15.2). Messages like this began to appear in the system log whenever a client application exits:

Jun 13 06:22:11 pesto eliza_hup.pl[8933]: Can't access 'DESTROY' field in object of class Chatbot::Eliza

This represents a warning from Perl regarding a missing DESTROY subroutine in the Chatbot::Eliza object. The fix is trivial; I just add a dummy DESTROY definition to the bottom of the server script file:

sub Chatbot::Eliza::DESTROY { }

I hadn't been aware of this warning in the earlier incarnations of the server because the standard error was closed and the diagnostic was lost. This illustrates the perils of not logging everything!

Using the Event Log on Win32 Platforms

The Win32::EventLog module for Windows NT/2000 provides similar functionality to Sys::Syslog. Just three method calls provide the core logging API that network daemons need.

$log = Win32::EventLog->new($sourcename [,$ servername ])

The new() class method opens an event log on a local or remote machine. The $sourcename argument indicates the name of the log which must be one of the standard names "Application," "System," or "Security." The optional $servername argument specifies the name of a remote server in the standard Windows network format (e.g.,\\ SERVER9 ). If $servername is provided, new() attempts to open the specified log file on the remote machine. Otherwise, it opens the local log file. If successful, new() returns a Win32::EventLog object to use for logging messages.

$result = $log->Report(\%data);

report() writes an entry to the selected log file. Its argument is a hash reference containing the keys EventType, Category, EventID, Data , and Strings .

EventType indicates the type and severity of the error. It should be one of the following constants:

  • EVENTLOG_INFORMATION_TYPE ”an informational message

  • EVENTLOG_WARNING_TYPE ”a nonfatal error

  • EVENTLOG_ERROR_TYPE ”a fatal error

  • EVENTLOG_AUDIT_SUCCESS ”a "success" audit event, usually written to the Security log

  • EVENTLOG_AUDIT_FAILURE ”a "failure" audit event, usually written to the Security log

Category is application-specific. It can be any numeric value you choose. The Event Viewer application can sort and filter log entries on the basis of thecategory field.

EventID is an application-specific ID for the event. It can be used to identify aparticular error message numerically .

Data contains raw data associated with the log entry. It is generally used by compiled programs to store exception data. You can safely leave it blank.

Strings contains one or more human-readable strings to be associated with the log entry. The error message goes to this field. You can separate the message into multiple smaller strings by separating each string by a NULL character ( \0 ).

$log->Close()

The Close() method closes and cleans up the EventLog object.

This example writes an informational message to the Application log on the local machine.

use Win32::EventLog; my $log = Win32::EventLog->new('Application') or die "Can't log: $!"; $log->Report({ EventType => EVENTLOG_INFORMATION_TYPE, Category => 1, EventID => 1, Data => undef, Strings => "Server listening on port 12345" }); $log->Close;

Direct Logging to a File

A simple alternative to logging with syslog or the Windows EventLog is to do the logging yourself by writing directly to a file. This is the preferred solution for applications that do heavy logging, such as Web servers, which would otherwise overload the logging system.

Logging directly to a file is simply a matter of opening the file for appending and turning on autoflush mode for the filehandle. The last step is important because otherwise the output from spawned children may intermingle in the log file. The other issue is handling multiple servers that might want to log to the same file. If they all try to write at once, the messages might become intermingled. We will avoid this by using advisory file locks created with the built-in flock () function.

The syntax of flock() is simple:

$boolean = flock(FILEHANDLE,$how);

The first argument is a filehandle open on the file you wish to lock, and the second is a numeric constant indicating the locking operation you wish to perform (Table 14.1).

Table 14.1. Arguments to flock()

Operation Description
LOCK_EX An exclusive lock
LOCK_SH A shared lock
LOCK_UN Unlock the file

Shared locks created with LOCK_SH can be held by several processes simultaneously and are used when there are multiple readers on a file. LOCK_EX is the type of lock we will use, because it can only be held by a single process at a time and is suitable for locking a file that you wish to write to. These three constants can be imported from the Fcntl module using the :flock tag.

We rewrite the log_debug(), log_notice(), and log_warn() functions to write suitably formatted messages to the filehandle. As an added frill, we'll make these functions respect an internally defined $PRIORITY package variable so that only those messages that equal or exceed the priority are written to the log. This allows you to log verbosely during development and debugging but restricts logging to error messages after deployment.

An example of this scheme is shown in Figure 14.3, which defines a small module called LogFile. Here is a synopsis of its use:

Figure 14.3. Logging to a file

#!/usr/bin/perl use LogFile; init_log('/usr/local/logs/mylog.log') or die "Can't log!"; log_priority(NOTICE); log_debug("This low-priority debugging statement will not be seen.\n"); log_notice("This will appear in the log file.\n"); log_warn("This will appear in the log file as well.\n"); die "This is an overridden function.\n";

After loading LogFile, we call init_log() and pass it the pathname of the log file to use. We then call log_priority with an argument of NOTICE , to suppress all messages of a lower priority. We then log some messages at different priorities, and finally die() to demonstrate that warn() and die() have been overridden. After running this test program, the log file shows the following entries:

Wed Jun 7 09:09:52 2000 [notice] This will appear in the log file. Wed Jun 7 09:09:52 2000 [warning] This will appear in the log file as well. Wed Jun 7 09:09:52 2000 [critical] This is an overridden function.

Let's walk through the module.

Lines 1 “10: Module initialization We load IO::File and other utility packages and define the current module's exported functions.

Lines 11 “14: Constant definitions We define numeric constants for priorities named DEBUG, NOTICE, WARNING , and CRITICAL .

Line 15: Globals The package maintains two global variables. $PRIORITY is the current priority threshold. Only messages with priorities greater than or equal to $PRIORITY will be logged. $fh contains the filehandle opened on the log file. It will also be used for locking. A consequence of this design decision is that a process can open only one log file at a time.

Lines 16 “24: init_log() init_log() is called with the pathname of the desired log file. We attempt to open the file for appending. If successful, we turn on autoflush mode, set the priority threshold to DEBUG , and replace warn() and die() with our own log_warn() and log_die() routines.

Lines 25 “28: log_priority() This function gets and sets the $PRIORITY global, controlling the threshold for logging messages of a given priority.

Lines 29 “36: _msg() This is similar to the _msg() function defined earlier, except that it now has the responsibility for adding a time stamp and a priority label to each log entry.

Lines 37 “42: _log() This is an internal subroutine for writing messages to the log file. We are called with a single argument containing the message to write. After locking the log file by calling flock() with an argument of LOCK_EX , we print() the message to the filehandle and then release our lock by calling flock() again with an argument of LOCK_UN .

Lines 43 “59: log_debug(), log_notice(), log_warn(), log_die() Each of these functions accepts an error message in the manner of warn() or die(). If $PRIORITY is higher than the priority of the message, we do nothing. Otherwise, we call _msg() to format the message and pass the result to _log() for writing to the log file. log_die() does the additional step of calling the real die() in order to terminate the program abnormally.


   
Top

Категории