Network Programming with Perl


 

By Lincoln D. Stein

Chapter 5. The IO::Socket API


IO::Socket Methods

We'll now look at IO::Socket in greater depth.

The IO::Handle Class Hierarchy

Figure 5.3 diagrams the IO::Handle class hierarchy. The patriarch of the family tree, IO::Handle, provides object-oriented syntax for all of Perl's various input/output methods. Its immediate descendant, IO::Socket, defines additional methods that are suitable for Berkeley sockets. IO::Socket has two descendants. IO::Socket::INET defines behaviors that are specific to Internet domain sockets; IO::Socket::UNIX defines appropriate behaviors for AF_UNIX (a.k.a. AF_LOCAL) sockets.

Figure 5.3. The IO::Handle class hierarchy

One never creates an IO::Socket object directly, but creates either an IO::Socket::INET or an IO::Socket::UNIX object. We use the IO::Socket::INET subclass both in this chapter and in much of the rest of this book. Future versions of the I/O library may support other addressing domains.

Other important descendants of IO::Handle include IO::File, which we discussed in Chapter 2, and IO::Pipe, which provides an object-oriented interface to Perl's pipe() call. From IO::Socket::INET descends Net::Cmd, which is the parent of a whole family of third-party modules that provide interfaces to specific command-oriented network services, including FTP and Post Office Protocol. We discuss these modules beginning in Chapter 6.

Although not directly descended from IO::Handle, other modules in the IO::* namespace include IO::Dir for object-oriented methods for reading and manipulating directories, IO::Select for testing sets of filehandles for their readiness to perform I/O, and IO::Seekable for performing random access on a disk file. We introduce IO::Select in Chapter 12, where we use it to implement network servers using I/O multiplexing.

Creating IO::Socket::INET Objects

As with other members of the IO::Handle family, you create new IO::Socket::INET objects by invoking its new() method, as in:

$sock = IO::Socket::INET->new('wuarchive.wustl.edu:daytime');

This object is then used for all I/O related to the socket. Because IO::Socket::INET descends from IO::Handle, its objects inherit all the methods for reading, writing, and managing error conditions introduced in Chapter 2. To these inherited methods, IO::Socket::INET adds socket-oriented methods such as accept(), connect(), bind(), and sockopt().

As with IO::File, once you have created an IO::Socket object you have the option of either using the object with a method call, as in:

$sock->print('Here comes the data.');

or using the object as a regular filehandle:

print $sock 'Ready or not, here it comes.';

Which syntax you use is largely a matter of preference. For performance reasons discussed at the end of this chapter, I prefer the function-oriented style whenever there is no substantive difference between the two.
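The two styles can be seen side by side in a minimal sketch. It assumes a loopback listener on a kernel-chosen port (LocalPort => 0) so that it runs without any outside network; the variable names are illustrative only.

```perl
use strict;
use warnings;
use IO::Socket;

# Loopback listener on a kernel-chosen port, so no real service is needed.
my $listen = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Listen    => 1,
) or die "Can't listen: $@\n";

my $sock = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listen->sockport,
) or die "Can't connect: $@\n";

my $session = $listen->accept or die "Can't accept: $@\n";

# Writing: method-call style, then filehandle style.
$sock->print("Here comes the data.\n");
print $sock "Ready or not, here it comes.\n";

# Reading: the same choice exists on the receiving side.
my $first  = $session->getline;   # method-call style
my $second = <$session>;          # filehandle style

print "Received: $first";
print "Received: $second";
```

Both writes travel over the same socket, and both reads draw from the same buffer, so the styles can be mixed freely on one object.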

The IO::Socket::INET->new() constructor is extremely powerful, and is in fact the most compelling reason for using the object-oriented socket interface.

$socket = IO::Socket::INET->new (@args);

The new() class method attempts to create an IO::Socket::INET object. It returns the new object or, if an error was encountered, undef. In the latter case, $! contains the system error, and $@ contains a more verbose description of the error generated by the module itself.
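A short sketch of checking both error variables follows. To provoke a failure without touching the outside network, it assumes that a loopback port which was just bound and released is closed; that trick is an assumption of this example, not something the module provides.

```perl
use strict;
use warnings;
use IO::Socket;

# Find a loopback port that is very likely closed: bind a listener to a
# kernel-chosen port, remember the number, then close the listener.
my $probe = IO::Socket::INET->new(
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
    Listen    => 1,
) or die "Can't set up probe: $@\n";
my $closed_port = $probe->sockport;
close $probe;

# Shortcut style: a single "host:port" argument.
my $sock = IO::Socket::INET->new("127.0.0.1:$closed_port");

# Copy both error variables right away, before another call changes them.
my ($sys_err, $mod_err) = ("$!", "$@");

if (defined $sock) {
    print "Unexpectedly connected\n";
} else {
    print "System error (\$!): $sys_err\n";
    print "Module error (\$@): $mod_err\n";
}
```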

IO::Socket::INET->new() accepts two styles of argument. In the simple "shortcut" style, new() accepts a single argument consisting of the name of the host to connect to, a colon, and the port number or service name. IO::Socket::INET creates a TCP socket, looks up the host and service name, constructs the correct sockaddr_in structure, and automatically attempts to connect() to the remote host. The shortcut style is very flexible. Any of these arguments is valid:

wuarchive.wustl.edu:echo
wuarchive.wustl.edu:7
128.252.120.8:echo
128.252.120.8:7

In addition to specifying the service by name or port number, you can combine the two so that IO::Socket::INET will attempt to look up the service name first, and if that isn't successful, fall back to using the hard-coded port number. The format is hostname:service(port). For instance, to connect to the wuarchive echo service, even on machines that for some reason don't have the echo service listed in the network information database, we can call:

my $echo = IO::Socket::INET->new('wuarchive.wustl.edu:echo(7)') or die "Can't connect: $!\n";
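The lookup order described above can be illustrated with a small helper. resolve_port() is hypothetical: it is not part of IO::Socket::INET and only mimics the "service name first, hard-coded port second" behavior under that assumption.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of IO::Socket::INET) that mimics the
# lookup order for a "service(port)" specification.
sub resolve_port {
    my ($spec, $proto) = @_;
    $proto ||= 'tcp';

    # Split "echo(7)" into the service part and the optional fallback port.
    my ($service, $fallback) = $spec =~ /^([^()]+)(?:\((\d+)\))?$/
        or return;

    # A purely numeric specification needs no lookup.
    return $service if $service =~ /^\d+$/;

    # Try the network information database first ...
    my $port = (getservbyname($service, $proto))[2];
    return $port if defined $port;

    # ... and fall back to the hard-coded number, if one was given.
    return $fallback;
}

my $numeric  = resolve_port('7');
my $fallback = resolve_port('no-such-service(7)');

print "numeric spec resolves to port $numeric\n";
print "unknown service falls back to port $fallback\n";
```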

The new() method can also be used to construct sockets suitable for incoming connections, UDP communications, broadcasting, and so forth. For these more general uses, new() accepts a named argument style that looks like this:

my $echo = IO::Socket::INET->new(PeerAddr => 'wuarchive.wustl.edu',
                                 PeerPort => 'echo(7)',
                                 Type     => SOCK_STREAM,
                                 Proto    => 'tcp')
    or die "Can't connect: $!\n";

Recall from Chapter 1 that the " => " symbol is accepted by Perl as a synonym for ",". The newlines between the argument pairs are for readability only. In shorter examples, we put all the name/argument pairs on a single line.

The list of arguments that you can pass to IO::Socket::INET is extensive. They are summarized in Table 5.1.

Table 5.1. Arguments to IO::Socket::INET->new()

Argument     Description                            Value
PeerAddr     Remote host address                    <hostname or address>[:<port>]
PeerHost     Synonym for PeerAddr
PeerPort     Remote port or service                 <service name or port number>
LocalAddr    Local host bind address                <hostname or address>[:<port>]
LocalHost    Synonym for LocalAddr
LocalPort    Local host bind port                   <service name or port number>
Proto        Protocol name (or number)              <protocol name or number>
Type         Socket type                            SOCK_STREAM | SOCK_DGRAM | ...
Listen       Queue size for listen                  <integer>
Reuse        Set SO_REUSEADDR before binding        <boolean>
Timeout      Timeout value                          <integer>
MultiHomed   Try all addresses on multihomed hosts  <boolean>

The PeerAddr and PeerHost arguments are synonyms that specify a host to connect to. When IO::Socket::INET is passed either of these arguments, it will attempt to connect() to the indicated host. These arguments accept a hostname, an IP address, or a combined hostname and port number in the format that we discussed earlier for the simple form of new(). If the port number is not embedded in the argument, it must be provided by PeerPort.

PeerPort indicates the port to connect to, and is used when the port number is not embedded in the hostname. The argument can be a numeric port number, a symbolic service name, or the combined form, such as "ftp(21)."

The LocalAddr, LocalHost, and LocalPort arguments are used by programs that are acting as servers and wish to accept incoming connections. LocalAddr and LocalHost are synonymous, and specify the IP address of a local network interface. LocalPort specifies a local port number. If IO::Socket::INET sees any of these arguments, it constructs a local address and attempts to bind() to it.

The network interface can be specified as an IP address in dotted-quad form, as a DNS hostname, or as a packed IP address. The port number can be given as a port number, as a service name, or using the "service(port)" combination. It is also possible to combine the local IP address with the port number, as in "127.0.0.1:http(80)." In this case, IO::Socket::INET will take the port number from LocalAddr, ignoring the LocalPort argument.

If you specify LocalPort but not LocalAddr , then IO::Socket::INET binds to the INADDR_ANY wildcard, allowing the socket to accept connections from any of the host's network interfaces. This is usually the behavior that you want.
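As a rough check of the wildcard behavior, the sketch below creates a listener with no LocalAddr and inspects the address it ended up bound to. LocalPort => 0 is an assumption of this example: it asks the kernel for any free port so that the code cannot collide with a real service.

```perl
use strict;
use warnings;
use IO::Socket;

# No LocalAddr is given, so the socket should bind to the wildcard address.
my $listen = IO::Socket::INET->new(
    Proto     => 'tcp',
    LocalPort => 0,      # assumption: let the kernel pick a free port
    Listen    => 5,
) or die "Can't listen: $@\n";

my $bound_host = $listen->sockhost;   # local address in dotted-quad form
my $bound_port = $listen->sockport;   # the port the kernel chose

print "Listening on $bound_host:$bound_port\n";
```

The wildcard address appears as 0.0.0.0 in dotted-quad form, which is how INADDR_ANY is conventionally written.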

Stream-oriented programs that wish to accept incoming connections should also specify the Listen and possibly Reuse arguments. Listen gives the size of the listen queue. If the argument is present, IO::Socket will call listen() after creating the new socket, using the argument as its queue length. This argument is mandatory if you wish to call accept() later.

Reuse, if a true value, tells IO::Socket::INET to set the SO_REUSEADDR option on the new socket. This is useful for connection-oriented servers that need to be restarted from time to time. Without this option, the server has to wait a few minutes between exiting and restarting in order to avoid "address in use" errors during the call to bind().
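A minimal sketch of a restartable listener follows. It binds to the loopback interface on a kernel-chosen port (both assumptions made so the example is safe to run anywhere) and then reads the option back to confirm that it was set.

```perl
use strict;
use warnings;
use IO::Socket;

my $listen = IO::Socket::INET->new(
    Proto     => 'tcp',
    LocalAddr => '127.0.0.1',
    LocalPort => 0,          # assumption: any free port will do here
    Listen    => 5,
    Reuse     => 1,          # set SO_REUSEADDR before bind()
) or die "Can't listen: $@\n";

# Read the option back; a nonzero value means it is switched on.
my $reuse = $listen->sockopt(SO_REUSEADDR);

print "SO_REUSEADDR is ", ($reuse ? "on" : "off"), "\n";
```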

Proto and Type specify the protocol and socket type. The protocol may be symbolic (e.g., "tcp") or numeric, using the value returned by getprotobyname(). Type must be one of the SOCK_* constants, such as SOCK_STREAM. If one or more of these options is not provided, IO::Socket::INET guesses at the correct values from context. For example, if Type is absent, IO::Socket::INET infers the correct type from the protocol. If Proto is absent but a service name was given for the port, then IO::Socket::INET attempts to infer the correct protocol to use from the service name. As a last resort, IO::Socket::INET defaults to "tcp."
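The inference can be observed with the socktype() method, which reports the type a socket ended up with. This is a small sketch; the three sockets are never connected, and the loopback LocalAddr on the last one is only there so that new() has something to configure.

```perl
use strict;
use warnings;
use IO::Socket;

# Type omitted: inferred from the protocol.
my $udp = IO::Socket::INET->new(Proto => 'udp')
    or die "Can't create UDP socket: $@\n";
my $tcp = IO::Socket::INET->new(Proto => 'tcp')
    or die "Can't create TCP socket: $@\n";

# Proto and Type both omitted: the module falls back to TCP.
my $default = IO::Socket::INET->new(LocalAddr => '127.0.0.1')
    or die "Can't create default socket: $@\n";

for my $pair ([udp => $udp], [tcp => $tcp], [default => $default]) {
    my ($label, $sock) = @$pair;
    my $kind = $sock->socktype == SOCK_STREAM ? 'SOCK_STREAM'
             : $sock->socktype == SOCK_DGRAM  ? 'SOCK_DGRAM'
             :                                  'other';
    print "$label: $kind\n";
}
```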

Timeout sets a timeout value, in seconds, for use with certain operations. Currently, timeouts are used for the internal call to connect() and in the accept() method. This can be handy to prevent a client program from hanging indefinitely if the remote host is unreachable.
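The accept() side of the timeout is easy to try locally. In this sketch nobody ever connects to the loopback listener, so accept() should give up after roughly one second; the port and the one-second value are arbitrary choices for the example.

```perl
use strict;
use warnings;
use IO::Socket;

my $listen = IO::Socket::INET->new(
    Proto     => 'tcp',
    LocalAddr => '127.0.0.1',
    LocalPort => 0,      # assumption: any free port
    Listen    => 5,
    Timeout   => 1,      # seconds; also applies to accept()
) or die "Can't listen: $@\n";

# No client connects, so accept() returns undef once the timeout expires.
my $session = $listen->accept;
my $reason  = $@;

if (defined $session) {
    print "Got a connection\n";
} else {
    print "accept() gave up: $reason\n";
}
```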

The MultiHomed option is useful in the uncommon case of a TCP client that wants to connect to a host with multiple IP addresses and doesn't know which IP address to use. If this argument is set to a true value, the new() method uses gethostbyname() to look up all the IP addresses for the hostname specified by PeerAddr. It then attempts a connection to each of the host's IP addresses in turn until one succeeds.

To summarize, TCP clients that wish to make outgoing connections should call new() with a Proto argument of tcp, and either a PeerAddr with an appended port number, or a PeerAddr/PeerPort pair. For example:

my $sock = IO::Socket::INET->new(Proto => 'tcp', PeerAddr => 'www.yahoo.com', PeerPort => 'http(80)');

TCP servers that wish to accept incoming connections should call new() with a Proto of "tcp", a LocalPort argument indicating the port they wish to bind to, and a Listen argument indicating the desired listen queue length:

my $listen = IO::Socket::INET->new(Proto => 'tcp', LocalPort => 2007, Listen => 128);

As we will discuss in Chapter 19, UDP applications need provide only a Proto argument of "udp" or a Type argument of SOCK_DGRAM. The idiom is the same for both clients and servers:

my $udp = IO::Socket::INET->new(Proto => 'udp');

IO::Socket Object Methods

Once a socket is created, you can use it as the target for all the standard I/O functions, including print(), read(), <>, sysread(), and so forth. The object-oriented method calls discussed in Chapter 2 in the context of IO::File are also available. In addition, IO::Socket::INET adds the following socket-specific methods to its objects:

$connected_socket = $listen_socket->accept()

($connected_socket,$remote_addr) = $listen_socket->accept()

The accept() method performs the same task as the like-named call in the function-oriented API. Valid only when called on a listening socket object, accept() retrieves the next incoming connection from the queue, and returns a connected session socket that can be used to communicate with the remote host. The new socket inherits all the attributes of its parent, except that it is connected.

When called in a scalar context, accept() returns the connected socket. When called in a list context, accept() returns a two-element list, the first of which is the connected socket, and the second of which is the packed address of the remote host. You can also recover this address at a later time by calling the connected socket's peername() method.
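The two-element form might be used as in the following sketch, where a client and a listener live in the same process on the loopback interface (an arrangement chosen only so the example is self-contained).

```perl
use strict;
use warnings;
use IO::Socket;

my $listen = IO::Socket::INET->new(
    Proto     => 'tcp',
    LocalAddr => '127.0.0.1',
    LocalPort => 0,      # assumption: any free port
    Listen    => 5,
) or die "Can't listen: $@\n";

# The connection completes in the listen queue before accept() is called.
my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listen->sockport,
) or die "Can't connect: $@\n";

# Two-element return: session socket plus the packed address of the peer.
my ($session, $remote_addr) = $listen->accept
    or die "Can't accept: $@\n";

# Unpack the address into a port number and a dotted-quad IP address.
my ($peer_port, $packed_ip) = sockaddr_in($remote_addr);
my $peer_ip = inet_ntoa($packed_ip);

print "Connection from $peer_ip:$peer_port\n";
```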

$return_val = $sock->connect($dest_addr)

$return_val = $sock->bind($my_addr)

$return_val = $sock->listen($max_queue)

These three TCP-related methods are rarely used because they are usually called automatically by new() . However, if you wish to invoke them manually, you can do so by creating a new TCP socket without providing either a PeerAddr or a Listen argument:

$sock = IO::Socket::INET->new(Proto=>'tcp');
$dest_addr = sockaddr_in(...); # etc.
$sock->connect($dest_addr);
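A fuller sketch of the manual route is shown below, using all three methods with packed addresses. The loopback address and the kernel-chosen port (port 0) are assumptions made so that the example runs in isolation. Note that sockaddr_in() packs only in scalar context, so its result is assigned to a scalar before being passed on.

```perl
use strict;
use warnings;
use IO::Socket;

# Server side: create a bare TCP socket, then bind() and listen() by hand.
my $server = IO::Socket::INET->new(Proto => 'tcp')
    or die "Can't create server socket: $@\n";
my $my_addr = sockaddr_in(0, inet_aton('127.0.0.1'));   # port 0: any free port
$server->bind($my_addr) or die "Can't bind: $!\n";
$server->listen(5)      or die "Can't listen: $!\n";

# Client side: create a bare TCP socket, then connect() by hand.
my $sock = IO::Socket::INET->new(Proto => 'tcp')
    or die "Can't create client socket: $@\n";
my $before = $sock->connected ? 1 : 0;

my $dest_addr = sockaddr_in($server->sockport, inet_aton('127.0.0.1'));
$sock->connect($dest_addr) or die "Can't connect: $!\n";
my $after = $sock->connected ? 1 : 0;

print "connected before: $before, after: $after\n";
```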

$return_val = $sock->connect($port, $host)

$return_val = $sock->bind($port, $host)

For your convenience, connect() and bind() both have alternative two-argument forms that take unpacked port and host addresses rather than a packed address. The host address can be given in dotted-IP form or as a symbolic hostname.

$return_val = $socket->shutdown($how)

As in the function-oriented call, shutdown() is a more forceful way of closing a socket. It will close the socket even if there are open copies in forked children. $how controls which half of the bidirectional socket will be closed, using the codes shown in Table 3.1.
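One common use of a half-close is to signal "end of request" while keeping the socket open for the reply. The sketch below assumes a $how value of 1, which closes the writing half, and runs both ends in one process over the loopback interface.

```perl
use strict;
use warnings;
use IO::Socket;

my $listen = IO::Socket::INET->new(
    Proto     => 'tcp',
    LocalAddr => '127.0.0.1',
    LocalPort => 0,      # assumption: any free port
    Listen    => 5,
) or die "Can't listen: $@\n";

my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listen->sockport,
) or die "Can't connect: $@\n";

my $session = $listen->accept or die "Can't accept: $@\n";

# The client sends its request, then closes only the writing half.
$client->print("line one\nline two\n");
$client->shutdown(1);

# The server reads until end of file, which the half-close produces.
my @request = <$session>;
$session->print("got " . scalar(@request) . " lines\n");
close $session;

# The client's reading half is still open, so the reply gets through.
my $reply = <$client>;
print "Server replied: $reply";
```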

$my_addr = $sock->sockname()

$her_addr = $sock->peername()

The sockname() and peername() methods are simple wrappers around their function-oriented equivalents. As with the built-in functions, they return packed socket addresses that must be unpacked using sockaddr_in().

$result = $sock->sockport()

$result = $sock->peerport()

$result = $sock->sockaddr()

$result = $sock->peeraddr()

These four methods are convenience functions that unpack the values returned by sockname() and peername(). sockport() and peerport() return the port numbers of the local and remote endpoints of the socket, respectively. sockaddr() and peeraddr() return the IP address of the local and remote endpoints of the socket as binary structures suitable for passing to gethostbyaddr(). To convert the result into dotted-quad form, you still need to invoke inet_ntoa().

$my_name = $sock->sockhost()

$her_name = $sock->peerhost()

These methods go one step further, and return the local and remote IP addresses in full dotted-quad form ("aa.bb.cc.dd"). If you wish to recover the DNS name of the peer, falling back to the dotted-quad form in case of a DNS failure, here is the idiom:

$peer = gethostbyaddr($sock->peeraddr,AF_INET) || $sock->peerhost;
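The relationship between these layers of convenience can be checked with a short sketch that unpacks peername() by hand and compares the result with peerport(), peeraddr(), and peerhost(). It deliberately skips the DNS lookup so that it has no dependency on a resolver, and it uses a loopback connection as a stand-in for a real peer.

```perl
use strict;
use warnings;
use IO::Socket;

my $listen = IO::Socket::INET->new(
    Proto     => 'tcp',
    LocalAddr => '127.0.0.1',
    LocalPort => 0,      # assumption: any free port
    Listen    => 5,
) or die "Can't listen: $@\n";

my $client = IO::Socket::INET->new(
    PeerAddr => '127.0.0.1',
    PeerPort => $listen->sockport,
) or die "Can't connect: $@\n";

# The long way: unpack the packed address returned by peername().
my ($port_by_hand, $packed_ip) = sockaddr_in($client->peername);
my $ip_by_hand = inet_ntoa($packed_ip);

# The shortcuts: each method does part of that unpacking for you.
my $peer_port = $client->peerport;              # port number
my $peer_ip   = inet_ntoa($client->peeraddr);   # packed IP, converted here
my $peer_host = $client->peerhost;              # already dotted-quad

print "By hand:   $ip_by_hand:$port_by_hand\n";
print "Shortcuts: $peer_host:$peer_port\n";
```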

$result = $sock->connected()

The connected() method returns true if the socket is connected to a remote host, false otherwise. It works by calling peername().

$protocol = $sock->protocol()

$type = $sock->socktype()

$domain = $sock->sockdomain()

These three methods return basic information about the socket, including its numeric protocol, its type, and its domain. These methods can be used only to get the attributes of a socket object. They can't be used to change the nature of an already-created object.

$value = $sock->sockopt($option [,$value])

The sockopt() method can be used to get and/or set a socket option. It is a front end for both getsockopt() and setsockopt(). Called with a single numeric argument, sockopt() retrieves the current value of the option. Called with an option and a new value, sockopt() sets the option to the indicated value, and returns a result code indicating success or failure. There is no need to specify an option level, as you do with getsockopt(), because the SOL_SOCKET argument is assumed.

Unlike the built-in getsockopt(), the object method automatically converts the packed argument returned by the underlying system call into an integer, so you do not need to unpack the option values returned by sockopt(). As we discussed earlier, the most frequent exception to this is the SO_LINGER option, which operates on an 8-byte linger structure as its argument.
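The difference from the built-in can be seen in a brief sketch that sets an option through the method and then reads it back both ways. SO_KEEPALIVE is used purely as a convenient integer-valued option for illustration.

```perl
use strict;
use warnings;
use IO::Socket;

my $sock = IO::Socket::INET->new(Proto => 'tcp')
    or die "Can't create socket: $@\n";

# Method form: no option level, and the value comes back as an integer.
$sock->sockopt(SO_KEEPALIVE, 1)
    or die "Can't set SO_KEEPALIVE: $!\n";
my $via_method = $sock->sockopt(SO_KEEPALIVE);

# Built-in form: the level is explicit and the result must be unpacked.
my $packed = getsockopt($sock, SOL_SOCKET, SO_KEEPALIVE);
defined $packed or die "getsockopt failed: $!\n";
my $via_builtin = unpack('i', $packed);

print "method: $via_method, built-in: $via_builtin\n";
```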

$val = $sock->timeout([$timeout])

timeout() gets or sets the timeout value that IO::Socket uses for its connect() and accept() methods. Called with a numeric argument, it sets the timeout value and returns the previous setting. Otherwise, it returns the current setting. The timeout value is not currently used for calls that send or receive data. The eval{} trick, described in Chapter 2, can be used to achieve that result.
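A small sketch of reading and changing the value follows. The numbers are arbitrary, and the socket is never connected, so no timeout actually fires here.

```perl
use strict;
use warnings;
use IO::Socket;

# The Timeout argument to new() supplies the initial value.
my $sock = IO::Socket::INET->new(Proto => 'tcp', Timeout => 5)
    or die "Can't create socket: $@\n";

my $initial = $sock->timeout;    # read the current setting
$sock->timeout(10);              # change it
my $current = $sock->timeout;    # read it again

print "timeout was $initial seconds, now $current seconds\n";
```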

$bytes = $sock->send($data [, $flags, $destination])

$address = $sock->recv($buffer, $length [, $flags])

These are front ends for the send() and recv() functions, and are discussed in more detail when we discuss UDP communications in Chapter 19.
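As a brief preview, the sketch below exchanges one datagram in each direction between two UDP sockets on the loopback interface. The message text, buffer size, and variable names are illustrative assumptions.

```perl
use strict;
use warnings;
use IO::Socket;

# "Server": a UDP socket bound to a kernel-chosen loopback port.
my $server = IO::Socket::INET->new(
    Proto     => 'udp',
    LocalAddr => '127.0.0.1',
) or die "Can't create server socket: $@\n";

# "Client": a UDP socket with a default destination.
my $client = IO::Socket::INET->new(
    Proto    => 'udp',
    PeerAddr => '127.0.0.1',
    PeerPort => $server->sockport,
) or die "Can't create client socket: $@\n";

my $sent = $client->send("ping");
defined $sent or die "send failed: $!\n";

# recv() fills the buffer and returns the packed address of the sender.
my $buffer;
my $sender = $server->recv($buffer, 1024);
defined $sender or die "recv failed: $!\n";
my ($sender_port, $sender_ip) = sockaddr_in($sender);

# Reply with the three-argument form of send(), naming the destination.
$server->send("pong: $buffer", 0, $sender);

my $reply;
$client->recv($reply, 1024);

print "Server got '$buffer' from ", inet_ntoa($sender_ip), ":$sender_port\n";
print "Client got '$reply'\n";
```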

An interesting side effect of the timeout implementation is that setting the IO::Socket::INET timeout makes the connect() and accept() calls interruptible by signals. This allows a signal handler to gracefully interrupt a program that is hung waiting on a connect() or accept(). We will see an example of this in the next section.


   