Internet Information Services (IIS) and the Internet Protocols

IIS 6 is an integrated set of Internet services available in Microsoft Windows Server 2003 that provides Web publishing, file transfer, network news, and mail services. IIS 6 includes servers for Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), Network News Transfer Protocol (NNTP), and Simple Mail Transfer Protocol (SMTP). In this chapter, we look at each protocol and the functionality it provides.

HTTP

HTTP is an application-level protocol used for communications and data transmission between client computers and HTTP servers. Client requests are frequently submitted from browser software such as Microsoft Internet Explorer and Netscape Navigator to HTTP servers such as IIS. HTTP is the protocol underlying today's World Wide Web.

Early versions of HTTP, referred to as HTTP/0.9, have existed since 1990. Between 1990 and 1996, HTTP continued to evolve in an uncontrolled manner. It was common for different vendors to modify the protocol to suit their own needs. RFC 1945, which describes HTTP/1.0, was the first formal HTTP specification. To some degree, a major objective of RFC 1945 was to summarize and unify the implementation details. As a result, the industry generally expected it to be replaced rather quickly. For example, HTTP/1.0 lacked sufficient definition for caching, hierarchical proxies, virtual hosts, and persistent connections. A more stringent version, HTTP/1.1, was defined in RFC 2068, which has been updated in RFC 2616. IIS 6 contains an HTTP/1.1-compliant server.

Before analyzing HTTP in depth, you must understand certain basic terms as they relate to the protocol:

HTTP in Operation

HTTP is a request/response protocol. A client wishing to retrieve a resource from an HTTP server issues a request message containing a request method, URI, protocol version ID, and resource-specific information. The following is a trace of a simple request and response between an HTTP client and server. The full content of Capture 21-01 is provided on the companion CD-ROM.

12 1.838983 0050564050E1 0050564050EA HTTP GET Request (from client using port 4283) 10.10.1.68 10.10.1.74
+ Frame: Base frame properties
+ ETHERNET: ETYPE = 0x0800 : Protocol = IP: DOD Internet Protocol
+ IP: ID = 0x9BAB; Proto = TCP; Len: 314
+ TCP: .AP..., len: 274, seq:1182807736-1182808010, ack:3591543841, win:17520, src: 4283 dst: 80
HTTP: GET Request (from client using port 4283)
  HTTP: Request Method = GET
  HTTP: Uniform Resource Identifier = /rebecca.htm
  HTTP: Protocol Version = HTTP/1.1
  HTTP: Accept = image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
  HTTP: Accept-Language = en-us
  HTTP: Accept-Encoding = gzip, deflate
  HTTP: User-Agent = Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3
  HTTP: Host = kapoho10
  HTTP: Connection = Keep-Alive

13 1.848998 0050564050EA 0050564050E1 HTTP Response (to client using port 4283) 10.10.1.74 10.10.1.68
+ Frame: Base frame properties
+ ETHERNET: ETYPE = 0x0800 : Protocol = IP: DOD Internet Protocol
+ IP: ID = 0xA119; Proto = TCP; Len: 441
+ TCP: .AP..., len: 401, seq:3591543841-3591544242, ack:1182808010, win:17246, src: 80 dst: 4283
HTTP: Response (to client using port 4283)
  HTTP: Protocol Version = HTTP/1.1
  HTTP: Status Code = OK
  HTTP: Reason = OK
  HTTP: Content-Length = 176
  HTTP: Content-Type = text/html
  HTTP: Last-Modified = Tue, 27 Aug 2002 22:05:16 GMT
+ HTTP: Undocumented Header = Accept-Ranges: bytes
+ HTTP: Undocumented Header = ETag: "0b23ed3154ec21:68a"
  HTTP: Server = Microsoft-IIS/6.0
  HTTP: Date = Tue, 27 Aug 2002 22:06:49 GMT
  HTTP: Data: Number of data bytes remaining = 176 (0x00B0)

In this trace, a client issues a GET method request to the server, kapoho10, with the intention of getting the page rebecca.htm. The server then responds with the page (contained in the data field). The preceding summary omits the necessary name resolution and TCP connection overhead, although Capture 21-01 (found in the Captures folder on the companion CD-ROM) includes these exchanges.
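The request half of this exchange can be assembled by hand. The following is a minimal Python sketch, not a full client: the helper name build_get_request is hypothetical, and only a few of the headers shown in the trace are included.

```python
def build_get_request(host: str, path: str, keep_alive: bool = True) -> bytes:
    """Assemble a minimal HTTP/1.1 GET request (illustrative helper only)."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, URI, protocol version
        f"Host: {host}",          # names the server the request is meant for
        "Accept: */*",
        f"Connection: {'Keep-Alive' if keep_alive else 'close'}",
    ]
    # Every line ends with CRLF; an empty line terminates the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_get_request("kapoho10", "/rebecca.htm")
print(request.decode("ascii"))
```

Sending these bytes over a TCP connection to port 80 of the server would produce a request comparable to frame 12 in the trace, minus the browser-specific headers.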

An HTTP request can be issued to the server, a tunneling agent, a proxy, or a gateway. Each participant in the transaction can simultaneously process multiple HTTP connections, which are typically issued over TCP port 80, although the participant applications can specify other ports.

One restriction of most implementations of HTTP/1.0 was that a separate connection had to be established for each request. With HTTP/1.1, multiple transactions can be processed over a single connection, which is kept open. This is called a keep-alive, which can improve protocol performance and reduce server overhead. You can see the keep-alive request in the preceding trace. HTTP provides no mechanisms for guaranteed delivery of messages. HTTP relies on TCP to provide this functionality.

After receiving a request message, an HTTP server responds with a message indicating success or error, as well as protocol-version information and possible resource-specific data, including the data itself. In the preceding example, the server returns the page rebecca.htm to the client.

The protocol version is sent in the format "HTTP/x.y," where x is the "major" version identifier and y is the "minor" version identifier (in the preceding example, this is HTTP/1.1). By sending version numbers as part of the HTTP message, clients and servers negotiate communication format, with both client and server sending the highest protocol version they both understand. With the exception of tunnels, which do not maintain any awareness of the contents of HTTP data, each party in the process can potentially cache message content to facilitate faster retrieval for future requests.
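The version token is easy to take apart in code. The sketch below is a simplified illustration: parse_http_version is a hypothetical helper, and choosing the lower of two versions is only an approximation of the negotiation described above.

```python
import re


def parse_http_version(token: str) -> tuple[int, int]:
    """Split 'HTTP/x.y' into (major, minor) integers (illustrative helper)."""
    match = re.fullmatch(r"HTTP/(\d+)\.(\d+)", token)
    if match is None:
        raise ValueError(f"not an HTTP version token: {token!r}")
    return int(match.group(1)), int(match.group(2))


# Tuples compare field by field: the major number first, then the minor number.
client = parse_http_version("HTTP/1.1")
server = parse_http_version("HTTP/1.0")
common = min(client, server)
print(f"HTTP/{common[0]}.{common[1]}")
```

Comparing the two numbers as integers rather than as a decimal fraction matters once a field has more than one digit: a minor version of 13 is higher than a minor version of 4.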

A proxy is not permitted to send a request message in a protocol version higher than the proxy itself supports. In addition, it is required to do one of three things: downgrade any incoming request message with a higher version than its own, respond to the client with an error message, or switch to tunnel mode. Proxies are also required to upgrade any lower protocol version requests to match their own, but return the response to the client in the same major version as the original request.

URIs

A URI is simply a standard format for defining a retrievable resource. A familiar term, the URL, is actually a subset of the URI, as is the less familiar Uniform Resource Name (URN). A retrievable item can be requested by referencing the item's location, as in the case of a URL, or by referencing the item's distinguished name, as in the case of a URN. The HTTP protocol does not limit the length of URIs, and servers are required to be capable of receiving a URI with a length that is at least as long as that of any resource they serve.

URI Syntax

URIs can be absolute or relative to a base URI. Several RFCs define the syntax of URIs for particular schemes, and RFC 2396 defines the generic syntax. An absolute URI lists the entire scheme (this is usually the name of the protocol used in this request) and path that will be used to request the resource, whereas relative URIs build on a previously established base location and scheme. A typical URI construction is as follows:

scheme://host:port/path?query

It might appear as:

http://search.msn.com/results.asp?q=Thomas+Lee+Joseph+Davies+TCP%2FIP

In this example, the scheme is http:, indicating that the request will be passed over HTTP. The slashes are reserved characters that serve as separators between scheme and scheme-specific details. The host name given here is search.msn.com, and because no port is specified, the request is issued over the default TCP port 80 to the server at search.msn.com. If a port were specified, the TCP connection would be opened over that port. If for some reason the target server is not listening for requests over this port number, an error results. The absolute path given here is to the search facility, with the rest of the URI being parameters to send to this search engine. The presence of a question mark following this path indicates that a query is being performed, and the remainder of the URI is query parameter information (in this case, the keywords that should be searched for).
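The same breakdown can be reproduced with the Python standard library. This is a small sketch using urllib.parse; the fallback to port 80 is written out explicitly because the parser reports no port when the URI omits one.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://search.msn.com/results.asp?q=Thomas+Lee+Joseph+Davies+TCP%2FIP"
parts = urlsplit(url)

# No port appears in the URI, so fall back to the HTTP default of 80.
port = parts.port if parts.port is not None else 80

# parse_qs decodes '+' as a space and '%2F' as '/'.
query = parse_qs(parts.query)

print(parts.scheme, parts.hostname, port, parts.path)
print(query["q"][0])
```

The decoded query value shows why the slash in "TCP/IP" had to be escaped as %2F: an unescaped slash is a reserved character in a URI.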

HTTP Messages

As noted earlier, HTTP clients and servers use HTTP messages as the basic form of communication. These messages are designed to support a heterogeneous collection of servers and follow a well-defined syntax.

Message Types

HTTP request and response messages use the generic message format defined in RFC 822 to transfer data and resources, called entities, from client to server, and vice versa. These messages consist of a start line that might or might not be followed by header fields, which are often simply referred to as headers. An empty line (a line containing only the end-of-line marker, carriage return line feed [CRLF]) indicates the end of the headers, and message body data might follow. HTTP/1.1 dictates that no CRLFs be sent beyond those the syntax requires, to eliminate unnecessary parsing on the part of either the server or client, although flawed implementations of HTTP often issue extra CRLFs in succession. If a CRLF is received where none is expected, it is ignored. Syntax of a typical HTTP message is the following:

Request-Line | Response-Line
*((general-header | request-header | response-header | entity-header) CRLF)
CRLF
[message-body]

  More Info

The generic message format used by HTTP request and response messages is defined in RFC 822, which can be found in the Rfc folder on the companion CD-ROM.

Message Headers

HTTP header fields are used to define parameters regarding the message being transmitted, whether the message is general, request, response, or entity information. Header fields consist simply of the header name followed by a colon (:) and the header value, which can be preceded by any amount of linear white space (LWS). General header fields apply to both request and response messages but do not relate to the entity being transferred. Request and response header fields are specific to their respective message types, and entity header fields provide additional information about the resource being passed in the message.
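A message in this format can be taken apart with a few string operations. The following Python sketch is deliberately simplified: parse_message is a hypothetical helper, and it does not handle headers folded across several lines or headers that appear more than once.

```python
def parse_message(raw: bytes):
    """Split a raw HTTP message into (start_line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive; white space around the value is dropped.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
start_line, headers, body = parse_message(raw)
print(start_line, headers, body)
```

Storing the header names in lowercase is one way to respect their case-insensitivity; the sample message here is invented for the illustration and is not taken from Capture 21-01.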

Request Messages

Client programs issue request messages both to establish communication parameters and to initiate resource transfer and manipulation. The following example shows the headers of the message generated by a client requesting a resource from a server (as demonstrated in Capture 21-01):

HTTP: GET Request (from client using port 4283)
  HTTP: Request Method = GET
  HTTP: Uniform Resource Identifier = /rebecca.htm
  HTTP: Protocol Version = HTTP/1.1
  HTTP: Accept = image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*
  HTTP: Accept-Language = en-us
  HTTP: Accept-Encoding = gzip, deflate
  HTTP: User-Agent = Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3
  HTTP: Host = kapoho10
  HTTP: Connection = Keep-Alive

The first three fields of this output make up the request line, which is a single line in the HTTP message, with spaces separating the individual elements, and terminated with a CRLF sequence. This line identifies the method token, or action to perform on the requested resource (a GET, in this case); the URI of the requested resource (/rebecca.htm); and the protocol version number (HTTP/1.1). The extra header fields inform the server what types of objects the client will accept, the client's language, the user agent in use, and whether to keep connections alive between client and server. The client port number listed in this request, 4283, is unrelated to the server TCP port. Although the client made its HTTP request from TCP port 4283, the request was issued to TCP port 80. This can be seen in the TCP header of the HTTP GET message, as follows:

TCP: .AP..., len: 274, seq:1182807736-1182808010, ack:3591543841, win:17520, src: 4283 dst: 80
  TCP: Source Port = 0x10BB
  TCP: Destination Port = Hypertext Transfer Protocol
  TCP: Sequence Number = 1182807736 (0x468036B8)
  TCP: Acknowledgement Number = 3591543841 (0xD6129C21)
  TCP: Data Offset = 20 (0x14)
  TCP: Reserved = 0 (0x0000)
+ TCP: Flags = 0x18 : .AP...
  TCP: Window = 17520 (0x4470)
  TCP: Checksum = 0xDB66
  TCP: Urgent Pointer = 0 (0x0)
  TCP: Data: Number of data bytes remaining = 274 (0x0112)
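Several values in this TCP header can be cross-checked with simple arithmetic. The Python sketch below is an illustration only; decode_flags is a hypothetical helper, and the bit values are the standard TCP flag positions.

```python
# TCP flag bits, from least significant upward.
TCP_FLAGS = {0x01: "FIN", 0x02: "SYN", 0x04: "RST", 0x08: "PSH", 0x10: "ACK", 0x20: "URG"}


def decode_flags(value: int) -> list[str]:
    """Return the names of the flags set in a TCP flags value."""
    return [name for bit, name in TCP_FLAGS.items() if value & bit]


source_port = int("10BB", 16)    # Source Port = 0x10BB
flags = decode_flags(0x18)       # Flags = 0x18, shown in the trace as .AP...
next_seq = 1182807736 + 274      # first sequence number plus the data bytes carried

print(source_port, flags, next_seq)
```

The hexadecimal source port decodes to 4283, the flags value has the push and acknowledgment bits set, and the sequence number plus the 274 data bytes gives 1182808010, which is the acknowledgment number the server returns in frame 13.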

Request Message Methods

Methods are actions that a request message can ask to be performed at the server or applied to a resource entity. Table 21-1 lists common HTTP/1.1 methods and the actions that they request.

Table 21-1: Common HTTP/1.1 Method Codes

Method | Description
--- | ---
OPTIONS | Requests information regarding the server capabilities or what actions can be performed on a specified resource. This is an information-gathering method only; no action is performed on the resource. Results of this method cannot be cached. If the request URI is an asterisk (*), the client is testing the capabilities of the server to facilitate further action. If the request URI is anything other than an asterisk, the client can retrieve information only about the capabilities of the resource specified.
GET | Requests the HTTP server to return some object. GET methods can also be conditional, where the retrieval is performed only if specified conditions are completely or partially met. For example, the client can specify that it has already received and cached some portion of the requested data, and needs to retrieve the portions that it does not yet have. Web proxy servers, such as Microsoft ISA Server, use partial GETs to keep their Web cache up to date while minimizing data retrieved from a server.
HEAD | The HEAD method is identical to the GET method, except that the server does not return the actual resource in its response, but merely information about the resource. This method is often used to test validity of hypertext links so as not to request a resource that might not actually be available.
POST | A client uses a POST method to send data to the server. The POST command is often used to supply a server with information that a user has input into an HTML form. The POST method uses a specific file on the server as part of the URI; the file named is generally a server-side script or executable that is capable of processing the data the client is sending.
PUT | The PUT method is used to create an entity on the HTTP server under the requested URI. Generally, PUT is used as a simple method of uploading a file to the server. PUT might create new files, or might replace files that already exist and have the same name as the entity named in the PUT request.
DELETE | The DELETE method requests that a specific resource be deleted from the server. Even though the server might respond with a message indicating success, the client does not know if the deletion actually occurred, as it might be halted by human intervention.
TRACE | The TRACE method is used to request a loopback of the request message. This is used primarily as a troubleshooting tool to determine which data is actually being received at the other end of the request. No entities are passed in a TRACE method.
CONNECT | CONNECT is used for proxies that can dynamically become tunnels, as is required by Secure Sockets Layer (SSL) tunneling.

Safe Methods

Methods can be said to be safe, meaning that if they are properly implemented, they should cause no ill effects to the server. For example, the GET and HEAD methods are considered safe methods because they retrieve only data from the server and do not actually manipulate that data while it resides on the server.
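A client or proxy can use this distinction to decide what it may do automatically, such as repeating a request. The following Python sketch assumes only the two methods named above; is_safe is a hypothetical helper.

```python
SAFE_METHODS = frozenset({"GET", "HEAD"})   # the two methods the text names as safe


def is_safe(method: str) -> bool:
    """Method tokens are case-sensitive in HTTP/1.1, so compare without changing case."""
    return method in SAFE_METHODS


for method in ("GET", "HEAD", "POST", "PUT", "DELETE"):
    print(method, "safe" if is_safe(method) else "may change server state")
```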

Response Messages

After a server receives a client request message, it returns an HTTP response message, similar in construction to the client request, in the following format:

Status line *(( general header | response header | entity header ) CRLF) CRLF [message body]

Each of these components is analyzed in further detail in this section.

Response Message Status Lines

When a server receives a request message from a client, it evaluates the request method and might or might not perform an action on the requested resource as a result. Regardless of whether or not the action is performed, the server must respond to the requesting client. The response is sent in the form of a status line with the following syntax:

Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF

Status codes are organized into classes that identify the generic response type. Table 21-2 lists the classes and meanings of status codes defined in HTTP/1.1.

Table 21-2: HTTP/1.1 Status Code Classes and Meanings

Number | Class | Indication
--- | --- | ---
1xx | Informational | This indicates a provisional response and returns only a status line indicating status and optional headers. Essentially, this is the server's method of responding with an acknowledgment message.
2xx | Successful | The server understands and accepts the client's request.
3xx | Redirection | The client needs to take further action to retrieve the requested resource. If the method used in the subsequent request message is GET or HEAD, the redirect can occur without any user intervention.
4xx | Client Error | The server believes that the client has performed an error. The server should provide an explanation of the error and indicate whether this is a permanent or temporary error. In addition, the server should wait for TCP acknowledgment of client receipt of the error message so it does not close the connection prematurely.
5xx | Server Error | The server is incapable of performing the request, the client is not allowed access to the resource, or a server error has occurred. The server should include an explanation of the error and indicate whether it is temporary or permanent.

Each class of status response has codes that the server returns to the client indicating specific information. Table 21-3 summarizes the individual codes that a server can return to a requesting client, as defined in RFC 2616. The messages listed are merely recommendations and can be customized without affecting the protocol.

Table 21-3: HTTP/1.1 Status Codes

Code | Message | Meaning
--- | --- | ---
100 | Continue | The client should continue sending the remainder of the request; if the entire request has already been sent, the client ignores the message on receipt.
101 | Switching Protocols | The client has requested to switch to another application protocol by issuing an Upgrade message header.
200 | OK | The client's request has been successfully processed; remaining information returned varies according to the type of client request.
201 | Created | A new resource has been successfully created at the client's request.
202 | Accepted | The client's request has been accepted, but not yet processed. The server should indicate to the client when the request might be fulfilled or provide a pointer to a status monitor for the request.
203 | Non-Authoritative Information | The server would normally issue a 200 (OK) response, but the server is not authoritative for the information returned in the message. This indicates that the information was gathered from another source, and therefore this server cannot verify it.
204 | No Content | The server has fulfilled the client request, but does not need to return a new object. The server can, however, return information that causes the user interface to be updated.
205 | Reset Content | The server has fulfilled the client request and is instructing the user agent (client software) to reset the current document view. This is commonly used to allow a user to input form data, then clear the form so that more data can be entered.
206 | Partial Content | The client has issued a partial GET request, and the server has fulfilled that request. The server must indicate what portion of the requested data it has fulfilled for this particular GET.
300 | Multiple Choices | The requested resource exists in multiple locations, and the server is providing a list of these locations to the client. The server can indicate preference for a specific location, but the client chooses what it deems the appropriate location.
301 | Moved Permanently | The requested resource has been moved permanently and future requests for the resource should be directed to the location the server returns.
302 | Found | The requested resource exists elsewhere, but its location might change and the client should therefore continue to use the same request URI. Most browsers do not implement this correctly and treat a 302 response in the same manner as a 301 response.
303 | See Other | The response to the request exists under a different URI and should be retrieved using a GET method. The 303 response should not be cached, although the response received from the redirection location can potentially be cached.
304 | Not Modified | The client has performed a conditional GET, but the document has not been modified.
305 | Use Proxy | This response must be issued only by the origin server, and indicates that the requested resource must be accessed using the proxy provided in the response.
306 | Unused | This status code was used in previous implementations of HTTP, has no function in HTTP/1.1, and is reserved.
307 | Temporary Redirect | The requested resource temporarily exists under a different URI, which should be provided to the client as a hyperlink to the new URI (for pre-HTTP/1.1 clients that do not understand the 307 code). Automatic redirection without user input should occur only if the client issued a GET or HEAD request that triggered this message.
400 | Bad Request | The client issued a malformed request that the server could not interpret and one that should not be issued again without modification.
401 | Unauthorized | The requested resource requires authentication; the server must issue a WWW-Authenticate challenge. If the client has already responded to a challenge, this message indicates that the credentials presented do not have permission to access the resource.
402 | Payment Required | Reserved for future use.
403 | Forbidden | The server refuses to fulfill the request made by the client. The server can indicate why the refusal has been generated with this message, or can disguise the reason by issuing a 404 message instead.
404 | Not Found | The requested URI was not found on the server; the server is not required to give an indication as to whether this condition is temporary or permanent. This message is typically used when the server does not wish to reveal or does not know why the resource is unavailable.
405 | Method Not Allowed | The request method is not permitted on this URI. A list of valid methods for the URI must be returned as part of the response.
406 | Not Acceptable | The resource identified in the request is capable of generating responses only to the accept headers that it returns to the client, and not to the accept header that the client originally sent.
407 | Proxy Authentication Required | Similar to a 401 response, but indicates that the client must authenticate with the proxy that forwarded the request.
408 | Request Timeout | The server did not receive a request from the client within the time that the server was prepared to wait. The client can reissue the request later.
409 | Conflict | The request made by the client could not be fulfilled because it conflicts with the current state of the resource. Information can be returned that allows the user to correct the condition causing the conflict and then resubmit the request. This message is seen most often in response to PUT requests that can cause version conflicts in a resource.
410 | Gone | The requested resource is no longer available and its new location is unknown.
411 | Length Required | The server will not accept the request unless the client reissues the request with the addition of a valid Content-Length header field.
412 | Precondition Failed | The client issued a request containing precondition header fields and one or more of the fields evaluated to false. This allows the client to perform conditional requests.
413 | Request Entity Too Large | The request is larger than the server is capable of processing and is being refused. The server is permitted to close the connection with the client so that the request cannot be resubmitted.
414 | Request URI Too Long | The request URI is longer than the server is capable of accepting. This typically occurs when the user has input too much data into a form that is sent using a GET request.
415 | Unsupported Media Type | The entity sent in the request is in a format that the requested resource does not support for the request method.
416 | Requested Range Not Satisfiable | The request included a Range request header field, but none of the ranges it specifies overlap the current extent of the requested resource.
417 | Expectation Failed | The request included an Expect request header that the server cannot fulfill.
500 | Internal Server Error | The server cannot fulfill the request because of an unexpected error.
501 | Not Implemented | The server does not recognize the request method and is incapable of fulfilling it.
502 | Bad Gateway | The server is acting as a gateway and received an invalid response from the requested server.
503 | Service Unavailable | The server is experiencing a temporary condition that causes it to be unable to fulfill the request, such as overloading or maintenance being performed.
504 | Gateway Timeout | The server is acting as a gateway or proxy and did not receive a response from the upstream server in time to process the request. Some proxies return this as a result of Domain Name System (DNS) timeout errors.
505 | HTTP Version Not Supported | This server does not support the HTTP version specified in the request message.

  More Info

HTTP/1.1 status codes are defined in RFC 2616, which can be found in the Rfc folder on the companion CD-ROM.

A typical server response line might appear as follows:

HTTP: Response (to client using port 4283)
  HTTP: Protocol Version = HTTP/1.1
  HTTP: Status Code = OK
  HTTP: Reason = OK
  HTTP: Content-Length = 176
  HTTP: Content-Type = text/html
  HTTP: Last-Modified = Tue, 27 Aug 2002 22:05:16 GMT
+ HTTP: Undocumented Header = Accept-Ranges: bytes
+ HTTP: Undocumented Header = ETag: "0b23ed3154ec21:68a"
  HTTP: Server = Microsoft-IIS/6.0
  HTTP: Date = Tue, 27 Aug 2002 22:06:49 GMT
  HTTP: Data: Number of data bytes remaining = 176 (0x00B0)

Note that this data does not appear to list the actual code number returned by the server to the client. However, analysis of the raw data sent shows that the following information was actually sent:

HTTP/1.1 200 OK
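Extracting the three parts of this raw status line takes a single split. The Python sketch below uses a hypothetical helper, parse_status_line, and assumes the line is well formed, with a space after the status code.

```python
def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split a status line into (version, status code, reason phrase)."""
    # The reason phrase can contain spaces, so split on the first two spaces only.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


version, code, reason = parse_status_line("HTTP/1.1 200 OK\r\n")
print(version, code, reason)
```

Limiting the split to two separators keeps multiword reason phrases such as "Not Found" intact.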

Status codes are extensible, meaning that server implementations can issue codes not listed in the preceding tables. Because the client might or might not understand the meaning of a particular extended status code, the code must be issued as an extension of an existing class. If the specific code number is unrecognized, the client responds as if it received a generic code from that class. For example, if the server issues error code 509 and the client does not recognize this code, it treats the error as a 500 error.
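The fallback rule can be expressed in a few lines. In this Python sketch, KNOWN_CODES is an arbitrary illustrative subset rather than the list any real client uses, and effective_code is a hypothetical helper.

```python
KNOWN_CODES = {200, 301, 302, 304, 404, 500, 503}   # illustrative subset only


def effective_code(code: int) -> int:
    """Map an unrecognized status code to the generic x00 code of its class."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    return code if code in KNOWN_CODES else (code // 100) * 100


print(effective_code(509))   # unrecognized, so treated as 500
print(effective_code(404))   # recognized, so left unchanged
```

Integer division by 100 isolates the class digit, so any unrecognized 5xx code collapses to 500 and any unrecognized 4xx code to 400.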

Header Fields

A client can generate request headers to query parameters regarding a resource or to negotiate content with the server. A server can issue response headers to provide information that cannot be included in the status line. Both request and response headers can indicate the presence of an entity to be transferred by including entity header fields and possible entity data. Although these field names cannot be universally extended without an accompanying change in the protocol version, extension headers can be used, provided both the client and the server understand them. Any header that is not recognized by the recipient is ignored. Internet Explorer sends nonstandard header fields to IIS 6, which is capable of utilizing them.

Headers can be either end-to-end headers or hop-by-hop headers. End-to-end headers are transmitted to the final message recipient, and must be stored as part of a cached entry. Proxies are forbidden to modify many end-to-end headers, cautioned against modifying others, and might or might not be permitted to add headers to a message. Hop-by-hop headers are useful only to the next recipient on a path, and proxies neither cache nor forward these headers.
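The forwarding rule for hop-by-hop headers can be sketched as a filter. The Python example below assumes the list of hop-by-hop headers given in RFC 2616, section 13.5.1, together with any headers named in the Connection header; end_to_end_headers is a hypothetical helper, not part of any proxy product.

```python
# Hop-by-hop headers listed in RFC 2616, section 13.5.1 (lowercased for comparison).
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
})


def end_to_end_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return only the headers a proxy would pass on to the next recipient."""
    # Find the Connection header regardless of how its name is capitalized.
    connection = next((v for k, v in headers.items() if k.lower() == "connection"), "")
    # Any header named in the Connection header is also hop-by-hop.
    named = {token.strip().lower() for token in connection.split(",") if token.strip()}
    drop = HOP_BY_HOP | named
    return {name: value for name, value in headers.items() if name.lower() not in drop}


incoming = {"Host": "kapoho10", "Connection": "Keep-Alive", "TE": "trailers", "Accept": "*/*"}
forwarded = end_to_end_headers(incoming)
print(forwarded)
```

Only Host and Accept survive in this example; Connection and TE apply to the single connection between the client and the proxy and are dropped.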

Headers defined in RFC 2616 are summarized in Tables 21-4, 21-5, 21-6, and 21-7. Sender and recipient can refer to either the client or the server in these tables, as both can send messages containing header fields during the transaction.

Table 21-4: Request Header Fields

Header Name

Type

Interpretation

Example

Accept

End-to-end

Used to specify which media types are considered accept-able for the response. These might be limited to specific types of media, or might list groups of acceptable media. If no Accept header is present it is assumed that all media types are acceptable. If the client issues an Accept header that the server is not capable of fulfilling, the server issues a406 (Not Acceptable) response.,

Accept: image/gif; image/x-xbitmap; image/jpeg; image/ pjpeg; */*. Gifs, bitmaps, and jpegs are acceptable media types; "*/*" indicates that all media are acceptable. "Image/*" would indicate that all image types are acceptable.

Accept- Charset

End-to-end

Indicates acceptable responsecharacter sets. If no Accept-Charset header is present, it isassumed that all character setsare acceptable. Character setscan be given an associatedquality value, representinguse preference for a givencharacter set.

Accept-Charset: iso-8859-5, Unicode-1-1;q=0.8. Indicatesthat this client accepts bothcharacter sets listed, with apreference as indicated by codes assigned to the q value.

Accept Encoding -

End-to-end

Used to designate acceptablecontent codings, such as com-press or gzip. Content coding isdiscussed later in this chapter.

Accept-Encoding: gzip, com- press. Indicates that both gzip and compress codingsare acceptable.

Accept- Language

End-to-end

Defines acceptable languages,such as English, German, andJapanese.

Accept-Language: en-us U.S.English will be accepted.

Authori- zation

End-to-end

Clients attempting to authen- ticate with a server will issue their credentials in the Authori- zation header. Results can be cached.

Authorization: Basic bWNO0Tg6bWN0ND-MyPQ==.

Expect

End-to-end

The client expects specificbehavior from the server; if theserver does not understand theExpect header, it must return a417 (Expectation Failed) error.

Expect: 100-continue. Indi-cates that the client expects the server to continue themessage exchange.

From

End-to-end

Used to identify the user thatinitiated this sequence of mes-sages. Typically, this is utilizedby robots that gather informa-tion. By providing the e-mailaddress of the robot's owner, the user can be contacted if therobot causes server problems.

From: owner@microsoft.com.

Host

End-to-end

Identifies the host name andport number of the owner ofthe requested resource. If noport number is specified, port80 is assumed. This header isused to allow the server to dis-tinguish between multiple sitesresponding to the same IP andTCP port.

Host: technet.microsoft.com.

If-Match

End-to-end

Used to make the request mes-sage conditional; generally, thisis used to verify that the client'sresource is current.

If-Match: *. Indicates that this allows a match with any cur-rent version of the resource, rather than a specific entitytag.

If-Modi-fied-Since

End-to-end

Used to make a method con-ditional on whether the speci-fied resource has been modifiedsince a particular date. Used bythe browser to determine if aspecific resource has been up-dated since it was last cached.

If-Modified-Since: Sat, 11Sept 1999 12:26:31 GMT.

If-None-Match

End-to-end

Used to facilitate efficient cach-ing by verifying that there is nomatch on the server for thespecified resource; also used byclients to ensure that a PUTmethod does not inadvertentlyreplace a resource.

If-None-Match: "a0cde3e0c444be1:18e2." Indicates this value could befollowed with a PUT method to place the resource on the server.

If-Range

End-to-end

A client can use this tag to deter-mine whether a resource it hasa partial copy of has changed.If the resource has not changed,the client might then be able torequest the remainder of theentity range to obtain thecomplete resource.

If-Range: "a0cde3e0c444be1:18e2."

If-Unmodified-Since

End-to-end

Used to make a method conditional; the converse of the If-Modified-Since header field.

If-Unmodified-Since: Sat, 11 Sep 1999 12:26:31 GMT.

Max-Forwards

End-to-end

Used in conjunction with the TRACE and OPTIONS methods to limit the number of proxies that can forward the request message. This is generally used to troubleshoot paths that are suspected to be looping back on themselves.

Max-Forwards: 3.

Proxy-Authorization

Hop-by-hop

Used by a client to identify itself to a proxy that requires authentication. The proxy indicates this by the return of a 407 (Proxy Authentication Required) message to the client.

Proxy-Authorization: Basic bQR0OTg6b-WN0NDMyNP==.

Range

End-to-end

Used to specify the portion of a resource that the client wants to retrieve. In some cases, this header can be used in conjunction with the If-Range header. This header allows a client to obtain part of a resource, rather than the entire resource.

Range: bytes=500-999. Requests bytes 500 through 999 of the resource.

Referer

End-to-end

A client uses the Referer [sic] header (misspelled throughout the HTTP RFCs as well as in actual implementation) to inform a server where the client received the reference that directed it to the server for the Request-URI.

Referer: http://partnering.microsoft.com/exchange/pf/root.asp.

TE

Hop-by-hop

Indicates the extension transfer codings that the client is willing to accept in a response message. If this is accompanied by a trailers keyword, the client will accept the resource in chunked transfer coding, which means that it will accept the response as a series of pieces of the requested entity. This header applies only to the current connection and must be reissued as necessary.

TE: trailers, deflate. Indicates the client's willingness to accept resources in chunks that are deflate encoded.

User-Agent

End-to-end

Used to pass information regarding the software that the client is using to send its requests. Servers can then tailor their responses to the limitations or capabilities of this software.

User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Microsoft Windows NT; DigExt). Indicates that the browser software being used is Internet Explorer 5, and lists its capabilities.

Vary

End-to-end

Used to specify header fields that dictate whether a future response to a request for this resource can be issued from cache, rather than being re-retrieved from a server.

Vary: *. Can be issued by a server to dictate that a proxy cannot issue a response from its cache, as the server wishes to negotiate all content itself.
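The conditional request headers in this table are easiest to follow from the server's side. The following Python sketch shows one way a server might evaluate If-None-Match and If-Modified-Since for a GET request. The function name and its parameters are hypothetical, entity tags are compared as plain strings, and weak validators are not handled, so treat it as an illustration rather than a complete implementation of RFC 2616.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def evaluate_conditional_get(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, otherwise 200.

    request_headers: dict of header name -> value (hypothetical structure).
    current_etag:    the resource's entity tag, including its quotes.
    last_modified:   timezone-aware datetime of the last modification.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # Simplification: split on commas and compare tags as plain strings.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if "*" in candidates or current_etag in candidates:
            return 304
        return 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            return 304  # not modified since the date the client supplied

    return 200
```

A 304 (Not Modified) result tells the client to use its cached copy; a 200 result means the server should send the full entity.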

Table 21-5: Response Header Fields

Header Name

Type

Interpretation

Example

Accept-Ranges

End-to-end

Allows the server to specify whether it will accept range requests from a client, and in what format.

Accept-Ranges: bytes. Indicates that the server will accept range requests that are specified in bytes.

Age

End-to-end

The sender's estimate of the amount of time elapsed since it cached the named resource received from an origin server. Essentially, how old a cached entity is.

Age: 86400. Indicates that the resource was cached 86,400 seconds previously.

ETag

End-to-end

The entity tag identifier; might be used to compare against other entities received from this source.

ETag: "077d777c8f1be1:189e."

Location

End-to-end

Used to redirect a client to another location for a requested resource, or, if it is part of a 201 (Created) response, to indicate where the new resource is located.

Location: /exchange/pf/root.asp.

Proxy-Authenticate

Hop-by-hop

Must be included as part of a 407 (Proxy Authentication Required) response, and includes the authentication scheme for the requested URI. This header is not passed farther down the path, as the authentication is to occur between the client and the proxy.

Proxy-Authenticate: Basic realm="Enterprise Server." RFC 2617, "HTTP Authentication: Basic and Digest Access Authentication," extensively describes authentication mechanisms.

Retry-After

End-to-end

Used by a server to inform a client as to how long a service is expected to be unavailable. Usually part of a 503 (Service Unavailable) response message or a 3xx (Redirection) response to direct the client to wait a specified amount of time before attempting to retrieve the requested entity.

Retry-After: Fri, 31 Dec 1999 23:59:59 GMT. Tells the client to wait until a specified date before attempting to retrieve the resource, while the following example specifies a wait time in seconds: Retry-After: 240.

Server

End-to-end

Used by the server to indicate what software it uses to service HTTP requests.

Server: Microsoft-IIS/5.0.

WWW-Authenticate

End-to-end

Always issued as part of a 401 (Unauthorized) message to indicate the authorization scheme that the server requires the client to pass.

WWW-Authenticate: Basic realm="partnering.microsoft.com."
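Because Retry-After can carry either an HTTP date or a number of seconds, a client has to handle both forms. The following Python sketch shows one possible approach; the function name and the explicit now parameter are assumptions made so that the example is self-contained.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now):
    """Convert a Retry-After value into a non-negative wait in seconds.

    value: the header value, either delta-seconds or an HTTP date.
    now:   the current time as a timezone-aware datetime.
    """
    value = value.strip()
    if value.isdigit():
        return int(value)  # delta-seconds form, for example "240"
    retry_at = parsedate_to_datetime(value)  # HTTP-date form
    return max(0, int((retry_at - now).total_seconds()))
```

For example, passing "240" returns 240, while passing the date form returns the number of seconds remaining until that date (or 0 if the date has already passed).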

Table 21-6: Entity Header Fields

Header Name

Type

Interpretation

Example

Allow

End-to-end

List of methods allowed for the resource identified by the Request-URI. This does not prevent a client from requesting a disallowed method. If a client requests a method that is not allowed for a resource, the server issues a 405 (Method Not Allowed) error and must include an Allow header in the response so that the client can adjust its request.

Allow: GET, HEAD, PUT, POST. Specifies that the GET, HEAD, PUT, and POST methods are acceptable for the identified resource.

Content-Encoding

End-to-end

Specifies encoding methods applied to the resource specified in the Request-URI so that the message recipient knows which decoding mechanisms to apply.

Content-Encoding: gzip. Specifies the type of encoding used to compress the requested resource.

Content-Language

End-to-end

Defines the natural language of the entity being transferred.

Content-Language: en-us. Language content is U.S. English.

Content-Length

End-to-end

Specifies the length of the entity body in a decimal number of octets.

Content-Length: 4110. Indicates a length of 4110 octets.

Content-Location

End-to-end

A server can use this to specify the resource location that the client requested, particularly if the resource is accessible from a different location than was originally requested in the client message. This value is also used to set the base URI.

Content-Location: http://192.168.0.1/Default.htm. Specifies Default.htm at the listed IP address as the base URI for the requested resource.

Content-Range

End-to-end

When a server returns a partial entity body in its response, it uses Content-Range to specify the range of the full entity that is covered by this partial entity. Essentially, this is a marker to identify the portion of the resource being sent. Servers returning status codes 206 (Partial Content) or 416 (Requested Range Not Satisfiable) utilize the Content-Range header field.

Content-Range: bytes 500-999/1500. Informs the client that the range encompasses bytes 500–999 of a 1500-byte resource.

Content-Type

End-to-end

Identifies the media type of the entity being sent.

Content-Type: text/html. Indicates the message being sent consists of HTML text.

Expires

End-to-end

The resource specified is considered stale at its expiration date and should not be returned from the cache at that point; rather, the resource should again be retrieved from the server.

Expires: Tue, 16 Nov 1999 17:38:01 GMT.

Last-Modified

End-to-end

Indicates the date that the server believes to be the last modification date for the resource in question.

Last-Modified: Sun, 29 Aug 1999 02:44:51 GMT.
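To show how the Content-Range entity header relates to a client's Range request, the following Python sketch computes the status code and Content-Range value for a single byte range. The function name is hypothetical, and the sketch makes simplifying assumptions: it handles only one range per request and falls back to a full 200 response for anything it cannot parse.

```python
def content_range_for(range_header, resource_length):
    """Return (status_code, content_range_value) for a single byte range.

    A (200, None) result means the Range header is ignored and the whole
    entity is sent.
    """
    unit, _, spec = range_header.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        return 200, None  # unknown unit or multiple ranges: not handled here

    first_text, _, last_text = spec.strip().partition("-")
    try:
        if first_text == "":
            # Suffix form "bytes=-N": the final N bytes of the resource.
            suffix_length = int(last_text)
            first = max(0, resource_length - suffix_length)
            last = resource_length - 1
        else:
            first = int(first_text)
            if last_text:
                last = int(last_text)
                if last < first:
                    return 200, None  # syntactically invalid range
            else:
                last = resource_length - 1  # open-ended "bytes=N-"
    except ValueError:
        return 200, None

    last = min(last, resource_length - 1)
    if first > last:
        return 416, f"bytes */{resource_length}"
    return 206, f"bytes {first}-{last}/{resource_length}"
```

With a 1500-byte resource, a request for bytes=500-999 yields a 206 status and the Content-Range value shown in the table, bytes 500-999/1500.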

Table 21-7: General Header Fields

Header Name

Type

Interpretation

Example

Cache-Control

End-to-end

Issues directives regarding cacheability of this information that must be obeyed by all points along the request chain. HTTP/1 caches that do not implement Cache-Control can simply assume the entity is not cacheable.

Cache-Control: private. Indicates that the server is issuing a message that cannot be cached by any intermediary points, as the content is privately cacheable only by the client that requested it.

Connection

Hop-by-hop

Used to specify options for a single connection along the path; must not be propagated farther along the path. Headers listed in the Connection header must not be end-to-end headers.

Connection: Keep-Alive. Indicates that the connection between these two points is to be kept open after this message is forwarded. An HTTP/1 recipient might not be capable of interpreting Keep-Alive, and therefore might remove and ignore the header fields pertaining to it.

Date

End-to-end

Indicates the date and time that this message originated.

Date: Fri, 03 Sep 1999 00:58:28 GMT.

Pragma

End-to-end

Used to provide instructions to each recipient of the message along the path. These directives are implementation-specific, and therefore might be ignored by proxies that do not understand their meaning. However, the proxy is required to forward the header whether or not it understands its directives.

Pragma: no-cache. Specifies that a proxy is expected to pass the request message on to a server even though the proxy might already have the requested item in its own cache.

Trailer

Hop-by-hop

Used to inform a client that there are header fields in this message pertaining to chunked transfer encoding, so that the client can know that it needs to use these for decoding and reassembly.

Trailer: Range. Indicates the presence of a Range header in this message.

Transfer-Encoding

Hop-by-hop

Used to indicate what type of encoding has been applied to the body of the message, so that the recipient can determine how to decode it.

Transfer-Encoding: Chunked, deflate. Indicates the encoding applied to this message, in the order in which it was applied.

Upgrade

Hop-by-hop

Frequently issued by a server as part of a 101 (Switching Protocols) message to indicate additional communication protocols that it supports. The client can also use this header for negotiating the protocols to be used. Because this header applies only to the current connection, it must also be listed in the Connection header.

Upgrade: HTTP/1.2, SHTTP/1.3, IRC/7.0.

Via

End-to-end

Must be used by proxies and gateways to indicate the protocols and intermediate recipients between the requesting client and the issuing server. This is used for purposes of tracking the path of a request/response transaction.

Via: 1.0 microsoft.com, 1.1 technet.microsoft.com (Microsoft-IIS/5.0). Specifies the order of the hops made and the HTTP server program running at those hops, in this case, IIS 5.

Warning

End-to-end

Used to issue warnings regarding message content. These warnings are issued in human-readable language.

Warning: 110 Response is stale. Indicates that the message has exceeded its freshness lifetime, or the indicator of how long the entity can be considered accurate.
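The distinction between hop-by-hop and end-to-end headers matters most to proxies, which must not forward hop-by-hop headers or any header named in the Connection header. The following Python sketch illustrates that filtering step. The function name is hypothetical, the hop-by-hop set is taken from the Type column of the tables in this chapter, and X-Trace in the usage note is an invented header used only for illustration.

```python
# Headers marked Hop-by-hop in the tables above (lowercased for comparison).
HOP_BY_HOP = {
    "connection",
    "proxy-authenticate",
    "proxy-authorization",
    "te",
    "trailer",
    "transfer-encoding",
    "upgrade",
}


def headers_to_forward(headers):
    """Return only the headers a proxy may pass along to the next hop."""
    connection_value = ""
    for name, value in headers.items():
        if name.lower() == "connection":
            connection_value = value

    # Any header named in Connection applies to this connection only.
    named = {
        token.strip().lower()
        for token in connection_value.split(",")
        if token.strip()
    }
    drop = HOP_BY_HOP | named
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in drop
    }
```

Given a request containing Host, User-Agent, TE, a Connection header naming X-Trace, and the X-Trace header itself, only Host and User-Agent would be forwarded.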

HTTP Codings

HTTP uses content codings to specify data transformation mechanisms, such as compression, that have been applied to an entity. By specifying the coding method in header fields, client and server applications can determine how to decode the entity to make it legible. Content codings are registered with the Internet Assigned Numbers Authority (IANA), and include the following:

Transfer Codings

Transfer codings are used to ensure safe passage of an entity through the request/response path. Transfer coding is not an entity property, as is content coding. Rather, transfer coding is applied to the entire entity message body. Transfer coding values indicate whether encoding has been or can be applied to the message. If transfer coding is applied to a message, the values must indicate whether the data has been chunked, or broken into more manageable pieces. This coding must not be applied more than once to the message body so that a client can accurately determine message transfer length. Any server that receives transfer coding values it does not understand should return a 501 (Not Implemented) response so that the client can request or apply a different encoding mechanism. The encoding formats used for transfer encoding are the same as those listed for content coding, with the addition of the chunked value.

Chunked Transfer Coding

Chunked transfer coding is used to modify the body of a message so that it can be sent as a series of smaller pieces. Each chunk is sent with its own size information and, possibly, entity headers, so that the recipient can accurately determine whether it has received all the chunks that comprise the message. HTTP/1.1-compliant applications are required to be capable of receipt and decoding of chunked transfer coding, and must ignore any extensions to transfer coding values that they do not understand.
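As a rough illustration of how a recipient reassembles a chunked body, the following Python sketch reads each chunk's hexadecimal size line, copies that many bytes, and stops at the zero-length chunk that ends the message. The function name is hypothetical, and the sketch assumes the complete body is already in memory; it skips chunk extensions and does not process any trailer fields.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble an entity body sent with chunked transfer coding."""
    decoded = bytearray()
    position = 0
    while True:
        # Each chunk starts with its size in hexadecimal, ended by CRLF.
        line_end = body.index(b"\r\n", position)
        size_field = body[position:line_end].split(b";", 1)[0]  # drop extensions
        size = int(size_field.strip(), 16)
        position = line_end + 2
        if size == 0:
            break  # last chunk; any trailer that follows is ignored here
        decoded += body[position:position + size]
        position += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(decoded)
```

For example, the body 4 CRLF Wiki CRLF 5 CRLF pedia CRLF 0 CRLF CRLF decodes to the nine bytes "Wikipedia".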

HTTP Content Negotiation

Content negotiation is the process by which the HTTP client and server determine the preferred representation for a response. Client software might be able to interpret only specific entity media types, as might also be the case at the server. Additionally, user preferences for parameters, such as language and file format, affect the negotiation process. Content negotiation can be said to be agent-driven, wherein the client chooses the best media representation after receiving a response from a server; server-driven, meaning that the server specifies preferred representations; or transparent, which is a combination of both agent-driven and server-driven negotiation.

Agent-Driven Content Negotiation

In agent-driven negotiation, the client software, called a user agent, issues a request message to a server that indicates its own capabilities by means of header fields, such as Accept, Accept-Charset, and Accept-Language. When the server receives this request, it responds with a message indicating its own capabilities, and the user agent then responds with its own choice of representation from the list provided by the server.

Agent-driven negotiation is advantageous when the user wants to dictate certain content parameters, such as the language used to display a Web page, or when the server cannot ascertain the client's capabilities by analyzing its request message. Additionally, load balancing for heavily trafficked servers can be provided using public caches (additional servers or proxies that maintain cached copies of the information on the origin server), and agent-driven negotiation can be used to determine message parameters with the computers providing these caches. However, agent-driven negotiation has the disadvantage of requiring additional message transfer between client and server, because the client must first request a listing of the server's capabilities before choosing a message format.

Most Web client interactions, as illustrated in Capture 21-01, do not use agent-driven content negotiation.

Server-Driven Content Negotiation

In server-driven content negotiation, the server uses an algorithm to select what it considers to be the best format for messages between itself and the client. The server can base its determination on parameters it receives in the client's request message, as well as on its own capabilities, and even parameters such as the requesting client's network address. Server-driven negotiation allows the server to send the requested entity in its initial response package, as determined by its best guess as to which media type the client prefers (based on header fields such as Accept, Accept-Charset, and Accept-Language), rather than waiting for another request after the client has received a response and chosen from the list of available formats.

Server-driven negotiation reduces the number of messages that need to be transferred to determine acceptable media types, but also has several disadvantages. First, this type of negotiation requires the server to guess what the preferred format on the client's end is, which might or might not match what the user prefers. Second, because this negotiation type requires the user agent to describe its capabilities in each request it makes to the server, it might be inefficient and might also violate the user's privacy. Third, the server must perform additional processing for each response to determine the optimal format for that response. Last, servers that are providing public caches to facilitate load balancing for another server might not be able to service requests for different users from their caches, because each user agent might provide different parameters.
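The selection algorithm that a server uses is implementation-specific, but a simple version based on the Accept header can be sketched as follows. This Python example is an assumption about how such an algorithm might look, not a description of how IIS performs negotiation. The function names are hypothetical, only the q parameter is interpreted, and ties are resolved in favor of whichever type the server lists first.

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs from an Accept header value."""
    preferences = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        if not parts[0]:
            continue
        quality = 1.0  # q defaults to 1 when it is not given
        for parameter in parts[1:]:
            name, _, value = parameter.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        preferences.append((parts[0], quality))
    return preferences


def choose_media_type(accept_header, available):
    """Pick the available media type the client rates highest, or None."""
    preferences = parse_accept(accept_header)
    best_type, best_score = None, 0.0
    for media_type in available:
        main_type = media_type.split("/")[0]
        score, best_specificity = 0.0, -1
        for media_range, quality in preferences:
            if media_range == media_type:
                specificity = 2
            elif media_range == main_type + "/*":
                specificity = 1
            elif media_range == "*/*":
                specificity = 0
            else:
                continue
            # The most specific matching range decides the quality value.
            if specificity > best_specificity:
                best_specificity, score = specificity, quality
        if score > best_score:
            best_type, best_score = media_type, score
    return best_type
```

With the header text/html;q=0.8, text/plain, */*;q=0.1 and a server offering text/html and text/plain, the sketch selects text/plain, because the client rates it at the default quality of 1.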

Transparent Content Negotiation

Transparent content negotiation provides a combination of both agent-driven and server-driven negotiation mechanisms. When a cache receives a client request containing parameters that it can fulfill, it can negotiate the content format itself, rather than forwarding the request parameters to the origin server. In this case, the cache is acting as the server would in server-driven negotiation, thus saving work at the server that originally provides a resource. HTTP/1.1 does not provide any guidelines for transparent negotiation, although many implementations provide their own mechanisms as extensions of HTTP/1.1.

HTTP Caching

To make HTTP as efficient as possible, clients, servers, proxies, and gateways can cache content retrieved as part of the request/response process. The server specifies caching in HTTP/1.1. The originating server for any resource decides whether or not other machines along the path should cache a message. Servers might specify that a message cannot be cached by any computer, must be cached by these computers, or can be cached based on variables, such as the age of the message. Originating servers do not, however, always dictate whether or for how long a message can be cached, so caches also use a mechanism called heuristic expiration. In heuristic expiration, a cache uses information contained in message headers to estimate the point at which the message can be considered stale, and thus would need to be re-retrieved, should it be requested. Microsoft ISA Server, for example, enables the administrator to set time to live (TTL) values on cache content based on headers or static settings.

Because heuristic expiration is inherently unreliable as a gauge of message freshness, HTTP/1.1 strongly encourages servers to provide explicit expiration on any responses they send. RFC 2616 outlines implementation details for caching and content expiration.
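The difference between explicit and heuristic expiration can be sketched in a few lines of Python. In this example, an Expires header gives the freshness lifetime directly; without one, the cache estimates the lifetime as a fraction of the time since the resource was last modified. The function name is hypothetical, the 10 percent fraction is an assumed setting (RFC 2616 mentions it only as a typical value), and Cache-Control directives are deliberately left out to keep the sketch short.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers, heuristic_fraction=0.1):
    """Estimate how long a cached response can be served without revalidation.

    headers: dict containing at least "Date", and optionally "Expires"
             and "Last-Modified", all in HTTP-date format.
    """
    date = parsedate_to_datetime(headers["Date"])

    if "Expires" in headers:
        # Explicit expiration supplied by the origin server.
        expires = parsedate_to_datetime(headers["Expires"])
        return max(timedelta(0), expires - date)

    if "Last-Modified" in headers:
        # Heuristic expiration: a fraction of the resource's apparent age.
        last_modified = parsedate_to_datetime(headers["Last-Modified"])
        return max(timedelta(0), (date - last_modified) * heuristic_fraction)

    return timedelta(0)  # nothing to base an estimate on: treat as stale
```

A response dated ten days after its Last-Modified time, with no Expires header, would be considered fresh for about one day under this heuristic.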

FTP

FTP, defined in RFC 959, is one of the older application protocols. Although HTTP is perhaps better known, FTP has been a mainstay of file transfer on the Internet for a considerable time. Microsoft IIS 6 implements a full-featured FTP server service, and Windows Server 2003 includes a simple command-line FTP client and FTP functionality in Internet Explorer. Although additions to FTP have been proposed in several RFCs and drafts, its core functionality remains largely unchanged. This chapter does not provide instruction in how to use the FTP client, but instead defines the protocol itself.

  More Info

User documentation and instructions for using FTP are found in RFCs 412, 959, and 1635, which can be found in the Rfc folder on the companion CD-ROM.

FTP is used to share and transfer files between computers, as well as use other computers for remote storage purposes. As is the case with HTTP, FTP is an Application Layer protocol that relies on TCP to ensure guaranteed delivery of data. Although the RFC definition of FTP does not provide any true method for recovering a lost connection and picking up file transfer where it left off, IIS 6 implements a process called FTP restart to add this functionality.

FTP is an inherently nonsecure protocol because it transmits user passwords as clear text. To help provide secure mechanisms to be used in FTP transmissions, security extensions to the FTP protocol are defined in RFC 2228, although IIS 6 does not implement these.

As with HTTP, FTP has its own unique terminology, some of which is defined in the following list:

These components are shown in Figure 21-1.

Figure 21-1: FTP components.

FTP Operation

FTP is a client/server protocol that uses two connections between client and server, unlike SMTP, HTTP, and NNTP, which use a single connection. With FTP, a control connection is established to communicate between client and server, and data connections are established to transfer files between an FTP client and server.

FTP Connections

FTP connections are established between an FTP client and an FTP server. An FTP session between client and server can be initiated either by a user, through an FTP client interface, or programmatically on Windows Server 2003 using the Win32 application programming interface (API). In any case, the user PI initiates the actual connection. The user PI is responsible for opening a TCP connection to the FTP server and for sending a command to the server PI requesting that an FTP connection be opened between them. The server PI listens for connection requests on TCP port 21 by default, and after receiving a connection request from a user PI, begins the process of establishing a control connection.

Every FTP session actually consists of two separate connections—a control connection and a data connection. The control connection follows Telnet specifications, and it is used to negotiate communication parameters, issue commands and responses, and monitor the status of any data connection that is opened between the two computers. The task of opening and monitoring the data connection is handled by the DTP components on both the FTP client and server. The data connection is the actual mechanism over which data transfer occurs. Whereas a data connection can be dynamically opened and closed in a single session between the two computers, the control connection always remains open for the duration of a single session.

A user can initiate an FTP session between a client and server by using FTP client software. Windows Server 2003 includes a simple command-line FTP client, and Internet Explorer provides GUI-based FTP features. The client FTP program includes both the user PI and user DTP components shown in Figure 21-1. This enables the client to initiate a control connection between the client and the FTP server (comprised of the server PI and server DTP).

In Windows Server 2003, you can use a Web browser, such as Internet Explorer, to make and utilize an FTP connection to an FTP server. You can also use the built-in FTP tool simply by typing FTP <hostname or IP address of an FTP server> from the command prompt.

After the control connection has been established between FTP client and server, the user can issue commands to the server that open a data connection between the two computers. Data is then passed bidirectionally (full-duplexed) over this connection. When the data transfer is complete, the data connection can be closed, although the control connection remains open until the user initiates its disconnection and the server performs the actual process of closing the connection. Figure 21-2 diagrams this process.

Figure 21-2: A client/server FTP session.

The following trace shows a simple FTP session between an FTP client and server, in which the client connects, downloads a single file, and then disconnects:

C:>ftp kapoho10.kapoho.com
Connected to kapoho10.kapoho.com.
220-Microsoft FTP Service
220 KAPOHO10 FTP
User (kapoho10.kapoho.com:(none)): ftp
331 Anonymous access allowed, send identity (e-mail name) as password.
Password:
230-Welcome to FTP service at Kapoho10
230 Anonymous user logged in.
ftp> get rebecca.txt
200 PORT command successful.
150 Opening ASCII mode data connection for rebecca.txt(16 bytes).
226 Transfer complete.
ftp: 16 bytes received in 0.00Seconds 16000.00Kbytes/sec.
ftp> quit
221
C:>

The trace for this short FTP session is in Capture 21-02 on the companion CD-ROM. In theory, an FTP client can request an FTP server to create a data connection to another FTP server. This process is allowed in the FTP RFC, but it is rarely implemented. Most popular FTP clients create data connections only between the FTP client and server systems.
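The server replies in the session above follow the reply format used on the control connection: a three-digit code followed by either a space (a complete or final line) or a hyphen (the first line of a multiline reply, such as 220-Microsoft FTP Service). The following Python sketch groups such lines into complete replies. The function name is hypothetical, and the sketch assumes the lines have already been read from the control connection with their line terminators removed.

```python
def parse_ftp_replies(lines):
    """Group control-connection lines into (code, text) replies."""
    replies = []
    pending_code, pending_text = None, []
    for line in lines:
        if pending_code is None:
            code, separator, text = line[:3], line[3:4], line[4:]
            if separator == "-":
                # First line of a multiline reply; wait for "<code> ".
                pending_code, pending_text = code, [text]
            else:
                replies.append((int(code), text))
        elif line.startswith(pending_code + " "):
            # Final line of the multiline reply.
            pending_text.append(line[4:])
            replies.append((int(pending_code), "\n".join(pending_text)))
            pending_code, pending_text = None, []
        elif line.startswith(pending_code + "-"):
            pending_text.append(line[4:])
        else:
            pending_text.append(line)  # continuation line without a code
    return replies
```

Applied to the server lines in the session above, the sketch produces four replies with the codes 220, 331, 230, and 221.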

FTP Data

FTP data transfer occurs over the data connection between the FTP server and FTP client. The control connection is reserved for passing and receiving FTP commands that control the session, as well as exchanging data transfer parameters. The sender and receiver in any FTP session negotiate the data transfer format to ensure that the receiving end can correctly reconstruct the data that is sent.

Because each computer can store its data in its own specific logical byte sizes (the number of bits that comprise a data byte on the disk), mechanisms must be in place to ensure that the data sent to the receiving computer is transmitted in an agreed-on format. The FTP specification provides for specific data structures and data type representation, although most FTP servers and clients only use ASCII or binary data transfer.

Data Structures

FTP classifies three different data structures, or characteristics, of a file stored on the computer. Some systems store data as a series of fixed-length records, whereas other computers store data as a series of characters and separators. Because of the disparity in storage formats, transfers of data must be made in a format that the sending and receiving computers can both reconstruct and write appropriately to their own disks. In FTP, these structures are defined as file structure, record structure, and page structure, as described here:

When page structure data is sent between two hosts using FTP, each page must be sent with a page header comprised of 1-byte fields that provide informational parameters. Each header begins with a Header Length field that defines the number of bytes comprising the header. Following this is a page index, or number, that identifies this page's place in the overall file, and a Data Length field that specifies the length of the page data itself. The next field in the header identifies the page type, which can be a normal data page; a descriptor page that defines properties for the file; an access-controlled page, which provides information about access control to the file; or the last page in the file. There might also be optional header fields that define properties, such as access control to the individual page.

Data Types

Within each of the data structures defined in FTP, data can be stored as different data types. These data types define the on-disk byte size of the file data. Some data types provide explicit definition of data byte size, whereas others provide implicit definition by using logical byte size. Acceptable data types in IIS FTP transfers are defined in the following list:

Connections and Transmission Modes

To transmit data between a client and server using FTP, both a control connection and a data connection must be established. The control connection is used both to set parameters for the data connection and to monitor the passage of data over the data connection. Although the control connection remains open during the entire client/server FTP session, multiple data connections can be dynamically opened and closed.

Data Connection Establishment and Management

FTP connections between client and server are initiated by the user PI. The FTP server listens on TCP port 21 (by default) for connection requests. When the server PI hears a connection request from a user PI, it opens a control connection. The use of nondefault ports for connections can only be initiated by the user PI, and not the server PI. When the client issues a command that initiates data transfer, the server DTP opens a data connection and data transfer begins. Both user DTP and server DTP monitor the connection to determine which computer might be sending or receiving at any given time. The initial connection, logon, and data transfer process, based on the earlier FTP session and shown in Capture 21-02, is as follows:

1 1.791683 FTP Client DNS Server DNS 0xA2A:Std Qry for kapoho10.kapoho.com. of type Ho 10.10.1.68 10.10.1.200
2 1.791683 DNS Server FTP Client DNS 0xA2A:Std Qry Resp. for kapoho10.kapoho.com. of t 10.10.1.200 10.10.1.68
3 1.801697 FTP Client FTP Server TCP ....S., len: 0, seq: 476820006- 476820006, ack: 10.10.1.68 10.10.1.74
4 1.811712 FTP Server FTP Client TCP .A..S., len: 0, seq:2161805281- 2161805281, ack 10.10.1.74 10.10.1.68
5 1.811712 FTP Client FTP Server TCP .A...., len: 0, seq: 476820007- 476820007, ack: 10.10.1.68 10.10.1.74
6 1.811712 FTP Server FTP Client FTP Resp. to Port 2413, '220-Microsoft FTP Service' 10.10.1.74 10.10.1.68
7 2.012000 FTP Client FTP Server TCP .A...., len: 0, seq: 476820007- 476820007, ack: 10.10.1.68 10.10.1.74
8 2.032029 FTP Server FTP Client FTP Resp. to Port 2413, '220 KAPOHO10 FTP' 10.10.1.74 10.10.1.68
9 2.212288 FTP Client FTP Server TCP .A...., len: 0, seq: 476820007- 476820007, ack: 10.10.1.68 10.10.1.74
10 3.053497 FTP Client FTP Server FTP Req. from Port 2413, 'USER ftp' 10.10.1.68 10.10.1.74
11 3.063512 FTP Server FTP Client FTP Resp. to Port 2413, '331 Anonymous access allowed 10.10.1.74 10.10.1.68
12 3.213728 FTP Client FTP Server TCP .A...., len: 0, seq: 476820017- 476820017, ack: 10.10.1.68 10.10.1.74
13 9.953419 FTP Client FTP Server FTP Req. from Port 2413, 'PASS tfl@kapoho.com' 10.10.1.68 10.10.1.74
14 9.963433 FTP Server FTP Client FTP Resp. to Port 2413, '230-Welcome to FTP service a 10.10.1.74 10.10.1.68
15 10.123664 FTP Client FTP Server TCP .A...., len: 0, seq: 476820038- 476820038, ack: 10.10.1.68 10.10.1.74
16 10.123664 FTP Server FTP Client FTP Resp. to Port 2413, '230 Anonymous user logged in 10.10.1.74 10.10.1.68
17 10.323952 FTP Client FTP Server TCP .A...., len: 0, seq: 476820038- 476820038, ack: 10.10.1.68 10.10.1.74
18 15.180936 FTP Client FTP Server FTP Req. from Port 2413, 'PORT 10,10,1,68,9,111' 10.10.1.68 10.10.1.74
19 15.190950 FTP Server FTP Client FTP Resp. to Port 2413, '200 PORT command successful. 10.10.1.74 10.10.1.68
20 15.190950 FTP Client FTP Server FTP Req. from Port 2413, 'RETR rebecca.txt' 10.10.1.68 10.10.1.74
21 15.190950 FTP Server FTP Client FTP Resp. to Port 2413, '150 Opening ASCII mode data 10.10.1.74 10.10.1.68
22 15.200965 FTP Server FTP Client TCP ....S., len: 0, seq:1569488684- 1569488684, ack 10.10.1.74 10.10.1.68
23 15.200965 FTP Client FTP Server TCP .A..S., len: 0, seq:3313623390- 3313623390, ack 10.10.1.68 10.10.1.74
24 15.200965 FTP Server FTP Client TCP .A...., len: 0, seq:1569488685- 1569488685, ack 10.10.1.74 10.10.1.68
25 15.200965 FTP Server FTP Client FTP Data Transfer To Client, Port = 2415, size 16 10.10.1.74 10.10.1.68
26 15.200965 FTP Server FTP Client TCP .A...F, len: 0, seq:1569488701- 1569488701, ack 10.10.1.74 10.10.1.68
27 15.200965 FTP Client FTP Server TCP .A...., len: 0, seq:3313623391- 3313623391, ack 10.10.1.68 10.10.1.74
28 15.220993 FTP Client FTP Server TCP .A...F, len: 0, seq:3313623391- 3313623391, ack 10.10.1.68 10.10.1.74
29 15.231008 FTP Server FTP Client TCP .A...., len: 0, seq:1569488702- 1569488702, ack 10.10.1.74 10.10.1.68
30 15.331152 FTP Client FTP Server TCP .A...., len: 0, seq: 476820079- 476820079, ack: 10.10.1.68 10.10.1.74
31 15.331152 FTP Server FTP Client FTP Resp. to Port 2413, '226 Transfer complete.' 10.10.1.74 10.10.1.68
32 15.531440 FTP Client FTP Server TCP .A...., len: 0, seq: 476820079- 476820079, ack: 10.10.1.68 10.10.1.74
33 17.654493 FTP Client FTP Server FTP Req. from Port 2413, 'QUIT' 10.10.1.68 10.10.1.74
34 17.654493 FTP Server FTP Client FTP Resp. to Port 2413, '221 ' 10.10.1.74 10.10.1.68
35 17.664507 FTP Client FTP Server TCP .A...F, len: 0, seq: 476820085- 476820085, ack: 10.10.1.68 10.10.1.74
36 17.674521 FTP Server FTP Client TCP .A...., len: 0, seq:2161805598- 2161805598, ack 10.10.1.74 10.10.1.68
37 17.674521 FTP Server FTP Client TCP .A...F, len: 0, seq:2161805598- 2161805598, ack 10.10.1.74 10.10.1.68
38 17.674521 FTP Client FTP Server TCP .A...., len: 0, seq: 476820086- 476820086, ack: 10.10.1.68 10.10.1.74
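Frame 18 of the trace carries the command PORT 10,10,1,68,9,111, which tells the server where to open the data connection. As defined in RFC 959, the first four numbers are the octets of the client's IP address and the last two are the high and low bytes of the TCP port. The following Python sketch decodes such an argument; the function name is hypothetical.

```python
def parse_port_argument(argument):
    """Decode the argument of an FTP PORT command into (host, port)."""
    fields = [int(field) for field in argument.split(",")]
    if len(fields) != 6 or any(not 0 <= field <= 255 for field in fields):
        raise ValueError("PORT argument must be six numbers from 0 to 255")
    host = ".".join(str(field) for field in fields[:4])
    port = fields[4] * 256 + fields[5]  # high byte, then low byte
    return host, port
```

For the command in frame 18, this yields the address 10.10.1.68 and port 9 × 256 + 111 = 2415, which matches the port reported for the data transfer in frame 25.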

When an active transfer is occurring, the DTP on the receiving computer is passive and the DTP on the sending computer is actively controlling data transfer. Because the data connection is automatically closed after completion of a data transfer, a data connection can be kept open either by negotiating a nondefault port before beginning transfer, or by switching to a different transfer mode for the file(s) in question.

FTP defines the three transfer modes covered in the following list.

FTP Restart

FTP restart provides a mechanism for resuming file transfers that were interrupted before completion. FTP restart is a feature of the FTP service in Windows Server 2003. FTP itself provides little in terms of recovery mechanisms, but it does provide for the insertion of restart markers in both block and compressed data modes. When an FTP server implements FTP restart, it periodically sends a restart marker (essentially a place marker) in the data being transferred to the receiving computer.

The receiver collects these restart markers, and if the data connection is lost, it can use them to resume the file transfer where it was interrupted. After resuming the connection between itself and the server, the client first queries the server to determine if the file has changed since the interruption of the transfer. If the file has not changed, the client issues the last restart marker value to the server and requests that the transfer be resumed at that point. If the file has changed since the data transfer interruption, the client requests that the entire file be transmitted from the beginning.

Although the server inserts restart markers only in transmission of block or compressed data, stream data can also be recovered if the connection is lost. Because stream data is sent simply as a series of bytes, the client need only calculate the byte offset of the last data it received and then request the server to resume the file transfer at that offset point.
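The stream-mode recovery described above can be sketched in a few lines of Python. This is a minimal illustration, not the behavior of any particular FTP client: the function names are hypothetical, the byte offset is assumed to equal the size of the partially downloaded local file, and the check for whether the remote file has changed is omitted.

```python
import os


def resume_offset(local_path):
    """Return the number of bytes already saved locally (0 if nothing was saved)."""
    try:
        return os.path.getsize(local_path)
    except OSError:
        return 0


def restart_commands(remote_name, local_path):
    """Build the control-connection commands that resume a stream-mode download."""
    offset = resume_offset(local_path)
    commands = []
    if offset > 0:
        # REST carries the byte offset at which the server should resume.
        commands.append(f"REST {offset}")
    commands.append(f"RETR {remote_name}")
    return commands
```

With Python's standard ftplib module, the same offset can be passed as the rest argument of FTP.retrbinary, which sends the REST command before RETR.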

FTP restart commands are not used only when a connection has been broken. Some FTP clients send the command REST 0 before downloading a new file from a server. This command is used to ensure that the data transfer begins with the first byte of the requested file.

FTP Commands and Responses

FTP communication between an FTP client and server consists of a series of commands that the client issues and the responses that the server returns. This exchange is similar to the request/response model of HTTP.

FTP Commands

The user PI issues commands over the control connection to initiate file transfer. FTP commands fall into several groups: commands that verify the user's identity with the FTP server, such as USER, ACCT, and PASS; commands that navigate the file system on the remote host, such as CDUP, XCUP, and CWD; session origination and termination commands, such as QUIT and REIN; and commands that control the parameters of file transfer as well as the transfer itself, such as PORT, TYPE, MODE, RETR, and STOR. Commands such as GET, PUT, and BYE are user-interface commands offered by many FTP clients; the client translates them into the protocol commands RETR, STOR, and QUIT.

FTP Replies

Similar to other Internet type services, FTP servers issue reply codes in response to client commands. These extensible reply codes are sent in the form of a three-digit number, with the value of the first and second digits indicating the type of response.

Tables 21-8 and 21-9 describe the representations of the first and second digits in response codes. As an example of response-code meanings, a server issuing code 250 would indicate that the requested file system action was successfully completed; the 2 indicates success, and the 5 indicates a file system operation.

Table 21-8: FTP Response Codes—First-Digit Values

| First-Digit Value | Indicator | Description |
| --- | --- | --- |
| 1yz | Positive Preliminary Reply | The requested action is being processed; another command cannot be sent until another reply is received from the server. |
| 2yz | Positive Completion Reply | The requested action has been completed and a new command can now be sent. |
| 3yz | Positive Intermediate Reply | The requested action is accepted, and continues processing pending further information from the client, which should now be issued. |
| 4yz | Transient Negative Completion Reply | A temporary error has occurred that prevented processing of the command, and the user should now reissue the command (or command sequence). |
| 5yz | Permanent Negative Completion Reply | An error has occurred that prevented processing of the command; the command should not be reissued without modification. This modification might be as simple as correcting a misspelling, or this error might indicate a nontransient server error. |

Table 21-9: FTP Response Codes—Second-Digit Values

| Second-Digit Value | Indicator | Description |
| --- | --- | --- |
| x0z | Syntax | There is a syntax error, the issued command is unimplemented by the server, or the server does not recognize the command category. |
| x1z | Information | Replies to information requests, such as help. |
| x2z | Connections | Refers to control and data connections. |
| x3z | Authentication and Accounting | Replies to user logon or accounting procedures. |
| x4z | Unspecified | Unspecified. |
| x5z | File System | File system status as it relates to the requested transfer or command. |
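The two tables can be combined into a small lookup that decodes any FTP reply line. The following Python sketch is illustrative only; the function name classify_ftp_reply is an assumption, and the category strings are taken directly from Tables 21-8 and 21-9.

```python
FIRST_DIGIT = {
    "1": "Positive Preliminary Reply",
    "2": "Positive Completion Reply",
    "3": "Positive Intermediate Reply",
    "4": "Transient Negative Completion Reply",
    "5": "Permanent Negative Completion Reply",
}

SECOND_DIGIT = {
    "0": "Syntax",
    "1": "Information",
    "2": "Connections",
    "3": "Authentication and Accounting",
    "4": "Unspecified",
    "5": "File System",
}


def classify_ftp_reply(line):
    """Return (first-digit indicator, second-digit indicator) for a reply line."""
    code = line[:3]
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"reply does not start with a three-digit code: {line!r}")
    return (
        FIRST_DIGIT.get(code[0], "Unknown"),
        SECOND_DIGIT.get(code[1], "Unknown"),
    )
```

For the 250 reply discussed above, the function returns the pair Positive Completion Reply and File System, which matches the reading given in the text.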

NNTP

NNTP provides a user with the ability to submit news articles to a news server. NNTP also enables these articles to be propagated to other news servers, thus providing peer-to-peer discussion groups that are global in nature.

The network news facility was a feature of ARPANET, which preceded today's Internet. Almost all Internet service providers (ISPs) today allow access to some form of news, and there are third-party companies that run news servers for individual subscribers as well as for smaller ISPs that outsource their news provision.

In the early days, "news" included bulletins about the ARPANET, details, supporting information, and actual data. Today, there are tens of thousands of newsgroups available that cover just about any human activity. Some of the news servers are private and require specific permission to access. Some news servers are provided by individual companies (for example, news.microsoft.com) as part of the support of their products. Most ISPs provide news access to their customers.

The basic operation of NNTP is described in RFC 977. Some common extensions to NNTP are described in RFC 2980. The format of NNTP news articles is described in RFC 1036. Although never issued as a formal RFC, Henry Spencer's "Son of 1036" is considered best practice for newsreaders and servers in terms of message formats. You can obtain this document from ftp://ftp.zoo.toronto.edu/pub/news.txt.Z (for the text version) or ftp://ftp.zoo.toronto.edu/pub/news.ps.Z (for the PostScript version).

NNTP is a client/server protocol that runs over TCP. It has many similarities to FTP and SMTP. NNTP is used in two different situations. The first enables an NNTP client program to post and retrieve news messages from a news server's news base. The second enables two NNTP servers to transfer articles, thus propagating them across a network of NNTP servers.

As with other Internet protocols, NNTP has its own unique terminology, some of which is defined in the following list:

NNTP Operation

The basic operation of NNTP is similar to SMTP. Connections are made between a news client and server or between two servers. Once this connection is established, the client (or the initiating server) sends commands to the server, which might include data, such as a news message the client is posting to the server. The server attempts to perform the command and then sends back a response, which might include data, such as a news article that is retrieved and sent to the client.

NNTP Connections

As noted earlier, NNTP connections are initiated over TCP either by a news client (to post or retrieve news) or between two servers (to propagate articles and control messages). NNTP clients use an ephemeral TCP port, whereas the NNTP server port is TCP port 119. When a server contacts a peer, it uses a local ephemeral TCP port as well.

In Windows Server 2003, you can use the built-in program Outlook Express as a news client. Outlook Express is installed automatically, but before you can use it as an NNTP client, you need to configure it using the built-in configuration wizard. There is a large range of Windows-based news clients, including both free and commercial products.

After the client has created the TCP connection with the server, it can issue commands to the news server. Once the client has completed sending commands and receiving responses, it can issue a final command to the server to terminate the news session.

The following trace shows a simple NNTP session between a client and a news server:

Collecting Usenet News from news2.kapoho.net
<- 200 NNTP Service 6.0.3621.0 Version: 6.0.3621.0 Posting Allowed
-> MODE READER
<- 200 NNTP Service 6.0.3621.0 Version: 6.0.3621.0 Posting Allowed
-> XHDR Message-ID
-> XOVER
-> GROUP mct.mentor.announce
-> GROUP mct.mentor.discussion
-> STAT 80
<- 211 0 5 4 mct.mentor.announce
<- 211 38 44 81 mct.mentor.discussion
<- 223 80 USTw+JYDu7b9EAXe@mail.psp.co.uks
-> GROUP mct.mentor.discussion
-> XHDR Message-ID 81
-> NEWGROUPS 020911 230405 GMT
<- 211 38 44 81 mct.mentor.discussion
<- 221 Xhdr information follows
<- 81
<- .
-> ARTICLE
<- 231 New newsgroups follow.
<- .
-> QUIT
<- 220 0 article
<- 205 closing connection - goodbye!
News connection to news2.kapoho.net closed - filing continues
News from news2.kapoho.net completed, 1 articles fetched, 0 posted

This session is from the log of a news client. The corresponding NNTP traffic can be seen in Capture 21-03 on the companion CD-ROM.

NNTP Commands and Responses

NNTP communication occurs between a news client or news server and another news server. The communication comprises one or more commands sent from the initiating system to the news server plus responses received back from the server. The communication can also include news articles being sent from client to server, from server to client, and from server to server.

NNTP Commands

The NNTP client issues commands to the server across the active connection. NNTP commands are a single word followed by, in some cases, a parameter. In command lines that have parameters, the command is separated from the parameter by one or more space or tab characters. The command line must be complete with all parameters and must not contain more than one command. Each command line is terminated by a CRLF pair. A command line can be, at most, 512 characters long (510 characters plus the CRLF pair), and there is no provision for the continuation of command lines.
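The syntax rules above (one command per line, parameters separated by spaces or tabs, a CRLF terminator, and a 512-character limit) can be checked mechanically. The following Python sketch is one possible way to build and parse command lines under those rules; the function names are hypothetical, and normalizing the command word to uppercase is an assumption made here because the examples in this chapter show commands in both cases.

```python
MAX_LINE = 512  # total length, including the terminating CRLF pair


def build_command(command, *params):
    """Return one NNTP command line as bytes, terminated by CRLF."""
    parts = [command, *params]
    for part in parts:
        # A token may not be empty or contain separators or line breaks.
        if not part or any(ch in part for ch in " \t\r\n"):
            raise ValueError(f"invalid token: {part!r}")
    encoded = (" ".join(parts) + "\r\n").encode("ascii")
    if len(encoded) > MAX_LINE:
        raise ValueError("command line exceeds 512 characters")
    return encoded


def parse_command(raw):
    """Split a received command line into (command, list of parameters)."""
    if not raw.endswith(b"\r\n"):
        raise ValueError("command line must end with a CRLF pair")
    if len(raw) > MAX_LINE:
        raise ValueError("command line exceeds 512 characters")
    text = raw[:-2].decode("ascii")
    if "\r" in text or "\n" in text:
        raise ValueError("only one command is allowed per line")
    tokens = text.split()  # splits on runs of spaces or tabs
    if not tokens:
        raise ValueError("empty command line")
    return tokens[0].upper(), tokens[1:]
```

Because there is no continuation mechanism, the sketch rejects an overlong line outright rather than attempting to split it.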

Table 21-10 sets out the NNTP commands defined in RFC 977 and RFC 2980 and describes their functions.

Table 21-10: NNTP Commands

Command and RFC Where Defined

Purpose/Function

ARTICLE (RFC 977)

Sent by the news client to the server to retrieve an article. If the message exists, the server responds with the article's header, a blank line, and then the body of the article.

There are two forms of this command:

ARTICLE [nnn]

This form requests the server to retrieve a specific article based on the internal article number. For example:

ARTICLE 23

This form of the command sets the internal current article pointer to the article if it exists.

ARTICLE [message-id]

This form of the command requests the message with the specified message ID. For example:

ARTICLE <q31vXMSJx8f9EAji@mail.psp.co.uk>

This form of the command does not change the internal current article pointer.

AUTHINFO (RFC 2980)

This command informs a news server of the identity of a user of the server. News clients are required to provide this information when requested by the server, but servers are not required to accept authentication information that is volunteered by the client. News clients must be able to accommodate servers that reject any authentication information volunteered by the client.

There are several forms of this command as listed next.

AUTHINFO USER (RFC 2980)

This command is used by a news client to provide the user name; for example:

AUTHINFO USER thomaslee

AUTHINFO PASS (RFC 2980)

This command is used to transmit a user's password to the news server. This password is transmitted in clear text; for example:

AUTHINFO PASS !sugarmagnolia

AUTHINFO SIMPLE (RFC 2980)

This version of AUTHINFO was a refinement of the original AUTHINFO command and was part of a proposed NNTP version 2 specification. This specification, which was started in 1991, was never completed. This command is implemented in some servers and clients, but it is not supported by IIS 6 or used by Outlook Express.

BODY (RFC 977)

The BODY command is identical to the ARTICLE command except that it returns only the body of the news article.

GROUP (RFC 977)

This command is sent by the client to tell the news server which newsgroup subsequent commands relate to. The server responds with an estimate of the total number of messages that it holds for that group, followed by the article numbers of the first and last articles in the group. This command also sets the internal current article pointer to point to the first article in the group. For example:

group mct.mentor.discussion
211 42 44 85 mct.mentor.discussion

CHECK (RFC 2980)

The CHECK command is used by a news peer to discover if an article with a specified message ID should be sent to the server using the TAKETHIS command. For example:

CHECK

HEAD (RFC 977)

The HEAD command is identical to the ARTICLE command except that it returns only the article's header.

IHAVE (RFC 977)

The IHAVE command is sent by the client to the server to inform the server that the client has an article that it can offer to the server. If the news server does not have the article (and is configured to request it), the server returns a response requesting the entire article (header and body). If the server does not want the article (for example, if that article is already held by the news server), a suitable response is sent.

This command is similar to POST, but differs in that it is intended to be used between servers for propagation. Outlook Express, for example, does not use this command. When this command is used to transfer news articles, changes are made in the PATH: header to ensure that the path the article travels is recorded.

LAST (RFC 977)

This command moves the current article pointer to the previous article in the current group. If the pointer is already at the start of the group, an error is returned.

LIST (RFC 977)

This command is sent to the news server to retrieve a list of all newsgroups held by the server as well as information about the messages held for each group.

The server returns the information as a series of single lines of information per newsgroup held. The format of this information is:

group-name last first p

where group-name is the full name of the newsgroup, last is the article number of the last article held in that group, first is the article number of the first article held in that group, and p is set to either Y or N depending on whether the server will allow posting to the group (Y) or not (N).

LIST ACTIVE (RFC 2980)

This command is the same as the LIST command, except that it accepts an optional pattern parameter. If the pattern is specified, the list is limited to the groups that match it. When the display is completed, the server sends a period on a line by itself. If no groups match the pattern, an empty list is returned. For example:

200 NNTP Service 6.0.3621.0 Version: 6.0.3621.0 Posting Allowed
list active
215 list of newsgroups follow
control.cancel 0 1 y
control.newgroup 0 1 y
control.rmgroup 0 1 y
mct.mentor.announce 4 5 y
mct.mentor.discussion 85 44 y

LIST ACTIVE.TIMES (RFC 2980)

The active.times file is used by some news servers to contain information about who created a particular newsgroup and when.

IIS 6 does not support this command.

LIST DISTRIBUTIONS (RFC 2980)

Some news servers maintain a distributions file that contains information about valid values for the Distribution: line in a news article header and what the values mean. Each line contains two fields, the value and a short explanation of the meaning of the value. When the display is completed, the server sends a period on a line by itself. If the information is not available, the server returns the 503 error response.

IIS 6 does not keep this information, and it returns a 503 message when the command is sent.

LIST DISTRIB.PATS (RFC 2980)

The distrib.pats file is maintained by some news transport systems to contain default values for the Distribution: line in a news article header when posting to particular newsgroups.

IIS 6 does not keep this information.

LIST NEWSGROUPS (RFC 2980)

Most news servers maintain a newsgroups file that contains the name and description of each newsgroup active on the news server. Each line in the file contains two fields, the newsgroup name and a short explanation of the purpose of that newsgroup. When the display is completed, the server sends a period on a line by itself.

For example:

list newsgroups mct*
215 descriptions follow
mct.mentor.announce MCT Mentoring Announcements
mct.mentor.discussion MCT Mentoring general discussion group

LIST OVERVIEW.FMT (RFC 2980)

The overview.fmt file is maintained by some news transport systems to contain the order in which header information is stored in the overview databases for each newsgroup. When the display is completed, the server sends a period on a line by itself.

For example:

list overview.fmt
215 Order of fields in overview database.
Subject:
From:
Date:
Message-ID:
References:
Bytes:
Lines:
Xref:full

LIST SUBSCRIPTIONS (RFC 2980)

This command is used to get a default subscription list for new users of this server.

This command is not supported by IIS 6.

LISTGROUP (RFC 2980)

This command is used to get a listing of all the article numbers in a particular newsgroup. When a valid group is selected by means of this command, the internal current article pointer is set to the first article in the group. The successful selection response is a list of the article numbers in the group followed by a period on a line by itself.

For example:

listgroup mct.mentor.discussion
211
84
85

MODE READER (RFC 2980)

This command is used by a news client to indicate to the server that it is a newsreading client. Some news servers can make use of this information to reconfigure themselves for better performance in responding to newsreader commands. This command is similar to the SLAVE command described in RFC 977, which was not widely implemented.

MODE STREAM (RFC 2980)

MODE STREAM is used by a news peer to indicate to the server that it would like to suspend the lock-step conversational nature of NNTP and send commands in streams. IIS 6 does not support this command.

NEWGROUPS (RFC 977)

This command returns a list of newsgroups created since a specified date and time. The server lists this information in the same format as the LIST command.

For example:

NEWGROUPS 020911 230405 GMT
231 New newsgroups follow.
tcpip.news 0 1 y

NEWNEWS (RFC 977)

This command returns a list of message IDs of the articles posted or received to the specified newsgroup since a specific date. The format of the listing is one message ID per line with a single line consisting solely of one period followed by CRLF at the end of the list.

For example:

NEWNEWS * 020911 000000 GMT
230 list of new articles by message-id follows.
<3BqIqTeIrZg9EAL8@mail.kapoho.com>

POST (RFC 977)

The POST command is used to post a news article to a news server.

QUIT (RFC 977)

This command is sent by the client to the news server to complete the news session and to tear down the TCP connection between server and client. No further commands are accepted after the server receives this command.

SLAVE (RFC 977)

Indicates to the server that this client connection is to a slave server, rather than a user. This command was meant to help a server separate end user connections from peer server connections.

IIS 6 does not support this command.

STAT (RFC 977)

The STAT command is similar to the ARTICLE command except that no text is returned. When selecting by message number within a group, the STAT command sets the current article pointer without sending text. The returned acknowledgment response contains a message ID, which might be of some value to the news client. Most news clients do not use this command.

TAKETHIS (RFC 2980)

News servers use this command to send articles to a peer server when in streaming mode. The header and body are sent immediately after the peer sends the TAKETHIS command. The sender of the TAKETHIS command does not have to wait for a response from the peer server before sending the next command and the associated article. During transmission of the article, the peer sends the entire article, including header and body, in the manner specified for text transmission from the server.

XGTITLE (RFC 2980)

The XGTITLE command is used to retrieve newsgroup descriptions for specific newsgroups. This command is not recognized by IIS 6.

XHDR (RFC 2980)

The XHDR command is used to retrieve specific headers from specific articles. This command takes two parameters. The first is the name of a header line in a news article. The second parameter, which is optional, is either the message ID of a specific message or a range of articles. A range is expressed as a single article number, an article number followed by a dash (to indicate all following articles), or an article number followed by a dash and another article number.

Each line returned contains an article number (or message ID, if a message ID was specified in the command), one or more spaces, and then the value of the requested header in that article. Once the output is complete, a period is sent on a line by itself.

XINDEX (RFC 2980)

The XINDEX command is used to retrieve an index file in the format originally created for use by the TIN newsreader.

This command is not supported by IIS 6.

XOVER (RFC 2980)

The XOVER command returns information from the overview database for the articles specified. Each line of output is formatted with the article number, followed by each of the headers in the overview database or the article itself (when the data is not available in the overview database) for that article separated by a tab character. The sequence of fields must be in this order: subject, author, date, message ID, references, byte count, and line count.

For example:

group tcpip.news
211 3 1 3 tcpip.news
xover
224 Overview information follows
1 test Thomas Lee Fri, 13 Sep 2002 12:22:50 +0100 <7vqrhnyKqcg9EA4f@mail.kapoho.com>

XPAT (RFC 2980)

The XPAT command is used to retrieve specific headers from specific articles, based on pattern matching on the contents of the header.

XPATH (RFC 2980)

The XPATH command is used to determine the file names in which an article is filed. This command is not supported by IIS 6.

XREPLIC (RFC 2980)

The XREPLIC command makes it possible to exactly duplicate the news spool structure of one server in another server. This command works similarly to the IHAVE command specified in RFC 977 and uses the same response codes. The command-line arguments consist of entries separated by a single comma. Each entry consists of a newsgroup name, a colon, and an article number.

This command is generally only used in servers when a receiving server is being fed by only one other server.

XROVER (RFC 2980)

The XROVER command returns reference information from the overview database for the articles specified.

This command is not supported by IIS 6.

XTHREAD (RFC 2980)

The XTHREAD command is used to retrieve threading information in a format originally created for use by the TRN newsreader.

This command is not supported by IIS 6.
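Several entries in Table 21-10 describe fixed line formats: the 211 reply to GROUP, the per-group lines returned by LIST, and the range argument accepted by XHDR. As a rough illustration, the following Python sketch parses or builds each of them. All type and function names are hypothetical, and the field order of the GROUP reply (estimated count, first article, last article, group name) is the one defined in RFC 977.

```python
from typing import NamedTuple


class GroupStatus(NamedTuple):
    estimated_count: int
    first: int
    last: int
    name: str


class ListEntry(NamedTuple):
    name: str
    last: int
    first: int
    posting_allowed: bool


def parse_group_reply(line):
    """Parse a '211 n f l group' reply to the GROUP command."""
    code, count, first, last, name = line.split()[:5]
    if code != "211":
        raise ValueError(f"not a successful GROUP reply: {line!r}")
    return GroupStatus(int(count), int(first), int(last), name)


def parse_list_line(line):
    """Parse one 'group-name last first p' line from a LIST response."""
    name, last, first, flag = line.split()
    return ListEntry(name, int(last), int(first), flag.lower() == "y")


def article_range(first, last=None, open_ended=False):
    """Format the optional range argument used by commands such as XHDR."""
    if last is not None:
        return f"{first}-{last}"  # a closed range, for example 44-85
    # A trailing dash means "this article and all following articles".
    return f"{first}-" if open_ended else str(first)
```

Applied to the GROUP example in the table, parse_group_reply reports an estimated 42 articles numbered from 44 to 85.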

NNTP Responses

A news server issues two types of responses to client commands: textual messages and numeric status codes. Status responses are intended to be interpreted by the news client program before anything is displayed. The text messages, on the other hand, are meant to be displayed by the news client at the user's computer.

The textual responses are sent only after a numeric status response line has been sent to the client to indicate that text will follow. Text is sent as a series of successive lines of textual matter, each terminated with a CRLF pair. Usually just one line is sent, but RFC 977 allows for longer textual responses. A single line containing only a period (.) is sent to indicate the end of the text (that is, the server sends a CRLF pair at the end of the last line of text, a period, and another CRLF pair). You can see the text messages in the command examples previously shown in Table 21-10.
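The framing described above (a status line, zero or more text lines, and a terminating line that contains only a period) can be sketched as follows in Python. The function name is hypothetical. The sketch also undoes the doubling of a leading period that RFC 977 applies to text lines beginning with a period, a detail this chapter does not otherwise cover.

```python
def split_text_response(data):
    """Return (status_line, text_lines) from a complete multi-line response."""
    lines = data.split(b"\r\n")
    status = lines[0].decode("ascii")
    text = []
    for raw in lines[1:]:
        if raw == b".":
            # A line holding only a period marks the end of the text.
            return status, text
        if raw.startswith(b".."):
            # The sender doubled a leading period; restore the original line.
            raw = raw[1:]
        text.append(raw.decode("ascii", errors="replace"))
    raise ValueError("response is not terminated by a single period line")
```

The function expects the whole response to have been read already; a real client would keep reading from the connection until the terminating line arrives.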

Like other Internet-type services, NNTP servers issue numeric codes in response to client commands. These extensible reply codes are sent in the form of a three-digit number, with the value of the first and second digits indicating the type of response.

Table 21-11: NNTP Response Codes—First-Digit Values

| First-Digit Value | Indicator | Description |
| --- | --- | --- |
| 1yz | Informative message. | These are general information messages. |
| 2yz | Command OK. | The command was accepted and has been acted on. A 2yz reply might be followed by output. |
| 3yz | The server indicates that the command is OK so far and that the remainder can be sent. | The command was accepted but the server is expecting more, which can now be sent. For example, after a client sends a POST command, the server waits for the rest of the news article and replies "340 Continue posting - terminate with period". |
| 4yz | Command was correct, but could not be performed for some reason. | These are usually transient errors. |
| 5yz | Command unimplemented, incorrect, or a serious program error occurred. | These are either transient server errors (for example, no more news can be posted because the server is temporarily out of disk space) or more serious errors (for example, a client sends a command that is not implemented, to which IIS 6 responds with a 500 Command Not Recognized response). |

Tables 21-11 and 21-12 list and describe the first- and second-digit values.

Table 21-12: NNTP Response Codes—Second-Digit Values

| Second-Digit Value | Indicator | Description |
| --- | --- | --- |
| x0z | Connection, setup, and miscellaneous messages | Typically issued when a news client connects to a news server, for example at initial connection or after a mode reader command is sent. For example, the command "mode reader" returns "200 NNTP Service 6.0.3621.0 Version: 6.0.3621.0 Posting Allowed". |
| x1z | Newsgroup selection | Relates to a newsgroup and is seen, for example, when sending the GROUP command or LIST command. For example, the command "group tcpip.news" returns "211 3 1 3 tcpip.news". |
| x2z | Article selection | Relates to commands that select a particular article, such as the STAT command. For example, the command "stat" returns "223 1 <7vqrhnyKqcg9EA4f@mail.psp.co.uk>". |
| x3z | Distribution functions | Typically seen in replies to distribution-type commands. For example, the command "NEWGROUPS 020911 230405 GMT" returns "231 New newsgroups follow.", then the line "tcpip.news 3 1 y", then a line containing only a period. |
| x4z | Posting | These reply codes relate to the process of posting a news message, such as with the POST command. For example, the command "POST" returns "340 Continue posting - terminate with period"; after the client sends the terminating period, the server returns "240 Article Posted OK". |
| x8z | Private vendor extensions | These replies allow vendors to create additional commands. |
| x9z | Debugging output | These replies are for debugging purposes, typically for debugging news server programs. |
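A news client can use the first digit of a status code to decide what to do next and the second digit to see which function the reply concerns. The following Python sketch shows one way to do this; the names are hypothetical, the category strings are condensed from Tables 21-11 and 21-12, and the two helper predicates are a simplification rather than a complete client policy.

```python
FIRST_DIGIT = {
    "1": "Informative message",
    "2": "Command OK",
    "3": "Command OK so far; send the remainder",
    "4": "Command correct but could not be performed",
    "5": "Command unimplemented, incorrect, or serious program error",
}

SECOND_DIGIT = {
    "0": "Connection, setup, and miscellaneous messages",
    "1": "Newsgroup selection",
    "2": "Article selection",
    "3": "Distribution functions",
    "4": "Posting",
    "8": "Private vendor extensions",
    "9": "Debugging output",
}


def parse_status(line):
    """Return (code, first-digit meaning, second-digit meaning, text)."""
    code, _, text = line.strip().partition(" ")
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"not a status line: {line!r}")
    return (
        int(code),
        FIRST_DIGIT.get(code[0], "Unknown"),
        SECOND_DIGIT.get(code[1], "Unknown"),
        text,
    )


def expects_more(code):
    """True when the server is waiting for more input, as after POST (3yz)."""
    return 300 <= code < 400


def is_error(code):
    """True for the 4yz and 5yz replies."""
    return 400 <= code < 600
```

For the reply 340 Continue posting, expects_more returns True, which tells the client to send the article followed by the terminating period.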

SMTP

For many users, e-mail has replaced traditional mail as a form of communication. The contents of this book were sent among the authors, editors, and publisher using e-mail. E-mail is seen by some to be the killer application for the Internet. SMTP is used to transmit mail between mail servers and from mail clients to mail servers. Most mail clients use a different protocol, either Post Office Protocol 3 (POP3) or Internet Message Access Protocol (IMAP), to retrieve mail.

SMTP is designed to do exactly what its name implies—provide reliable, efficient mechanisms for the transfer of electronic mail. SMTP transfers messages from a mail client to a mail server and between mail servers. SMTP is not, however, responsible for managing mailboxes or for allowing a client to download incoming mail. RFC 821 defines SMTP, although features and refinements have been added in numerous subsequent RFCs. Although SMTP uses familiar terminology, a few terms might be unknown and are defined here:

SMTP Operation

Although numerous enhancements have been added to SMTP since RFC 821, it remains a fairly simple client/server protocol. SMTP, like HTTP and FTP, is an Application Layer protocol that relies on underlying protocols to ensure data delivery. Although SMTP can in theory utilize other protocols, TCP is the most common and can be assumed to be the underlying protocol throughout this section.

SMTP communication is initiated by a user's mail system, the SMTP client. The SMTP client establishes a mail session with an SMTP server by opening a TCP connection to the server, then issuing either a HELO or EHLO SMTP command to begin a session. Extended implementations of SMTP, such as that included with IIS 6, can be configured to require the client to provide authentication credentials that verify the client is permitted to use the SMTP server. Most often, these are simply a user name and password that are recognized by the receiving system.

After the transmission channel has been established, the SMTP client issues a MAIL command that informs the SMTP server that it wants to send mail. If the server is capable of receiving mail at that time, it responds with an OK reply. The SMTP client then issues one or more RCPT commands that identify the recipients of the messages it wants to send; each RCPT command represents a single mail recipient. The recipients can be other users in the same mail system or users in external domains.

If the SMTP server is capable of receiving mail addressed to the recipient named in the RCPT command, it issues an OK reply to the client and the client is free to issue another RCPT command. If the SMTP server is not capable of delivering mail to the designated recipient, it returns an error reply to the client and the client can then move on to the next command. The command/reply sequence is strictly ordered; the client must receive a single reply before it can issue another command, and a server is not permitted to issue more than one reply to any command.

Because not all recipients use the same SMTP system, the client must provide the name of the ultimate destination host as well as the mailbox name in that mail system. The syntax of SMTP mail addresses is the familiar username@domain format, where the information to the right of the "@" symbol identifies the destination host, and the user name identifies the mailbox to which the mail should be delivered. SMTP differentiates between sending and mailing. If mail is sent, the client is designating that the mail should be delivered immediately to the recipient's mail interface, provided the recipient is online and using a mail system that supports this functionality. More often, however, mail is mailed, which designates that it should be delivered to the recipient's mailbox on a receiving server. The send functionality is not a required part of an SMTP implementation, and it can be assumed that this chapter refers to the mail functionality unless specified otherwise.

SMTP mail has both a forward-path and a reverse-path. The forward-path is the path that the mail must take to reach its final destination, whether it uses a direct path or a series of relays. It is important not to confuse SMTP relays with routers; SMTP relays are SMTP servers that can receive mail from one SMTP host and forward that mail to another SMTP host, independent of underlying routing mechanisms. The reverse-path in an SMTP mail message identifies the sender. It can be as simple as username@domain, or it can consist of a list of relay hosts between the original sender and the current receiver SMTP. The MAIL command takes the reverse-path as its argument, and the RCPT command takes the forward-path. If multiple recipients' mailboxes reside on the same SMTP host, SMTP encourages sending a single copy of the mail to that host.
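One way a sending system might arrange for a single copy per destination host is to group recipient addresses by the host part before opening any connections. The sketch below is only an illustration; group_by_host is a hypothetical helper, and it assumes simple username@domain addresses without relay lists.

```python
from collections import defaultdict


def group_by_host(recipients):
    """Group recipient addresses by destination host."""
    groups = defaultdict(list)
    for address in recipients:
        # The host is the text to the right of the last "@".
        # Host names are compared without regard to case.
        host = address.rpartition("@")[2].lower()
        groups[host].append(address)
    return dict(groups)
```

Each key of the returned dictionary then corresponds to one SMTP transaction carrying one RCPT command per address in its list.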

Once the receiver SMTP has accepted the recipient addresses and provided the appropriate reply, the client is free to issue the DATA command, which informs the server of its intent to begin transferring the mail message. The server replies with a code accepting the sender's intent, and the client then sends the data. The mail data includes not only the body of the mail, but also the memo header information, such as the To:, cc:, bcc:, and Subject: lines. If the transfer of the mail data is successful, the server replies with a message indicating receipt and processing, and the client can then issue commands to terminate the transmission connection.

If the sender specifies invalid destination information in the forward-path of the mail, but the server knows the correct destination, the server can reply to the sender with a message allowing the client to correct the error. When the client wishes to terminate the SMTP session, it indicates this by issuing the QUIT command, and the server then closes the transmission connection.

A typical SMTP session might look similar to this:

220-mail.kapoho.com; Sun, 2115 Sep 2001 15:36:29 -0100 (BST)
HELO kapoho10.kapoho.com
250 Pleased to meet you
MAIL FROM:
250 ... Sender ok
RCPT TO:
250 ... Recipient ok
RCPT TO:
250 ... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
This is a test post.
.
Message accepted for delivery
QUIT
221 mail.kapoho.com closing connection
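The client side of a session like this one can be sketched as a function that produces the lines a client would send, in order. This is a simplified illustration only: client_commands and its parameters are hypothetical names, the sketch does not wait for or inspect server replies, and it assumes that no body line consists of a single period.

```python
def client_commands(client_host, sender, recipients, body_lines):
    """Return, in order, the lines a client would send for one message."""
    lines = [f"HELO {client_host}", f"MAIL FROM:<{sender}>"]
    # One RCPT command is issued per recipient.
    lines += [f"RCPT TO:<{recipient}>" for recipient in recipients]
    lines.append("DATA")
    lines += list(body_lines)
    # A period on a line by itself marks the end of the mail data.
    lines.append(".")
    lines.append("QUIT")
    return lines
```

A real client would read the server's reply after each command and continue only if the reply code permits it.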

SMTP Commands

The SMTP client issues SMTP commands to the SMTP server. Each command follows a straightforward syntax, as shown here (brackets indicate optional command parameters):

command [parameters]
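The command syntax, a command word followed by optional parameters, can be illustrated with a small parser. The function name parse_command is hypothetical, and the sketch assumes that the command word is separated from its parameters by a single space and is not case-sensitive.

```python
def parse_command(line):
    """Split a command line into (command word, parameters or None)."""
    # Everything before the first space is the command word.
    verb, _, parameters = line.strip().partition(" ")
    # Commands such as QUIT or DATA carry no parameters.
    return verb.upper(), parameters.strip() or None
```

With this sketch, "RCPT TO:<user@example.com>" parses to the command word RCPT with the parameter string "TO:<user@example.com>", and "QUIT" parses to QUIT with no parameters.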

The SMTP client process issues commands to perform functions, such as opening a transmission channel or initiating a mail transfer. In turn, the SMTP server process attempts to execute the command and then returns a response. Commands can be issued individually or as part of a series of commands, but each command must be followed by a reply from the SMTP server. Table 21-13 lists common SMTP commands, their descriptions, and their syntax.

Table 21-13: Common SMTP Commands, Descriptions, and Syntax

| Command | Description | Syntax |
|---|---|---|
| ATRN | AUTHENTICATED TURN. If the session between an SMTP client and SMTP server has been authenticated (the user has provided valid identification credentials), this specifies that the SMTP server must either return an OK reply and assume the role of sender for the mail, or return a refusal (502, Command Not Implemented) and retain the role of SMTP server. | ATRN [ domain name [","domain name]] |
| AUTH | AUTHENTICATE. Used to begin an authenticated mail transfer session (where a user can provide a user name and password to the SMTP server to continue the session). | AUTH LOGIN |
| DATA | DATA. The lines following this command are specified as mail data from the SMTP client to server. | DATA |
| EHLO | EXTENDED HELLO. A client that supports SMTP extensions issues this command rather than the HELO command when initiating a session. If the SMTP server receiving this command supports SMTP extensions, it returns a 250 (Requested Mail Action Okay, Completed) response. If the SMTP server receiving the command does not support SMTP extensions, it returns a 500 (Syntax Error, Command Unrecognized) message, which indicates to the sender that it cannot use extended SMTP commands. | EHLO |
| ETRN | EXTENDED TURN. An extended SMTP command that requests the server to begin processing its mail queues for messages waiting at the server to be delivered to the client. | ETRN [] |
| EXPN | EXPAND. Asks the SMTP server to verify that the argument passed is a mailing list. If the argument does represent a mailing list, the membership of the list is returned to the SMTP client, in the form of users' full names and mailboxes. This can represent a security risk, and many SMTP servers allow the administrator to disable this command. | EXPN |
| HELO | HELLO. Used to identify the SMTP client to the SMTP server and begin a new session. | HELO |
| HELP | HELP. Causes the SMTP server to return help information to the sender; this command might or might not contain arguments. | HELP [ ] |
| MAIL | MAIL. Used to initiate a mail transaction between SMTP client and server. It clears the reverse-path buffer, forward-path buffer, and mail buffer, and inserts the reverse-path argument from this command into the reverse-path buffer. | MAIL |
| NOOP | NO OP. Has no effect on any buffers and specifies no action other than that the SMTP server return an OK reply. | NOOP |
| QUIT | QUIT. Specifies that the receiver return an OK reply and close the transmission channel. | QUIT |
| RCPT | RECIPIENT. Identifies the recipient of the mail being sent; multiple recipients are specified by repeated issuing of the command. | RCPT TO: |
| RSET | RESET. Specifies that the current mail transaction be aborted and all buffers be cleared. The SMTP server responds with an OK message. | RSET |
| SAML | SEND AND MAIL. Initiates a transaction specifying that mail data be delivered to any recipient named who is actively connected and capable of receiving mail, as well as to the mailboxes of the specified recipients. Clears the reverse-path buffer, the forward-path buffer, and the mail buffer, and inserts the reverse-path information provided with the command into the reverse-path buffer. | SAML FROM: |
| SEND | SEND. Initiates a transaction specifying that mail data be immediately delivered to any recipient named who is actively connected and capable of receiving mail. If a recipient is not connected or capable of receiving mail, a 450 (Mailbox Unavailable) response is returned. Clears the reverse-path buffer, the forward-path buffer, and the mail buffer, and inserts the reverse-path information provided with the command into the reverse-path buffer. | SEND FROM: |
| SIZE | SIZE. Allows the sender to specify the size of the mail that it wants to send, which the server can refuse if the size is too large. Only valid in SMTP implementations that support service extensions. | SIZE 1000000 or MAIL FROM: SIZE = 100000 |
| SOML | SEND OR MAIL. Initiates a transaction specifying that mail data be immediately delivered to any recipient named who is actively connected and capable of receiving mail. If a recipient is not connected or capable of receiving mail, specifies delivery to the recipient's mailbox. Clears the reverse-path buffer, the forward-path buffer, and the mail buffer, and inserts the reverse-path information provided with the command into the reverse-path buffer. | SOML FROM: |
| TURN | TURN. Specifies that the SMTP server must either return an OK reply and assume the role of sender (SMTP client) for the mail, or return a refusal (502) and retain the role of SMTP server. | TURN |
| VRFY | VERIFY. Requests that the receiver SMTP verify the user name specified in the argument. If the user name is valid, the full name and mailbox of the user are returned. This has no effect on the reverse-path buffer, forward-path buffer, or mail buffer. | VRFY |
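Several entries in Table 21-13 refer to the reverse-path buffer, the forward-path buffer, and the mail buffer. The following sketch models that state to show how MAIL and RSET clear it. The class and method names are hypothetical, and the behavior shown for RCPT (appending to the forward-path buffer) is taken from RFC 821 rather than from the table.

```python
class MailTransaction:
    """Minimal model of the three buffers a receiver SMTP maintains."""

    def __init__(self):
        self.reset()

    def reset(self):
        # RSET: abort the current transaction and clear all buffers.
        self.reverse_path = None
        self.forward_paths = []
        self.mail_data = []

    def mail(self, reverse_path):
        # MAIL: clear all buffers, then store the new reverse-path.
        self.reset()
        self.reverse_path = reverse_path

    def rcpt(self, forward_path):
        # RCPT: add one recipient; repeated once per recipient.
        self.forward_paths.append(forward_path)

    def data(self, lines):
        # DATA: the lines that follow are collected as mail data.
        self.mail_data.extend(lines)
```

Commands such as NOOP and VRFY would leave an instance of this class unchanged, matching their descriptions in the table.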

SMTP Replies

SMTP replies are issued by the SMTP server in response to commands sent by the SMTP client. Every command must generate one and only one reply. Similar to the response codes issued by FTP servers, SMTP receivers issue a three-digit code number followed by descriptive text. As in FTP, the first digit of the response code indicates the general type of response, and the second digit provides additional information within that response category. Tables 21-14 and 21-15 list the first- and second-digit values and their meanings.

Table 21-14: SMTP Response Codes—First-Digit Values

| First-Digit Value | Indicator | Description |
|---|---|---|
| 1yz | Positive Preliminary Reply | The command has been accepted, but the requested action is held pending confirmation of this reply and further instruction as to whether the receiver SMTP should continue or abort processing. However, there are no SMTP commands that allow this type of reply, so there are no continue or abort commands. |
| 2yz | Positive Completion Reply | The requested action has been completed and another command can now be issued. |
| 3yz | Positive Intermediate Reply | The command has been accepted and is being held, pending receipt of further information from the SMTP client. |
| 4yz | Transient Negative Completion Reply | A transient error has occurred that prevented processing of the command, and the SMTP client should reissue the command (or command sequence). |
| 5yz | Permanent Negative Completion Reply | An error has occurred that prevented processing of the command; the command (or command sequence) should not be reissued without modification. This modification can be as simple as correcting a misspelling, or this error might indicate a nontransient server error. |

Table 21-15: SMTP Response Codes—Second-Digit Values

| Second-Digit Value | Indicator | Description |
|---|---|---|
| x0z | Syntax | There is a syntax error, the command issued is unimplemented by the server, or the server does not recognize the command category. |
| x1z | Information | Replies to information requests, such as "Help." |
| x2z | Connections | Replies referring to the transmission channel. |
| x5z | Mail System | Mail system status as it relates to the requested transfer or command. |
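The two tables can be combined into a small lookup that classifies a reply line by its first and second digits. This is only a sketch: classify_reply is a hypothetical name, and second-digit values that do not appear in Table 21-15 are reported here as "Unspecified".

```python
FIRST_DIGIT = {
    "1": "Positive Preliminary Reply",
    "2": "Positive Completion Reply",
    "3": "Positive Intermediate Reply",
    "4": "Transient Negative Completion Reply",
    "5": "Permanent Negative Completion Reply",
}

SECOND_DIGIT = {
    "0": "Syntax",
    "1": "Information",
    "2": "Connections",
    "5": "Mail System",
}


def classify_reply(line):
    """Return (first-digit indicator, second-digit indicator) for a reply."""
    code = line[:3]
    if len(code) != 3 or not code.isdigit() or code[0] not in FIRST_DIGIT:
        raise ValueError(f"not an SMTP reply line: {line!r}")
    # Second digits missing from the table fall back to "Unspecified".
    return FIRST_DIGIT[code[0]], SECOND_DIGIT.get(code[1], "Unspecified")
```

For instance, the 354 reply to the DATA command classifies as a Positive Intermediate Reply in the Mail System category, and the 221 reply to QUIT classifies as a Positive Completion Reply in the Connections category.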

Since RFC 821 was published, extensions to the protocol have been introduced, including the following:

Summary

IIS 6 provides enhanced support for Web-based services through its implementations of the HTTP/1.1, NNTP, SMTP, and FTP protocols. HTTP/1.1 improves on earlier implementations of HTTP in its support for multiple requests over a single connection, header compression, authentication mechanisms, and enhanced caching and proxy definitions. NNTP allows the transmission, storage, and replication of news messages. SMTP allows an IIS server to send and receive electronic messages on behalf of the clients it serves, facilitating e-mail communication for companies that might not require a full-fledged messaging system.

In IIS 6, SMTP is implemented as a secure protocol, allowing for authentication and verification mechanisms. The implementation of FTP in IIS 6 has also been improved, providing FTP restart, which allows a lost download connection to be resumed at the point at which it left off. By complying with the most recent standards for each of these protocols, IIS 6 ensures that Web services can be provided as quickly and efficiently as possible.
