Email Topics
Internet email is a complex subject with many aspects. There are important principles that apply when administering an email system regardless of the MTA you are working with. This section presents a few concepts that will help in understanding later explanations in the book, but you are urged to learn as much about Internet email as possible from the many resources available in books and online.
2.2.1 RFCs
RFCs, or Request for Comments documents, define the standards for the Internet. There are several RFCs relating to Internet email, all of which are relevant to you if you are administering an email system on the Internet. The two most commonly referenced RFCs for email are RFC 821 and RFC 822, which deal, respectively, with how email messages are transferred between systems and how email messages should appear. These documents were put into effect more than 20 years ago. They were updated in April 2001 with the proposed standards RFC 2821 and RFC 2822, although you will still see many references to the original documents. RFC documents are maintained by the Internet Engineering Task Force, whose site is available at http://www.ietf.org/.
2.2.2 Email Agents
Chapter 1 introduced several of the email agents involved in taking a message from composition to final delivery. For convenience, Table 2-1 contains a summary of these agents.
Agent | Name | Purpose |
---|---|---|
MUA | Mail User Agent | Email client software used to compose, send, and retrieve email messages. Sends messages through an MTA. Retrieves messages from a mail store either directly or through a POP/IMAP server. |
MTA | Mail Transfer Agent | Server that receives and delivers email. Determines message routing and possible address rewriting. Locally delivered messages are handed off to an MDA for final delivery. |
MDA | Mail Delivery Agent | Program that handles final delivery of messages for a system's local recipients. MDAs can often filter or categorize messages upon delivery. An MDA might also determine that a message must be forwarded to another email address. |
2.2.3 The Postmaster
An email administrator is commonly referred to as a postmaster. An individual with postmaster responsibilities makes sure that the mail system is working correctly, makes configuration changes, and adds/removes email accounts, among other things. You must have a postmaster alias at all domains for which you handle email that directs messages to the correct person or persons. RFC 2142 specifies that a postmaster address is required.
2.2.4 Reject or Bounce
If a receiving MTA determines during the SMTP conversation (see Section 2.2.8 later in the chapter) that it will not accept the message, it rejects the message. At that point the sending system should generate an error report to deliver to the original sender. Sometimes the MTA accepts a message and later discovers that it cannot be delivered; perhaps the intended recipient doesn't exist or there is a problem in the final delivery. In this case, the MTA that has accepted the message bounces it back to the original sender by sending an error report, usually including the reason the original message could not be delivered.
The MTA that accepts a message takes responsibility for the message until it is delivered or handed off to another MTA. When a system is responsible for a message and cannot deliver or relay it, the responsible system informs the sender that the mail is undeliverable.
2.2.5 Envelope Addresses and Message Headers
A common source of confusion for email users is the fact that the To: address in email message headers has nothing to do with where a message is actually delivered. The envelope address controls message delivery. In practice, when you compose a message and provide your MUA with a To: address, your MUA uses that same address as the envelope destination address, but this is not required, nor is it always the case. From the MTA's point of view, message headers are part of the content of an email message. The delivery of a message is determined by the addresses specified during the SMTP conversation. These addresses are the envelope addresses, and they are the only thing that determines where messages go. See Section 2.2.8 later in the chapter for an explanation of the SMTP protocol.
Mailing lists and spam are common examples of when the envelope destination address differs from the To: address of the message headers. For more information, see RFC 2821 and RFC 2822. Also see Section 2.2.7 later in the chapter for more information about the format of email messages. If you follow the SMTP session in Example 2-2, try substituting any address you want in the To: field of the message contents to see that it has no effect on where the message is delivered.
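The separation between envelope and headers can be sketched in a few lines of Python. This is only an illustration that builds the client side of an SMTP conversation as text, without contacting any server; the function name smtp_commands and all of the addresses are hypothetical, and the sketch ignores details such as escaping body lines that begin with a period.

```python
def smtp_commands(envelope_from, envelope_to, message_text):
    """Return the client-side SMTP lines for one delivery (no network I/O)."""
    lines = [f"MAIL FROM:<{envelope_from}>"]
    # One RCPT TO per envelope recipient; the To: header is never consulted.
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in envelope_to]
    lines.append("DATA")
    # The headers travel inside the DATA payload as ordinary message content.
    lines += message_text.splitlines()
    lines.append(".")
    return lines


# Hypothetical mailing-list message: the To: header names the list,
# while the envelope names an individual subscriber.
message_text = (
    "From: list-owner@example.com\n"
    "To: announce@example.com\n"
    "Subject: Envelope versus header\n"
    "\n"
    "The To: header names the list, not the subscriber.\n"
)

commands = smtp_commands(
    "bounces@example.com", ["subscriber@example.net"], message_text
)
print("\n".join(commands))
```

Only the RCPT TO line decides where the message goes; the To: line is carried along as part of the content.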
2.2.6 Local Parts of Email Addresses
RFC 2822 describes the format of email addresses in great detail. It specifies how things such as quoting and comments should work in email addresses. If we ignore the more obscure details, a simple email address is generally composed of three parts: the local part (which is usually a username), the @ separator, and the domain name. The local part might also be an alias to another address or to a mailing list. The local part is sometimes referred to as the lefthand side (LHS), and the domain is sometimes called the righthand side (RHS). For more information, see RFC 2822.
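For the simple case described above, splitting an address into its two sides might look like the following Python sketch. The function name split_address is hypothetical, and the code deliberately ignores the quoting and comment rules of RFC 2822; it splits at the last @ only because a quoted local part may itself contain that character.

```python
def split_address(address):
    """Split a simple address into (local part, domain) at the last '@'."""
    local_part, separator, domain = address.rpartition("@")
    if not separator or not local_part or not domain:
        raise ValueError(f"not a simple email address: {address!r}")
    return local_part, domain


lhs, rhs = split_address("kdent@mail.example.com")
print(lhs)  # local part (lefthand side)
print(rhs)  # domain (righthand side)
```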
2.2.7 Email Message Format
Since RFC 822 was the document that originally described how Internet email messages should be formatted, messages are commonly referred to as "in the RFC 822 format" or as an "RFC 822 message." You should understand the basics of the format since it is referred to in this book and you will likely see it elsewhere. I'll use the newer proposed standard and refer to "RFC 2822 messages."
2.2.7.1 RFC 2822 messages
RFC 2822 specifies the format of both email messages and email addresses as they appear in message headers (but not envelope addresses). The specification describes the format for transmission, but many implementations use the same or a similar format to store messages. A message consists of two parts: the header and the body. The header contains specific fields with names such as To, From, or Subject followed by a colon (:). After the colon comes the contents of the field. One message header field can span multiple lines. Lines that continue a field start with whitespace characters (space or tab characters) to show that they are continuations of the previous line.
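A rough Python sketch of these rules follows. It assumes a message that uses plain newline characters and relies on the empty line that separates the header from the body; the function name parse_message is hypothetical, and the unfolding is simplified (the whitespace that starts a continuation line is collapsed to a single space).

```python
def parse_message(raw):
    """Split a message into (header fields, body), unfolding continuations."""
    # The first empty line ends the header; everything after it is the body.
    header_text, _, body = raw.partition("\n\n")
    fields = []
    for line in header_text.split("\n"):
        if line[:1] in (" ", "\t") and fields:
            # Continuation line: it belongs to the previous field.
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            # New field: the name ends at the first colon.
            name, _, value = line.partition(":")
            fields.append((name, value.strip()))
    return fields, body


raw = (
    "Received: from mail.oreilly.com (mail.oreilly.com [192.168.145.34])\n"
    "\tby mail.example.com (Postfix) with SMTP id 5FA26B3DFE\n"
    "Subject: Have you read RFC 2822?\n"
    "\n"
    "This is the start of the body of the message.\n"
)

fields, body = parse_message(raw)
for name, value in fields:
    print(f"{name} -> {value}")
```

For real mail handling, a tested parser such as the one in Python's standard email package is a better choice than a hand-written one like this.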
The standard document provides a lot of detail about the header fields and what they should be used for. There are rules about how fields relate to each other and when one or another must be used, but in the simplest case, the only required fields are the Date: and the From: fields. The standard also provides for customized fields that a particular email implementation might want to create for its own use.
The header fields are separated from the message body by an empty line. The body of a message contains the contents of the message itself. The body is purposely free-form, but should contain only ASCII characters. Some defined headers have a prescribed structure that is more restricted than the body. Binary files, such as images or executables, must be converted in some way to ASCII characters, so they can be sent in compliance with the standard. Other standards such as MIME encoding or traditional uuencoding deal with converting such files for mailing. Example 2-1 shows a typical message with headers and body.
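As an illustration of converting binary data to ASCII, the following Python sketch uses base64, one of the encodings that MIME defines. The sample bytes are an arbitrary stand-in for an image or executable, and a real MIME message would also need the appropriate headers, which are not shown here.

```python
import base64

# Arbitrary stand-in for the contents of a binary file.
binary_data = bytes(range(256))

# encodebytes() produces ASCII output broken into short lines,
# which makes it suitable for placing in a message body.
encoded = base64.encodebytes(binary_data).decode("ascii")
print(encoded)

# The receiving side reverses the conversion to recover the original bytes.
decoded = base64.decodebytes(encoded.encode("ascii"))
```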
Example 2-1. Email message format
Return-Path:
Delivered-To: kdent@mail.example.com
Received: from mail.oreilly.com (mail.oreilly.com [192.168.145.34])
    by mail.example.com (Postfix) with SMTP id 5FA26B3DFE
    for ; Mon, 8 Apr 2003 16:40:29 -0400 (EDT)
Date: Mon, 8 Apr 2003 15:38:21 -0500
From: Customer Service
To:
Reply-To:
Message-ID: <01a4e2238200842@mail.oreilly.com>
Subject: Have you read RFC 2822?

This is the start of the body of the message. It could
continue for many lines, but it doesn't.
The fields in the example are mostly self-explanatory. The Received: header is not required by RFC 2822, but every MTA that handles a message normally prepends a Received: header to the message, as discussed in RFC 2821, which is described in the following section.
2.2.8 The SMTP Protocol
The SMTP protocol is defined in RFC 2821. The protocol is actually quite simple to follow, and was designed to be easily comprehensible both to humans and computers. A client connects to an SMTP server, whereupon the server begins the SMTP conversation, which consists of a series of simple commands and replies, including the transmission of the email message. The best way to understand the protocol is to see it in action. You can easily try it yourself once you have your mail server set up. Using a Telnet client, you can pose as a delivering MTA. Example 2-2 shows the steps and the basic commands to deliver a message.
Example 2-2. Email message delivery
$ telnet mail.example.com 25
Trying 10.232.45.151
Connected to mail.example.com.
Escape character is '^]'.
220 mail.example.com ESMTP Postfix
HELO mail.oreilly.com
250 mail.oreilly.com
MAIL FROM:
250 Ok
RCPT TO:
250 Ok
DATA
354 End data with <CR><LF>.<CR><LF>
Date: Mon, 8 Apr 2003 15:38:21 -0500
From: Customer Service
To:
Reply-To:
Message-ID: <01a4e2238200842@mail.oreilly.com>
Subject: Have you read RFC 2822?

This is the start of the body of the message. It could
continue for many lines, but it doesn't.
.
250 Ok: queued as 5FA26B3DFE
quit
221 Bye
Connection closed by foreign host.
$
The SMTP session depicted in Example 2-2 is actually the delivery that produced the sample message in Example 2-1. To follow the example yourself, start by using a Telnet client to connect to the mail server on port 25 at mail.example.com. You should connect to your own Postfix server and type in your own email addresses for the envelope addresses. Port 25 is the well-known port for SMTP servers. After the Telnet messages:
Trying 10.232.45.151
Connected to mail.example.com.
Escape character is '^]'.
the server greets you with its banner:
220 mail.example.com ESMTP Postfix
SMTP server replies, such as the greeting message, always start with a three-digit response code, usually followed by a short message for human consumption. Table 2-2 provides the reply code levels and their meanings. The first digit of the response code is enough to know the status of the requested command. In documentation the response codes are often written as 2xx to indicate a level 200 reply.
Code level | Status |
---|---|
2xx | The requested action was successful. The client may continue to the next step. |
3xx | Command was accepted, but the server expects additional information. The client should send another command with the additional information. |
4xx | The command was not successful, but the problem is temporary. The client should retry the action at a later time. |
5xx | The command was not successful, and the problem is considered permanent. The client should not retry the action. |
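Because the first digit carries the status, a client can classify any reply with very little code. The Python sketch below is one way to do it; the function name reply_status and the wording of the status strings are illustrative, and the 450 reply text in the usage lines is made up for the example.

```python
# Status summaries keyed by the first digit of the reply code.
STATUS_BY_FIRST_DIGIT = {
    "2": "success",
    "3": "more information expected",
    "4": "temporary failure, retry later",
    "5": "permanent failure, do not retry",
}


def reply_status(reply_line):
    """Classify an SMTP reply line by the first digit of its code."""
    code = reply_line[:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an SMTP reply: {reply_line!r}")
    return STATUS_BY_FIRST_DIGIT.get(code[0], "unknown reply level")


print(reply_status("220 mail.example.com ESMTP Postfix"))
print(reply_status("250 Ok: queued as 5FA26B3DFE"))
print(reply_status("450 try again later"))  # hypothetical reply text
```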
After receiving the welcome banner, introduce yourself with the HELO command. The hostname after the HELO command should be the name of the system you're connecting from:
HELO mail.oreilly.com
The server replies with a success code, so you may continue:
250 mail.oreilly.com
Indicate who the message is from with the MAIL FROM command:
MAIL FROM:
The server accepts the sending address:
250 Ok
Indicate who the message is to with the RCPT TO command:
RCPT TO:
The server accepts the recipient address:
250 Ok
Now you are ready to send the content of the message. The DATA command tells the server that you have an RFC 2822 message ready to transfer:
DATA
The server replies that it accepts the command and is expecting you to begin sending data:
354 End data with <CR><LF>.<CR><LF>
At this point, you can transfer the entire contents of your message. The contents of messages start with the message headers. When the message itself is finished, indicate the end by sending a single period on a line by itself.
The server acknowledges the end of your message and replies that the transfer was successfully completed:
250 Ok: queued as 5FA26B3DFE
At this point the server has taken responsibility for the message. If you wanted to continue with more commands, you could do so now. Since you have no other messages to deliver to this server, you can start to disconnect with the quit command:
quit
The server replies with a success and disconnects:
221 Bye
Finally, the Telnet client tells you that the connection has ended and returns you to the command prompt:
Connection closed by foreign host.
$
This was, of course, the simplest example of an SMTP transaction. The basic protocol provides additional commands and has been extended to allow for many enhancements. RFC 1869 provides a framework for adding features to the basic SMTP protocol. The enhanced protocol is referred to as ESMTP. A client indicates its willingness to use the enhanced protocol by beginning with the EHLO command instead of HELO. If the server also supports enhancements, it replies with a list of the features it provides.
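A server's reply to EHLO spans several lines: every line but the last has a hyphen after the reply code, and each line after the first names one feature. The Python sketch below turns such a reply into a dictionary. The function name parse_ehlo_reply is hypothetical, and the sample reply is invented for illustration; the features a real server advertises depend on its software and configuration.

```python
def parse_ehlo_reply(lines):
    """Map each advertised ESMTP keyword to its parameter string."""
    features = {}
    # The first line carries the server's greeting, not a feature.
    for line in lines[1:]:
        # Skip the three-digit code and the '-' or ' ' that follows it.
        keyword, _, params = line[4:].partition(" ")
        features[keyword.upper()] = params
    return features


# Invented sample reply to "EHLO mail.oreilly.com".
sample_reply = [
    "250-mail.example.com",
    "250-PIPELINING",
    "250-SIZE 10240000",
    "250 8BITMIME",
]

features = parse_ehlo_reply(sample_reply)
print(features)
```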
Many enhancements have been specified in various RFCs. You can learn about them by searching for SMTP information on the IETF web site (http://www.ietf.org/). There are many other resources available on the Web regarding the SMTP and ESMTP protocols.