Security Technologies for the World Wide Web, Second Edition

11.2    CGI

With Figure 11.1 in mind, CGI refers to the interface between the Web server and the program running on the application server. This application program is usually called a CGI script. [3] Roughly speaking, CGI processing works as follows:

  1. A Web server receives an HTTP request message that invokes a CGI script.

  2. The Web server creates a new server-side process to take care of this request.

  3. The server-side process takes the input provided by the browser and passes it to the appropriate application program or CGI script.

  4. The CGI script computes the output and returns it back to the server-side process.

  5. The server-side process returns the CGI script's output to the client.

  6. The server-side process exits and the Web server waits for new incoming HTTP request messages.

Information is exchanged between the server-side process and the CGI script using environment variables and a mechanism for inter-process communication (e.g., pipes in a UNIX environment). Consequently, a CGI script must be able to read from standard input (i.e., stdin) and write to standard output (i.e., stdout). As long as this requirement is fulfilled, it can be written in any programming or scripting language. In practice, most CGI scripts are written in interpreted scripting languages that are supposed to be fast and easy to use. Examples include Perl, [4] the Tool Command Language (Tcl), Java, or Python. [5] As of this writing, Perl is by far the most popular and widely deployed language for CGI programming or scripting.
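The exchange described above can be sketched from the server's side. The following Python fragment is only an illustration and not the code of any particular Web server: the function name run_cgi and the embedded stand-in script are assumptions made for this example. It starts a child process, hands it request metadata in environment variables, writes the request body to the child's standard input, and collects whatever the child writes to standard output.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for a CGI script: it reads the request body from
# stdin and writes a header block, a blank line, and a body to stdout.
CGI_SCRIPT = r"""
import os, sys
length = int(os.environ.get("CONTENT_LENGTH") or 0)
body = sys.stdin.read(length)
sys.stdout.write("Content-Type: text/plain\n\n")
sys.stdout.write(os.environ["REQUEST_METHOD"] + " " + body)
"""


def run_cgi(method: str, query: str, body: str) -> str:
    """Run the stand-in script as a child process and return its output."""
    # A real server would pass a much smaller, controlled environment; the
    # parent environment is inherited here only so that the child
    # interpreter starts reliably on any platform.
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": method,
        "QUERY_STRING": query,
        "CONTENT_LENGTH": str(len(body.encode())),
    })
    result = subprocess.run(
        [sys.executable, "-c", CGI_SCRIPT],
        input=body,            # delivered to the child through a pipe (stdin)
        env=env,               # request metadata as environment variables
        capture_output=True,   # the child's stdout comes back through a pipe
        text=True,
        check=True,
        timeout=10,
    )
    return result.stdout
```

Calling run_cgi("POST", "", "name=alice") returns the header line, a blank line, and the echoed request, which the server-side process would then relay to the client.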

The most important environment variables used for CGI programming are summarized in Table 11.1. Note that not all environment variables are set for all HTTP request messages, and that a browser may also send new HTTP headers. If a browser sent a new HTTP header to the Web server, the server (or the server-side process) would package the header into a new CGI environment variable. The environment variable, in turn, would be prefixed with "HTTP_", and any dash character (-) would be changed to an underscore character (_). The Web server (or the server-side process) need not handle all possible HTTP headers.
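The naming rule can be written down in a few lines. The sketch below uses a hypothetical helper name, cgi_variable_name, and also converts the header name to upper case, as the entries HTTP_ACCEPT and HTTP_USER_AGENT in Table 11.1 suggest.

```python
def cgi_variable_name(header_name: str) -> str:
    """Map an HTTP header name to the corresponding CGI environment variable."""
    # Upper-case the name, replace dashes with underscores, prefix with HTTP_.
    return "HTTP_" + header_name.upper().replace("-", "_")


print(cgi_variable_name("User-Agent"))
```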

Table 11.1: CGI Environment Variables (in Alphabetical Order)

Environment Variable   Meaning
--------------------   -------
AUTH_TYPE              User authentication method used
CONTENT_LENGTH         Length of input data
CONTENT_TYPE           Internet media type of input data
GATEWAY_INTERFACE      CGI version
HTTP_ACCEPT            List of MIME types accepted by the client
HTTP_USER_AGENT        Software and version of browser
HTTP_REFERER           URL of referring document
MOD_PERL               Defined if running under mod_perl
PATH_INFO              URL part after the script identifier
PATH_TRANSLATED        PATH_INFO translated into filesystem
QUERY_STRING           Query string from URL (if present)
REMOTE_ADDR            IP address of the client
REMOTE_HOST            DNS name of the client
REMOTE_IDENT           Remote user identification (unreliable)
REMOTE_USER            Name of the authenticated user
REQUEST_METHOD         HTTP request method (e.g., GET)
SCRIPT_NAME            Virtual path of the script
SERVER_NAME            DNS name of the server
SERVER_PORT            Port number of the server
SERVER_PROTOCOL        Name and version of the protocol
SERVER_SOFTWARE        Server software name and version

In addition to the environment variables summarized in Table 11.1, some SSL/TLS-enabled Web servers also set additional environment variables when SSL or TLS is used. For example, Table 11.2 summarizes the additional environment variables set by an SSL/TLS-enabled Apache Web server (i.e., Apache-SSL or Apache with mod_ssl). Other SSL/TLS-enabled Web servers may set other environment variables. In either case, the environment variables may be used by the CGI scripts to provide security services. For example, a CGI script that provides access to a database with confidential material may abort unless a certain type of cipher suite is used.
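A check of this kind might look as follows. This is a sketch under stated assumptions: the function name connection_is_strong and the 128-bit threshold are chosen for illustration, the variable names are those listed in Table 11.2 for Apache, and a value of "off" for HTTPS is treated as "not set" because some servers report it that way.

```python
def connection_is_strong(environ, min_secret_bits=128):
    """Return True only if the request arrived over SSL/TLS with a long enough key."""
    # HTTPS is set when SSL/TLS is in use; treat "off" or a missing value as plain HTTP.
    if environ.get("HTTPS", "").lower() in ("", "off"):
        return False
    try:
        bits = int(environ.get("HTTPS_SECRETKEYSIZE", "0"))
    except ValueError:
        # An unparsable key size is treated as a failed check.
        return False
    return bits >= min_secret_bits
```

A CGI script could call connection_is_strong(os.environ) before touching confidential data and return an error page if the result is False.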

Table 11.2: Some Additional Environment Variables for SSL/TLS (in Alphabetical Order)

Environment Variable   Meaning
--------------------   -------
HTTPS                  Set if HTTPS is being used
HTTPS_CIPHER           SSL/TLS cipherspec
HTTPS_KEYSIZE          Number of bits in the session key
HTTPS_SECRETKEYSIZE    Number of bits in the secret key
SSL_CIPHER             The same as HTTPS_CIPHER
SSL_CLIENT_DN          Distinguished name in client's certificate
SSL_CLIENT_<x509>      Component of client's distinguished name
SSL_CLIENT_I_DN        Distinguished name of issuer of client's certificate
SSL_CLIENT_I_<x509>    Component of client's issuer's distinguished name
SSL_PROTOCOL_VERSION   SSL protocol version
SSL_SERVER_DN          Distinguished name in server's certificate
SSL_SERVER_<x509>      Component of server's distinguished name
SSL_SERVER_I_DN        Distinguished name of issuer of server's certificate
SSL_SERVER_I_<x509>    Component of server's issuer's distinguished name
SSL_SSLEAY_VERSION     Version of the SSLeay library

According to Table 11.1, the server-side process running on the Web server may provide to the CGI script some information that is encoded in the QUERY_STRING environment variable. This information is usually provided by the user and is the user's sole means for passing input data to the CGI script. It may contain, for example, a list of keywords for a search engine or an SQL expression for use by a database gateway.

In either case, a browser may send a query string to a Web server (or CGI script, respectively) in two different ways:

  1. With the HTTP GET method, the query string is appended to the requested URL (after a question mark) and reaches the CGI script in the QUERY_STRING environment variable.

  2. With the HTTP POST method, the query string is sent in the body of the HTTP request message and reaches the CGI script through standard input.

From a security point of view, the HTTP POST method is preferred because the query string does not appear in the requested URL. Note, however, that a determined attacker can still eavesdrop on the data traffic and extract any information he or she wants.
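From the script's point of view, the difference between the two methods is where the query string has to be read from. The following sketch assumes a hypothetical helper named read_form_data that receives the environment as a dictionary and standard input as a text stream; a production script would read bytes and handle malformed input more carefully.

```python
import io
from urllib.parse import parse_qs


def read_form_data(environ, stdin):
    """Return the submitted fields as a dict that maps names to lists of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the query string travels in the request body (standard input),
        # and CONTENT_LENGTH says how much of it to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the query string is part of the URL and arrives in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    request = {"REQUEST_METHOD": "GET", "QUERY_STRING": "q=cgi+security"}
    print(read_form_data(request, io.StringIO("")))
```

Either way the data is equally visible to anyone who can read the unencrypted traffic; POST only keeps it out of the URL and hence out of places where URLs are recorded.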

In addition, there are many concerns related to the security of CGI scripts. For example, many CGI scripts that were distributed with Web server software packages in the past were later found to be flawed or buggy. The corresponding security flaws or software bugs could be exploited to attack the machines that hosted the CGI scripts. Fortunately, this problem is no longer relevant, because most Web server software packages are distributed either without CGI scripts or with CGI scripts that are not executable by default (i.e., they are configured with read privileges only). In either case, if a CGI script is found to be flawed or buggy, it must be removed from the Web server as soon as possible (it can also be corrected or replaced with a more secure script that provides the same or similar functionality).

The administrator of a Web server has to make several decisions with regard to the installation and secure configuration of CGI scripts:

In either case, interpreters, shells, and other scripting engines must never be installed in a directory where they may be invoked by a request with user-supplied input data. This is particularly true for the directory that hosts the CGI scripts (i.e., the cgi-bin directory). Unfortunately, there are examples in which software vendors have shipped Web servers with a Perl interpreter installed in the CGI directory (mainly to make it simpler to install and configure CGI scripts written in Perl). This is very dangerous. Imagine, for example, what happens if a Perl interpreter perl.exe and a Perl script search.pl are installed in the CGI directory of the Web site www.victim.com. In this case, any user can invoke the script by simply requesting the following URL:

http://www.victim.com/cgi-bin/perl.exe?search.pl

This is convenient. This configuration, however, not only allows the Perl script search.pl to be executed, but also allows arbitrary Perl commands to be run on the Web server. For example, anybody can request the following URL from the Web server:

http://www.victim.com/cgi-bin/perl.exe?-e+%27unlink+%3C*%3E%27

Following the rules for unescaping URLs, the Web server transforms this expression into the shell command perl -e 'unlink <*>', which represents a Perl command to delete all files in the current directory. Whether the command is successful depends on whether the server's user permissions allow it to perform the delete operations.
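The decoding step is easy to reproduce. The short Python sketch below applies the standard library function urllib.parse.unquote_plus to the query string of the URL above; it only decodes the string and does not execute anything. A real server may split and decode the arguments somewhat differently, so this is an approximation of the transformation.

```python
from urllib.parse import unquote_plus

# Query string taken from the example URL (the part after the question mark).
query = "-e+%27unlink+%3C*%3E%27"

# "+" becomes a space, %27 a single quote, %3C "<", and %3E ">".
decoded = unquote_plus(query)
print("perl " + decoded)
```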

In practice, many security problems occur simply because Web server administrators and CGI script programmers assume that users behave properly and play by the rules. This means that they often assume that users type in only valid input data, that file names contain only legal characters, that users don't peek at secret CGI parameters contained inside hidden form fields, and similar things. There are, however, many ways in which users may not play by the rules and try to exploit weaknesses or vulnerabilities. An example is given above. Another example crops up in Perl scripts designed to send an e-mail message to an address entered in a fill-out form. In UNIX, it's comparatively easy to do this by opening a pipe to the mail command and printing the body of the e-mail message to this pipe. Assuming that param is a function that extracts named fields from the CGI query string, a Perl script segment may look as follows (the example is taken from [1]):

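# Deliberately unsafe example: user input is passed to the shell unchecked.
$address = param('address');
$subject = param('subject');
$message = param('message');
# Open a pipe to the mail command; the leading "|" makes this a pipe.
open(MAIL, "| /bin/mail -s $subject $address");
print MAIL $message;
close MAIL;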

The script segment first uses param to recover the e-mail address, subject line, and body of the message. It then opens a pipe to the mail command, using the -s flag to specify a subject line and passing the recipient's e-mail address on the command line. The script prints the body of the message to the pipe and closes it. When the pipe is closed, the mail command delivers the message. The script is intended to be called from a fill-out form that may look as follows:

<FORM ACTION="/cgi-bin/handle_mail" METHOD=POST>
To: <INPUT TYPE="text" NAME="address"> <P>
Subject: <INPUT TYPE="text" NAME="subject"> <P>
Message: <TEXTAREA NAME="message" ROWS=5></TEXTAREA> <P>
<INPUT TYPE="submit" VALUE="Send Mail">
</FORM>

If the user typed rolf.oppliger@esecurity.ch into the "To:" field, and Test into the "Subject:" field, the CGI script would run the following command:

/bin/mail -s Test rolf.oppliger@esecurity.ch

In this case, everything works as anticipated and the e-mail message is sent to rolf.oppliger@esecurity.ch. Unfortunately, the script has a problem: it blindly trusts that the e-mail address and subject line supplied by the user are valid. Now consider what happens when a malicious user types the string rolf.oppliger@esecurity.ch; cat /etc/passwd into the e-mail address field. In this case, the shell command the script now executes looks as follows: [7]

/bin/mail -s Test rolf.oppliger@esecurity.ch; cat /etc/passwd

The effect of this is to run the anticipated mail command and then execute cat /etc/passwd. This command prints the content of the password file to standard output, which is transferred to the requesting browser. Of course, there's no reason that the same or a similar technique couldn't be used to read the contents of any file on the server host, including HTML documents that are normally protected by access control mechanisms and encrypted in transit through the SSL or TLS protocol. In fact, variants of this exploit can be used to do many (malicious) things on the Web server. Consequently, the most important thing to do from a security point of view is to validate user-supplied input data, and to perform some pattern-matching checks accordingly. If something suspicious is found, the input data must be modified or refused.
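One way to apply this advice is sketched below in Python. It is an illustration under stated assumptions rather than a complete solution: the function names build_mail_command and send_mail are hypothetical, and the patterns are deliberately strict allow-lists chosen for this example, so some legitimate addresses would be refused. Besides the pattern-matching checks, the sketch passes the arguments to the mail command as a list, so no shell is involved and a semicolon has no special meaning.

```python
import re
import subprocess

# Allow-list patterns (assumptions for this example, stricter than the mail standards).
ADDRESS_RE = re.compile(
    r"[A-Za-z0-9][A-Za-z0-9._+-]*@[A-Za-z0-9][A-Za-z0-9.-]*\.[A-Za-z]{2,}"
)
SUBJECT_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9 .,:!?'()-]{0,99}")


def build_mail_command(address, subject):
    """Validate the user-supplied fields and return an argument list for mail."""
    if not ADDRESS_RE.fullmatch(address):
        raise ValueError("suspicious e-mail address refused")
    if not SUBJECT_RE.fullmatch(subject):
        raise ValueError("suspicious subject line refused")
    # A list of arguments is handed to the program directly, without a shell.
    return ["/bin/mail", "-s", subject, address]


def send_mail(address, subject, message):
    """Send the message by writing its body to the standard input of mail."""
    command = build_mail_command(address, subject)
    subprocess.run(command, input=message, text=True, check=True, timeout=30)
```

With these checks, the address rolf.oppliger@esecurity.ch is accepted, whereas the string rolf.oppliger@esecurity.ch; cat /etc/passwd raises a ValueError before any command is run.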

Simson Garfinkel and Eugene H. Spafford compiled a list of general principles and rules for safe CGI programming [5]. The principles and rules are summarized in Table 11.3; they should be kept in mind when designing and implementing CGI scripts. In the same book, the authors also provide rules for C, Perl, and Hypertext Preprocessor (PHP) programmers. These rules are not summarized here.

Table 11.3: General Principles and Rules for Safe CGI Programming*

No.  Principle or Rule
---  -----------------
1    Carefully design the program before you start.
2    Show the specification to another person.
3    Write and test small sections at a time.
4    Check all values provided by the user.
5    Check arguments that you pass to operating system functions.
6    Check all return codes from system calls.
7    Have internal consistency-checking code.
8    Include lots of logging.
9    Some information should not be logged.
10   Make the critical portion of your program as small and as simple as possible.
11   Read through your code.
12   Always use full pathnames for any filename argument, for both commands and data files.
13   Rather than depending on the current directory, set it yourself.
14   Test your completed program thoroughly.
15   Be aware of race conditions.
16   Don't have your program dump core except during your testing.
17   Do not create files in world-writable directories.
18   Don't place undue reliance on the source IP address in the packets of connections you receive.
19   Include some form of load shedding or load limiting in your server to handle cases of excessive load.
20   Put reasonable time-outs on the real time used by your CGI script while it is running.
21   Put reasonable limits on the CPU time used by your CGI script while it is running.
22   Do not require the user to send a reusable password in plaintext over the network connection to authenticate herself.
23   Have your code reviewed by another competent programmer (or two, or more).
24   Whenever possible, reuse code.

24

Whenever possible, reuse code.

*According to [5].

Last but not least, it is important to note that on some platforms and systems a wrapper may be used to run CGI scripts more securely. Historically, the term wrapper was first coined by Wietse Venema for a tool he named TCP wrapper. [8] The tool is heavily used on UNIX platforms. It provides some level of access control based on the source and destination of a TCP connection request, as well as logging for successful and unsuccessful connections. More specifically, the TCP wrapper starts a filter program before the requested server process is started, assuming that the connection request is permitted by the access control lists. All messages about connections and connection attempts are logged via the syslog daemon (i.e., syslogd).

Similar to the TCP wrapper, a wrapper may be used to run another program (e.g., a CGI script) more securely. The execution of the other program can be made more secure because the wrapper can be configured in a way that fully controls it and changes its permissions accordingly. For example, the suEXEC wrapper can be used on UNIX systems running the Apache Web server (since version 1.2). The wrapper provides the ability to run CGI scripts under user IDs different from the user ID of the calling Web server (normally, when a CGI script executes, it runs as the same user who is running the Web server). Further information about the suEXEC wrapper is available at http://httpd.apache.org/docs/suexec.html. Also, its installation and configuration are further addressed in [4].
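The general idea of a wrapper can be illustrated with a few lines of Python. The sketch below is not suEXEC and does not switch user IDs (which would require special privileges); it only shows how a wrapper can control what the wrapped program sees. The names SAFE_VARIABLES, filtered_environment, and run_wrapped, the list of variables, and the fixed search path are assumptions made for this example.

```python
import subprocess

# CGI variables the wrapped script is allowed to see (an illustrative selection).
SAFE_VARIABLES = {
    "CONTENT_LENGTH", "CONTENT_TYPE", "GATEWAY_INTERFACE", "QUERY_STRING",
    "REMOTE_ADDR", "REQUEST_METHOD", "SCRIPT_NAME", "SERVER_NAME",
    "SERVER_PORT", "SERVER_PROTOCOL",
}


def filtered_environment(environ):
    """Keep only allow-listed CGI variables and set a fixed search path."""
    env = {
        name: value
        for name, value in environ.items()
        if name in SAFE_VARIABLES or name.startswith("HTTP_")
    }
    env["PATH"] = "/usr/bin:/bin"
    return env


def run_wrapped(command, environ, workdir, body=b"", timeout_seconds=10):
    """Run the wrapped program with a cleaned environment, a working
    directory chosen by the wrapper, and a limit on its running time."""
    result = subprocess.run(
        command,
        input=body,
        env=filtered_environment(environ),
        cwd=workdir,
        capture_output=True,
        timeout=timeout_seconds,
        check=True,
    )
    return result.stdout
```

Setting the working directory and bounding the running time also correspond to rules 13 and 20 in Table 11.3.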

[3] The term script is used because most of these programs are written in a simple scripting language, such as Perl.

[4] http://www.perl.com

[5] http://www.python.org

[6] Note that this URL is a fictitious example only.

[7] On UNIX systems, the semicolon is a metacharacter used to separate multiple commands.

[8] The tool can be downloaded from ftp://ftp.porcupine.org/pub/security.
