The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities

Now that you understand the basic review strategies, this section introduces some general guidelines for reviewing code. These guidelines aren't hard-and-fast rules; rather, they are invaluable techniques and tricks developed through years of experience. These techniques help to ensure thorough coverage and understanding of even the most subtle vulnerabilities. After all, it's easy to make mistakes and skip a line or two when assessing a massive codebase. Unfortunately, one or two lines can be the difference between safe code and vulnerable code. However, by carefully applying the strategies discussed earlier along with the following simple tactics, your effectiveness should improve quickly.

Internal Flow Analysis

In the previous discussion on code flow, the strategies addressed intermodule and interprocedural relationships. This code flow analysis is good for navigating between functions, but when analyzing a code fragment, you need to perform intraprocedural and intramodule analysis. These types of analysis require being sensitive to both control flow and data flow within a function, regardless of how you handle tracing outside the function. To see how this analysis works, walk through a fairly simple code path in the following C function:

char *ReadString(int fd, int maxlength)
{
    int length;
    char *data;

    if(read_integer(fd, &length) < 0)
        return NULL;

    data = (char *)malloc(length + 1);

    if(data == NULL)
        return NULL;

    if(read(fd, data, length) < 0)
    {
        free(data);
        return NULL;
    }

    data[length] = '\0';

    return data;
}

This function simply reads a variable-length string from network input and returns a pointer to it. It does this by reading an integer value representing the length, and then reading a number of bytes equal to that value. However, even this simple function has several potential code paths to examine. First, say read_integer() fails. The code that runs would then look like this:

read_integer(fd, &length);
return NULL;

Not much happens here, so look at where the call to read() fails instead:

read_integer(fd, &length);
data = malloc(length + 1);
read(fd, data, length);
free(data);
return NULL;

As you can see, there's a major difference between handling a failure in read_integer() and one in read(). This simple example shows how subtle changes can drastically affect a code path in a way that's not obvious. Functions in real-world applications are usually more complicated and contain many code paths. When examining a function you've identified and traversing the relevant code paths, minimizing your chances of missing vulnerabilities is important. Many code paths share common sections, so analyzing all the relevant ones isn't quite as much work as it seems. Also, you can usually handle reading several code paths at once. For example, reading the previous function, you can safely ignore most of the error-checking failures as not being relevant to security. However, be careful when you make the distinction between what is and isn't security relevant. Reviewers tend to overlook code paths containing serious vulnerabilities in these two areas: error-checking branches and pathological code paths.

Error-checking branches are the code paths that are followed when validity checks result in an error. They include the two paths shown in the preceding examples and typically cause a return from a function or exit from the program. In the examples, these simple code paths could be dismissed easily, but remember that they are still code paths. Even if triggering the error seems unlikely, it's important to see what happens when the error does occur because the error-handling code belongs to a code path that's hardly ever traversed and probably not as well tested and audited. This topic is discussed more in Chapter 7, "Program Building Blocks."
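To make the concern concrete, here's a hypothetical sketch (the function name and its contract are invented): the error branch taken on a short read is exactly the kind of path that's rarely exercised, and a missing free() or return here would be easy to overlook.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical routine: the interesting part is the error branch taken
   when read() comes up short, a path most tests never hit. */
char *load_record(int fd, size_t len)
{
    char *buf = malloc(len);

    if(buf == NULL)
        return NULL;

    if(read(fd, buf, len) != (ssize_t)len){
        free(buf);          /* forgetting this leaks on every short read */
        return NULL;        /* forgetting this hands callers freed memory */
    }

    return buf;
}
```

When auditing, you would step into this branch deliberately, asking what state the caller sees after it runs.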

Pathological code paths describe functions with many small and nonterminating branches (that is, branches that don't result in abrupt termination of the current function). These functions create an exponential number of similar code paths and can be extremely difficult to trace. Going through these functions several times and examining each code path in isolation is a good idea, as some paths can be triggered by unexpected conditions. That is, it's possible to trigger paths that make no sense logically but aren't prohibited by the implementation.
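The shape of such functions can be sketched like this (a hypothetical example): each branch falls through instead of terminating, so n branches yield 2^n distinct paths, including combinations the author may never have considered.

```c
/* Hypothetical: three independent, nonterminating branches give
   2^3 = 8 distinct code paths through this small function alone. */
int process_flags(int flags)
{
    int value = 0;

    if(flags & 0x1)      /* branch 1: falls through either way */
        value += 1;

    if(flags & 0x2)      /* branch 2 */
        value *= 2;

    if(flags & 0x4)      /* branch 3 */
        value -= 3;

    return value;        /* every combination of branches reaches here */
}
```

A real parser might have dozens of such branches, which is why examining each in isolation, several times over, pays off.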

Subsystem and Dependency Analysis

A common misconception is that security code review should be targeted at modules that deal directly with user input from a specified entry point. Although this approach sounds reasonable, it could fail to account for all possible control flows and data flows affected by the input. First, the application design might not allow easy separation of the entry point and data parsing from the rest of the codebase. For instance, the relevant data-parsing module might depend on several other system components. Second, the application might not be especially modular in its implementation. Both reasons result in the same problem: you can't just pick relevant code paths and examine them without much knowledge of the rest of the application. Therefore, you need to make an early effort to identify module subsystems and dependencies and familiarize yourself with their behavior.

For example, large applications commonly use their own memory allocation subsystems. These allocators might be wrappers to system memory allocators or complete replacements, which fall back on the system allocator only when requesting large blocks the application manages (the Apache Web server manages its memory in a similar manner). Any variance between the system allocator's and the custom allocator's behavior might be important, as you see later in Chapter 7.
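The pattern might be sketched as follows (the pool design and all names here are hypothetical, not Apache's actual allocator); note the small behavioral differences from malloc() that an auditor would want to catalog:

```c
#include <stdlib.h>

#define POOL_SIZE      65536
#define POOL_THRESHOLD 4096

static unsigned char pool[POOL_SIZE];
static size_t pool_used;

/* bump allocator for small requests; never reclaims memory */
static void *pool_take(size_t n)
{
    void *p;

    if(pool_used + n > POOL_SIZE)
        return NULL;

    p = pool + pool_used;
    pool_used += n;
    return p;
}

void *app_alloc(size_t n)
{
    if(n == 0)
        n = 1;                   /* quirk: zero-size requests promoted to 1 */

    if(n < POOL_THRESHOLD)
        return pool_take(n);     /* small blocks: application-managed pool */

    return malloc(n);            /* large blocks: fall back to the system */
}
```

Here small allocations sit adjacent in a shared pool (so a one-byte overflow corrupts a neighboring object rather than heap metadata), and free() must never be called on them; any such variance from the system allocator is worth noting for later.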

In addition to allocators, you might need to review a variety of common subsystems more thoroughly, including the following:

  • String and binary data buffer handlers

  • String parsers

  • System API replacements (such as file manipulation APIs and network APIs)

  • Data storage subsystems (hash table classes, for example)

You also need to be familiar with the quirks of any standard system functionality in use. Later chapters cover these issues for both Windows and UNIX operating systems. However, many less used functions aren't mentioned. When you encounter system functions you don't know, learn exactly how that function works. After all, such functions can often provide you with new security relevant quirks to look for in the future.

Rereading Code

Even the simple act of reading tends to be an iterative process. Often you need to read the same code paths several times over to account for all the vulnerability classes you need to consider. For example, one approach is to focus on integer-related vulnerabilities, memory management vulnerabilities, and formatted data vulnerabilities in one pass. Then you make another pass to focus on functional audits (checking return values, error prone API calls, and so on). Finally, you could make a pass to identify any synchronization vulnerabilities.

There's no metric to determine how many passes a piece of code requires. For example, you don't need to consider synchronization vulnerabilities if the code doesn't run in a multithreaded context, deal with asynchronous events, or modify shared data. Exercise your own judgment in determining how many passes to make; however, at least two passes are recommended because with only one pass, you might miss subtle complexities in the code or make an obvious oversight.

Especially complex code can be difficult to wrap your brain around, so you might need several passes to understand what it's doing. Even after reaching a thorough understanding, it's a good idea to go back later and check that your comprehension of the code is correct as well as complete. Security vulnerabilities usually exist because of oversights in seemingly minor details that have a major impact on the code. You need to keep asking questions about even simple-looking code. Are global variables or structure members altered? Are return values or arguments not always initialized? Are return values ignored or misinterpreted because of typing errors or incorrect calls? These questions are just a few things you need to consider for each function you examine. The best way to make sure you cover all your bases is to evaluate some code and then go back and make sure you didn't miss anything. Even Santa has to check his list twice!

Desk-Checking

Sometimes you see code that's difficult to evaluate in your head. The code might have too many variables coming from different places and being reassigned, or peculiar code constructs with side effects that aren't obvious. In these cases, desk-checking is recommended. Desk-checking is a technique consisting of creating a table of all variables in a code fragment and then populating them with some initial values. They should be values that you think the code might not handle correctly (such as those gained from test cases, explained in the next section). Then you step through each line of the function, updating each value according to the code. To see how this technique works, first look at this simple code:

int read_line(int sock, char *buf, size_t length)
{
    int i, c = 0, n;

    for(i = 0; ; i++){
        n = read(sock, (void *)&c, 1);

        if(n != 1)
            return -1;

        if(c == '\n')
            break;

        if(i < length)
            buf[i] = c;
    }

    buf[i] = '\0';

    return 0;
}

This code isn't hard to understand just by looking at it, but it's fine for demonstration purposes.

Note

If you think the code is hard to understand, don't worry. After a little practice, you'll probably recognize constructs such as this one more readily and be able to understand the code more easily.

The function is supposed to read a line from a socket. It puts the bytes it reads into the buf parameter while the buffer isn't full, and then silently discards extraneous data at the end of the line, thus returning at most a buffer of length bytes. Say you aren't too sure about evaluating whether this piece of code is secure and want to verify your thoughts. You can do a desk-check of the function with a test case you expect to be faulty. In this case, you want to see whether buf overflows when you supply a long line, so you use the following test data:

buf = 4 byte buffer
length = 4
line being read = "ABCDEF\n"

The desk-check of this function is shown in Table 4-19.

Table 4-19. Desk-Check of Algorithm

Statement                      i    buf             c
for(i = 0;                     0    -               -
n = read(sock, &c, 1);         0    -               A
if(i < length) buf[i] = c;     0    buf[0] = 'A'    A
i++;                           1    -               A
n = read(sock, &c, 1);         1    -               B
if(i < length) buf[i] = c;     1    buf[1] = 'B'    B
i++;                           2    -               B
n = read(sock, &c, 1);         2    -               C
if(i < length) buf[i] = c;     2    buf[2] = 'C'    C
i++;                           3    -               C
n = read(sock, &c, 1);         3    -               D
if(i < length) buf[i] = c;     3    buf[3] = 'D'    D
i++;                           4    -               D
n = read(sock, &c, 1);         4    -               E
if(i < length) buf[i] = c;     4    -               E
i++;                           5    -               E
n = read(sock, &c, 1);         5    -               F
if(i < length) buf[i] = c;     5    -               F
i++;                           6    -               F
n = read(sock, &c, 1);         6    -               \n
if(c == '\n') break;           6    -               \n
buf[i] = '\0'                  6    buf[6] = '\0'   \n

The desk-check shows that the function does read at most length bytes into the buffer supplied and then silently discard data afterward; however, a glitch is still apparent in the last two lines of this desk-check. Can you see it? The NUL byte to terminate the buffer is appended at an out-of-bounds location depending on how big the supplied line is because the i variable is used incorrectly as an index for the NUL termination. Any desk-check you do should roughly follow the format shown in the table, with statements being executed on one side and columns for the state of each relevant variable when the statement has been executed. Some statements in the code were omitted for brevity when they didn't affect the test case.
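One possible repair, shown here as a sketch rather than the book's code: reserve a byte for the terminator and clamp the index used for the NUL write so it can never leave the length-byte buffer (long lines lose one more data byte to truncation as a result).

```c
#include <sys/types.h>
#include <unistd.h>

int read_line_fixed(int sock, char *buf, size_t length)
{
    unsigned char c;
    size_t i;
    ssize_t n;

    if(length == 0)
        return -1;

    for(i = 0; ; i++){
        n = read(sock, &c, 1);

        if(n != 1)
            return -1;

        if(c == '\n')
            break;

        if(i < length - 1)            /* reserve one byte for the NUL */
            buf[i] = c;
    }

    buf[i < length ? i : length - 1] = '\0';   /* index clamped in bounds */
    return 0;
}
```

A desk-check of this version with the same "ABCDEF\n" test data shows the terminator landing at buf[3] instead of buf[6].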

As you can see, desk-checks can be a useful tool because they provide insight into how the algorithm operates. They can help you catch vulnerabilities that are easy to miss because they seem fine at first glance. However, desk-checks can be cumbersome, especially when your test cases are complicated and involve a lot of variables. Still, they are a necessary part of data validation, and you should use them whenever you're unsure of code you're reading. Using your own shorthand versions of desk-checking tables after you're familiar with them can be convenient. For example, you don't have to write the statements in the far-left column if you can keep track of them adequately.

Test Cases

Test cases are used for testing a program or small isolated part of code to see how it handles certain inputs. Test cases can be carried out in a number of different ways: writing software to interact with the program and supply the test data, entering values manually into a program using a debugger, or using desk-checking. The purpose of test cases is to determine whether the program handles certain inputs correctly or certain combinations of inputs. Checking every possible combination of inputs usually isn't feasible, so you need to choose cases that are the most useful. Often this means boundary cases, in which inputs are unexpected or are treated specially by the code in question. For example, say you have a function with this prototype:

int Connection::ConnectionRead(int len);

You want to test how well this function copes with unexpected input. To do this, you need to identify ranges of values input variables can take and choose values from those ranges to test the function. Some test cases might include the following:

  • Calling the ConnectionRead() function with len = small negative value (-1, for example)

  • Calling the ConnectionRead() function with len = large negative value (0x80000000, for example)

  • Calling the ConnectionRead() function with len = 0

  • Calling the ConnectionRead() function with len = small positive value (10, for example)

  • Calling the ConnectionRead() function with len = large positive value (0x7FFFFFFF, for example)

The test cases have been classified based on the range of values len can take: positive, negative, or 0.
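A harness built on these ranges might look like the following sketch; handle_len() is a hypothetical stand-in for ConnectionRead(), since calling the real method requires a Connection object.

```c
#include <limits.h>

/* hypothetical stand-in for ConnectionRead(): reject nonpositive lengths */
static int handle_len(int len)
{
    return (len <= 0) ? -1 : 0;
}

/* run one value from each range: small/large negative, zero,
   small/large positive; returns the number of rejected cases */
int run_boundary_cases(void)
{
    int cases[] = { -1, INT_MIN, 0, 10, INT_MAX };
    int i, rejected = 0;

    for(i = 0; i < (int)(sizeof(cases) / sizeof(cases[0])); i++)
        if(handle_len(cases[i]) == -1)
            rejected++;

    return rejected;
}
```

The point of the loop is coverage per range, not volume: one well-chosen value from each range exercises the same branches as thousands of arbitrary ones.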

Note

You have two tests for positive and negative values because you're testing values close to the boundary conditions that constrain integers. These constraints are discussed in depth in Chapter 6.

By using carefully chosen values from each range of possible values the input can take (in this case, positive, negative, or 0), you get the best value from your tests because you're covering both expected and unexpected cases with the fewest tests possible. After further inspection of the code, it might be more apparent that certain values seem like they're going to cause major problems, so you might add those values to your test cases later. For example, examine the function a little further:

class Connection {
private:
    int sock;
    Buffer network_data;
    ...
};

int Connection::ConnectionRead(int len)
{
    int n;

    if(network_data.GrowBuffer(len) == 0)
        return -1;

    n = ::read(sock, network_data.BufferEnd(), len);

    return n;
}

class Buffer {
private:
    unsigned char *data;
    size_t data_size, data_used;
    ...
};

#define EXTRA 1024

int Buffer::GrowBuffer(size_t length)
{
    size_t new_size;
    unsigned char *new_data;

    if(data_size - data_used >= length)
        return 1;

    new_size = length + data_used + EXTRA;

    if(new_size < length)    // check for integer overflow
        return 0;

    new_data = (unsigned char *)myrealloc(data, new_size);

    if(new_data == NULL)
        return 0;

    data_size = new_size;
    data = new_data;

    return 1;
}

void *myrealloc(void *data, size_t new_size)
{
    void *block;

    new_size = (new_size + 15) & 0xFFFFFFF0;

    block = realloc(data, new_size);

    return block;
}

This fairly complicated code path has a subtle vulnerability. Specifically, an integer overflow can occur in myrealloc() when rounding up new_size (the statement that adds 15 and masks with 0xFFFFFFF0), but because of an integer overflow check in GrowBuffer(), only a select few values trigger the vulnerability. (Again, if the vulnerability isn't clear to you, don't worry. Integer overflows are covered in more detail in Chapter 6.) The exact value of len being passed to ConnectionRead() (or any function that calls the GrowBuffer() function) to trigger the integer overflow depends on what the data_used value is. If you assume it's 0, the previous test cases don't trigger the integer overflow because of the following code snippet from GrowBuffer():

new_size = length + data_used + EXTRA;

if(new_size < length)    // check for integer overflow
    return 0;

With the test case of len = -1, the EXTRA added to new_size causes an integer overflow that this check catches; with the large negative value test case, new_size doesn't overflow, and realloc() simply fails. To trigger the bug (assuming data_used = 0), you need to add a test case of something like len = 0xFFFFFBFF (the maximum representable integer with 1024 subtracted from it). The initial range of test cases you come up with needs to be tailored to the code you're examining to make sure you catch all the artificially created boundary cases occurring in the way the code works, as well as the logical boundary cases you originally devised.
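You can verify the arithmetic behind that trigger value with a quick sketch, using 32-bit unsigned math to mirror a 32-bit size_t (data_used is assumed to be 0, and the helper names are invented):

```c
#include <stdint.h>

/* the size GrowBuffer() computes: length + data_used + EXTRA (1024) */
uint32_t grow_buffer_size(uint32_t length, uint32_t data_used)
{
    return length + data_used + 1024;
}

/* the rounding myrealloc() applies before calling realloc() */
uint32_t myrealloc_rounded(uint32_t new_size)
{
    return (new_size + 15) & 0xFFFFFFF0u;
}
```

With length = 0xFFFFFBFF and data_used = 0, grow_buffer_size() returns 0xFFFFFFFF, which is not less than length, so GrowBuffer()'s check passes; yet myrealloc_rounded(0xFFFFFFFF) wraps around to 0, producing a zero-byte allocation.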

Test Cases with Multiple Inputs

The previous example brings up an interesting point dealing with multiple inputs. Before you examined the code in some depth, you cared about only one input as far as test cases were concerned: the len variable passed to ConnectionRead(). However, in the real world, often you deal with multiple inputs to functions. The problem is that having multiple inputs multiplies the number of test cases you need, as shown in this formula:

tests = (set of cases) ^ (number of inputs)

The number of test cases can increase quickly. Furthermore, additional test cases might surface; often variables that are multiple inputs to a function are related in some way, a concept called "variable relationships" (discussed in Chapter 7). Essentially, a lot of variables and inputs in a module are given meaning by how they relate to other variables, so you might need to establish test cases to deal with boundary cases for a relationship, in addition to boundary cases for variables in isolation. The code you looked at previously is an example of such a test case; you must test the boundary case for the relationship between len and data_used because both those values must operate together to trigger the potential vulnerability.

When building test cases for a function or code module, it's up to you to identify these relationships to make sure you have a complete set of test cases. The more you perform test cases, the more quickly you can identify the problem cases from looking at code, which speeds up the process. However, it's worth the time to work through all potential scenarios and verify whether the code handles them correctly. Spotting problems automatically isn't as thorough, and you might miss a case or two. In addition, the number of boundary conditions you have doesn't necessarily correspond to the number of inputs you supply to a code module because some variables take values indirectly from the input (such as data_used, presumably).

Say you have a large number of test cases and you want to get rid of some, if possible. How do you do that while ensuring you're testing all the necessary boundary conditions you want to verify? There are two ways to go about cutting out extraneous test cases: constraint establishment and extraneous input thinning, explained in the following sections.

Treat Input as Hostile

Often you encounter code that is dangerous because the developer thinks that certain externally supplied variables are safe and trusts their content implicitly. This approach is dangerous for several reasons:

  • A code path might exist that's not accounted for, so less stringent input sanitation is done; therefore, the vulnerable code can be reached with variables in an unexpected state.

  • A new code path might be introduced in the future in which less stringent input sanitation is done; therefore, the vulnerable code can be reached with variables in an unexpected state.

  • The input sanitation might not work as effectively as the developer expects because of a logic or implementation error, so the vulnerable code can be reached with variables in an unexpected state.

In general, you should be wary of input data from other modules. You don't need to assume the same level of danger as completely external input, but you should still be a bit suspicious of it. After all, it's just good practice for the developer to perform some internal consistency checking, especially in a general purpose library.
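What that internal consistency checking might look like in practice (a sketch; the routine and its contract are hypothetical):

```c
#include <stddef.h>
#include <string.h>

/* append n bytes to dst; even "trusted" intra-module callers are
   checked before any arithmetic is done with their values */
int buffer_append(unsigned char *dst, size_t dst_size, size_t *used,
                  const unsigned char *src, size_t n)
{
    if(dst == NULL || src == NULL || used == NULL || *used > dst_size)
        return -1;               /* reject inconsistent caller state */

    if(n > dst_size - *used)     /* overflow-safe space check */
        return -1;

    memcpy(dst + *used, src, n);
    *used += n;
    return 0;
}
```

The space check is deliberately written as a subtraction on the larger side so that no addition can wrap, even if a caller hands in corrupt values.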

Constraint Establishment

Sometimes you have a large number of test cases that verify code for all sorts of boundary conditions, but a lot of these test cases might be useless to you. Why? Because the code module you're testing can't be reached with variables in certain states, so even if the test cases aren't handled correctly, it doesn't matter because they can never happen.

If you can verify that it's impossible for variables to exist in certain states, a number of the test cases become irrelevant, and you can discard them (noting down why you discarded them). This process is called constraint establishment. When you do this, you should ensure that sanitation checks on the input work as expected by doing separate test cases for the part of the code where the sanity checks occur. To see an example of where to discard test cases, go back to the ConnectionRead() function. Imagine that it's called from only a single place in the application, a function called ConnectionReadBuffer() that looks like this:

int Connection::ConnectionReadBuffer(int len)
{
    return ((len > 0) ? ConnectionRead(len) : 0);
}

This function is basically a wrapper to ConnectionRead(), except it ensures that len is a value greater than 0. That single check cuts out quite a few test cases; now you need to test only situations in which len is positive because ConnectionRead() can never be reached with len being 0 or negative.
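Before discarding those cases, it's worth a quick test that the sanity check itself holds; here's a sketch with stand-in functions (the real ones are class methods, so the names are invented):

```c
static int read_called;         /* records whether the inner read ran */

/* stand-in for Connection::ConnectionRead() */
static int connection_read(int len)
{
    read_called = 1;
    return len;
}

/* stand-in for Connection::ConnectionReadBuffer(): the sanity check */
static int connection_read_buffer(int len)
{
    return (len > 0) ? connection_read(len) : 0;
}
```

If nonpositive values never reach connection_read(), the len <= 0 test cases can be safely discarded (with a note recording why).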

Extraneous Input Thinning

Extraneous input thinning means getting rid of inputs that aren't a concern. For example, consider the following function prototype:

int read_data(int sock, unsigned char *buffer, size_t length, int flags);

This function is mostly a wrapper to recv(). The initial set of states for each variable when this function is called are shown in Table 4-20.

Table 4-20. Input Data States

Variable   States
sock       Valid socket descriptor
           Invalid socket descriptor
buffer     NULL
           Non-NULL (size equal to length)
           Non-NULL (size not equal to length)
length     0
           Small positive number
           Huge positive number
flags      0
           Valid flags
           Invalid flags

Now you have a set of possible states you want to test for. (You should normally be more specific about what values the flags variable can take, but that isn't necessary for this example.) You can probably eliminate a couple of these states when you examine the constraints for this function. For example, it's highly unlikely the program will call this function with an invalid socket descriptor. Beyond this constraint, however, certain values are outside an attacker's control both directly and indirectly. For example, say the flags variable can be any valid flag or combination of flags that the recv() function accepts (and this rule is enforced in the code elsewhere), but the program sets that value based on input from a configuration file that only the administrator can access. In this case, you don't need to test every combination of possible values flags can take; the default configuration from the file is probably sufficient.

When eliminating test cases, be careful that you don't eliminate too many. Just because you can't control the value of a variable doesn't mean you can ignore it because the values that variable takes might influence how the function or module works, and you need to see how your input is dealt with in each circumstance. To summarize, you can ignore only input values that meet the following conditions:

  • You can't control them directly or indirectly.

  • The value of this variable doesn't significantly affect how data you do control is dealt with or how the module operates.

In addition, sometimes you see arguments with the sole purpose of being filled in by the function, so when the function is called, the values in these variables are irrelevant.

Unconstrained Data Types

This discussion of test cases hasn't addressed dealing with data inputs of types that aren't constrained to a strict subset or range of values. The examples so far have dealt primarily with integer types that can be in one of three states: negative value, positive value, or 0. What about character strings, however? String data can be an arbitrary length and contain arbitrary characters supplied by users. This makes it hard to write a strict set of test cases and ensure that you're covering all possible results when the application is running in a real-world environment. String data complicates your test case procedures. Furthermore, this type of data isn't rare; you'll need to make test cases for it frequently, so you must be able to deal with this input in a consistent and accurate fashion. To do this, you need to be aware of some context surrounding the input. In other words, you must determine what the unconstrained data represents and how the program interprets it. A number of things happen to string data over the course of a program:

  • Transformations: The data is converted from one representation to another.

  • Validations: Checks are performed to verify whether certain data elements are present at certain locations, to do length checks on the data, and to perform other related validation procedures.

  • Parsing and extraction: Data is parsed into constituent elements. For strings, parsing usually means locating element boundaries by searching for a delimiter (such as whitespace), and then copying elements as needed by the application.

  • System usage: The data is actually used for retrieving some sort of system resource, such as supplied filenames being opened or passed to another program to send e-mail.

To provide effective string test cases, you should choose boundary cases for each transformation, validation, or parsing block that takes place. The best way to do this is by examining each operation performed on the data and classifying it into one of the three categories: transformation, validation, or parsing. Depending on the category, you decide what your goal is so that you can craft test cases accordingly.

If an operation is a transformation, your goals are to see whether there's a case where the transformation occurs incorrectly, see whether corruption of some kind can occur, and see whether the order of transformations versus data validation results in a logical security vulnerability (that is, a validation procedure checks for the absence or presence of some data, but the data is subsequently transformed before it's used). These issues are explained in more detail in Chapter 8, "Strings and Metacharacters."
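A classic sketch of that ordering flaw (all code here is hypothetical): the validation runs before a percent-decoding transformation, so the check never sees the dangerous form of the data.

```c
#include <ctype.h>
#include <string.h>

static int hexval(int c)
{
    return isdigit(c) ? c - '0' : tolower(c) - 'a' + 10;
}

/* transformation: in-place percent-decoding */
void url_decode(char *s)
{
    char *out = s;

    while(*s){
        if(s[0] == '%' && isxdigit((unsigned char)s[1])
                       && isxdigit((unsigned char)s[2])){
            *out++ = (char)(hexval(s[1]) * 16 + hexval(s[2]));
            s += 3;
        } else
            *out++ = *s++;
    }

    *out = '\0';
}

/* validation: reject obvious path traversal */
int looks_safe(const char *s)
{
    return strstr(s, "..") == NULL;
}
```

Validated first, "%2e%2e/secret" passes looks_safe(), and only afterward does url_decode() turn it into "../secret"; transforming before validating (or revalidating afterward) closes the hole.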

If the operation is a validation procedure, your main goal is to determine whether this validation can be subverted in any cases or whether the validation is inadequate given the actions that follow. (This determination can include cases with no validation.) Again, these issues are discussed in Chapter 8.

When parsing and extraction is performed, you're concerned with issues related to parsing data incorrectly, usually resulting in some sort of memory corruption (covered extensively in several later chapters). After completing these steps, often you find cases in which the data is used to access a system resource. This is usually the final step of the data's handling because it should have been validated and parsed by this point. So a vulnerability exists if using this string to access a resource allows an attacker to circumvent the application's security policy or corrupt its internal state.
