Researchers have explored many methods for making sense of program semantics. Some are sound, some aren't; some are built to detect specific classes of bugs, while others are flexible enough to read definitions of what they're supposed to detect. Some of the more recent tools are worth pondering. You really won't be able to download most of these research prototypes and merrily start finding bugs in your own code. Rather, the ideas from these tools are driving the current crop of commercial tools (not to mention the next round of research tools).

BOON applies integer range analysis to determine whether a C program can index an array outside its bounds [Wagner et al. 2000]. While capable of finding many errors that lexical analysis tools would miss, the checker is still imprecise: It ignores statement order, it can't model interprocedural dependencies, and it ignores pointer aliasing.

Inspired by Perl's taint mode, CQual uses type qualifiers to perform a taint analysis, which detects format string vulnerabilities in C programs [Foster, Terauchi, and Aiken 2002]. CQual requires a programmer to annotate a few variables as either tainted or untainted and then uses type inference rules (along with pre-annotated system libraries) to propagate the qualifiers. Once the qualifiers are propagated, the system can detect format string vulnerabilities by type checking.

The xg++ tool uses a template-driven compiler extension to attack the problem of finding kernel vulnerabilities in Linux and OpenBSD [Ashcraft and Engler 2002]. It looks for locations where the kernel uses data from an untrusted source without checking it first, methods by which a user can cause the kernel to allocate memory and not free it, and situations in which a user could cause the kernel to deadlock.

The Eau Claire tool uses a theorem prover to create a general specification-checking framework for C programs [Chess 2002]. It can help find common security problems such as buffer overflows, file access race conditions, and format string bugs. Developers can use specifications to ensure that function implementations behave as expected.

MOPS takes a model-checking approach to look for violations of temporal safety properties [Chen and Wagner 2002]. Developers can model their own safety properties, and some have used the tool to check for privilege management errors, incorrect construction of chroot jails, file access race conditions, and ill-conceived temporary file schemes.

Splint extends the lint concept into the security realm [Larochelle and Evans 2001]. By adding annotations, developers can enable Splint to find abstraction violations, unannounced modifications to global variables, and possible use-before-initialization errors. Splint can also reason about minimum and maximum array bounds accesses if it is provided with function pre- and postconditions.
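To make these bug classes concrete, the fragment below (an illustrative sketch, not taken from any of the papers above) contains the two kinds of defects just discussed: an unchecked copy into a fixed-size array of the sort BOON's integer range analysis targets, and attacker-controlled data reaching a format string, the pattern CQual's taint qualifiers flag.

```c
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char buf[16];

    if (argc > 1) {
        /* BOON-style finding: nothing constrains strlen(argv[1]) to be
           smaller than sizeof(buf), so the copy can run past the end of
           the array. */
        strcpy(buf, argv[1]);

        /* CQual-style finding: argv[1] is tainted (attacker controlled),
           yet it is used in the format-string position of printf(). */
        printf(argv[1]);
        printf("\n");
    }
    return 0;
}
```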
**Modern Security Rules Schema**

The schema shown in the following table associates several distinct fields with each rule. This schema was developed at Cigital and is the skeleton of one of many knowledge catalogs. Though not all of the fields are relevant to static analysis per se, they do help in organizing and categorizing rules, which can then be consumed by a tool.

| Field Name | Field Description | Selection Choices |
|---|---|---|
| Number | Unique rule descriptor. | |
| ID | Shorthand label for the rule. | |
| Title | Short rule descriptor. | |
| Identification Difficulty | How hard is it to apply this rule? Do we need simple text scanning? A complete type tree in an AST? Data flow analysis? (See the sketch following this table.) | Scan: text scanning; AST: parse tree analysis; Flow: data flow analysis |
| Accuracy | How likely is this rule to be accurate? Will there be a large number of false positives? | False negatives; High false positives; False positives; Low false positives |
| Priority | How important is this rule? | Low: look at instances of the rule if there is time; Medium: all instances should be examined, but not always fixed; High: all instances should be fixed; Info: simply flagged for information |
| Attack Category | What typical types of attacks does this rule help expose and/or mitigate? | Denial of service; Spoofing; Impersonation; Log forging; None; Path spoofing or confusion problem; Resource injection; Setting manipulation; SQL injection |
| Vulnerability Kingdom | What types of vulnerabilities are exposed by this rule? (See Chapter 12.) | Input validation and representation; API abuse; Security features; Time and state; Error handling; Code quality; Encapsulation; Environment |
| Software Context | In what area of software implementation does the rule have likely impact? | |
| Context | Software implementation context of impact for this rule. | Authorization; Critical sections; Cryptography; Debug API; File creation; File I/O; File management; Filename management; File path management; Handle duplication; Impersonation; Inheritance; Internet; ISAPI; Memory management; OLE registration; National language support; Process management; Security; Shell functions; String conversion macros; String formatting; String management; String parsing; Sundry platform pitfalls; Temporary file management; Threads and processes; Using named kernel objects in services; Other |
| Other Context | New software development contexts that are not in the Context list. | |
| Location | Header file, class, or module where this rule's APIs live. | |
| Description | Full explanation of the rule, things to search for, and (potentially) context of what can reduce the level of false positive hits on this rule. | |
| APIs | Which APIs does this rule apply to? | |
| Function Name | API name. | |
| Comments | Comments describing any special conditions of how this rule applies to the API. | |
| Method of Attack | Context/motivation for how this rule is important to an attacker. How would the attacker leverage this weakness to exploit the software? | |
| Exception Criteria | Under what conditions is it okay to ignore the triggering of this rule? | |
| Solution | What needs to be done to fix the code to avoid this rule and therefore improve the security of the code? What should be changed? | |
| Solution Applicability | A natural language explanation of when it is appropriate to consider this solution. | |
| Solution Description | Description of the proposed actions or steps for this solution. | |
| Solution Efficacy | A natural language explanation of the efficacy of this particular solution. | |
| Signature Details | What specific code signature will indicate that this rule is relevant for the code being analyzed? | |
| Code Examples Negative | Specific code examples that exhibit this rule in failure mode. | |
| Code Examples Positive | Specific code examples that exhibit this rule in solution mode. | |
| Source References | Any supporting bibliography entries (sources) for this rule. | |
| Recommended Resources | Recommended resources for better understanding the context, nature, and implications of this rule. | |
| Resource Name | Name of the resource being recommended. | |
| Resource Link | URL link to the resource (if applicable). | |
| Maturity | What is the state of maturity of the definition of this rule? | Draft, low, medium, high |
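To illustrate the Identification Difficulty field, the sketch below (a constructed example, not part of the Cigital catalog) contrasts a rule that simple text scanning can enforce with one that requires data flow analysis; an AST-level rule sits in between, for example flagging strcpy() only when its destination is a fixed-size stack buffer.

```c
#include <string.h>

/* "Scan" difficulty: a lexical rule can flag every call to strcpy() by
   name alone, with no understanding of the surrounding code. */
void copy_name(char *dst, const char *src)
{
    strcpy(dst, src);           /* flagged by simple text scanning */
}

/* "Flow" difficulty: the defect is visible only by tracking data through
   assignments: the attacker-controlled length propagates into n and is
   never checked before it bounds the memcpy(). */
void copy_block(char *dst, const char *src, size_t untrusted_len)
{
    size_t n = untrusted_len;   /* taint enters and propagates here */
    memcpy(dst, src, n);        /* ...and reaches the sink here */
}
```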
**A Complete Modern Rule**

Given the schema shown in the previous table, Cigital has collected, categorized, and fleshed out many rules. The table here is an example of a complete rule for catgets(). Reading an entire set of rules, even when they are presented with this advanced schema, is difficult and no fun. (Try it for yourself by perusing Appendix B.) A static analysis tool can enforce rules like these without forcing every developer in the world to internalize every potential vulnerability. In fact, by applying these rules with a tool during development (especially when the tool is completely integrated into an IDE), developers can more naturally internalize the rules.

| Title | catgets |
|---|---|
| Attack Category | Path spoofing or confusion problem |
| Vulnerability Kingdom | Input validation and representation (format string; buffer overflow) |
| Software Context | National language support |
| Location | nl_types.h |
| Description | Text obtained from message catalogs may not be trustworthy, and care must be exercised in how it is used. The function catopen() opens a message catalog file located either according to a supplied path (containing a / character) or by searching for a named catalog (with no /) by referencing the values of the NLSPATH, LANG, and LC_MESSAGES environment variables. Subsequently, the catgets() function may be used to obtain message text from the catalog. If an attacker can influence the environment in which the program runs, he or she can cause the program to load strings from arbitrary files. Careless use of text returned by catgets() can create vulnerabilities that can be exploited by an attacker who manages to substitute text. Depending on how the text is used, buffer overflow or format string vulnerabilities may be present, which, if exploited, could result in the execution of arbitrary code. |
| APIs | Function Name | Comments |
|---|---|---|
| | catopen | Opens a message catalog based on the environment |
| | catgets | Returns an arbitrary-length string |

| Method of Attack | An attacker can manipulate NLSPATH and related environment variables to control what gets returned by catopen() and catgets(). Alternatively, the standard catalog file could be overwritten if catalog directories are not secure. By installing a custom catalog of messages, the attacker can cause arbitrarily long strings to be returned and/or can include format string directives (e.g., %s) in the string, which may be interpreted if the text is used as a format string. In many cases, setuid programs access locale-specific message catalogs to print messages. If this is not done with due care, an attacker can use this to cause arbitrary code execution. |
|---|---|
| Exception Criteria | catgets() is safe if the standard catalog directory is secure and the catalog descriptor received from catopen() was opened using a fully specified path containing a / character, or if NLSPATH and other environment variables are validated before being used. catgets() is also safe if the returned message text is used in a safe fashion. |

| Solution | Solution Applicability | Solution Description | Solution Efficacy |
|---|---|---|---|
| | Particularly applicable to setuid programs for which the user can control the environment. | Validate that catopen() will return an authentic message catalog. This requires either certainty that an attacker could not manipulate the program environment (not necessarily an option for a setuid program) or that the information used to locate the particular message catalog file was validated before catopen() was called. Specifying a fully qualified catalog path containing a / character would work, but it largely defeats the purpose of using a message catalog. The alternative is to examine NLSPATH and related environment variables to confirm that they correspond only to the expected secure directories. This also requires that the message catalog locations be constrained, with those constraints known to the program at compilation time. | Effective, but hard to implement correctly. Best used in combination with the solution of using text safely. |
| | Particularly applicable to setuid programs for which the user can control the environment. | Use text obtained from catgets() safely, in a way that reflects its untrustworthy nature. Text obtained from catgets() is typically used in printed or displayed messages. It should not be used as a format string, as in printf(text), but should instead be used as a data string, as in printf("%s", text). If the text must be used as a format string, it should be parsed and validated as safe before it is used. If text obtained from catgets() is placed in a buffer, care must be exercised to ensure that buffer overflows cannot occur. | Effective. Validating text to be used as a format string could be tricky unless rigid constraints are enforced. |

| Signature Details | Any use of catopen() or catgets() should be examined. If no checks are done on NLSPATH and the usage of the catgets() result matches the signature for a potential format string or buffer overflow problem, a problem exists. Most relevant for setuid programs, for which the user can control the execution environment. |
|---|---|
| Code Examples Negative | `nl_catd catd = catopen("MyCatalog", 0); char *text = catgets(catd, 2, 10, "Default text."); printf(text); /* vulnerable to format string attack */ strcpy(buffer, text); /* vulnerable to buffer overflow attack */` |
| Code Examples Positive | `/* Verify that an expected secure path will be searched */ if (!nlsPathIsSafe()) exit(EXIT_FAILURE); nl_catd catd = catopen("MyCatalog", 0); /* Ensure safe usage of retrieved text */ char *text = catgets(catd, 2, 10, "Default text."); printf("%s", text); strncpy(buffer, text, bufferSize);` (expanded in the sketch following this table) |
| Source References | N/A |
| Recommended Resources | Resource Name | Resource Link |
|---|---|---|
| | catgets(3) man page | <http://www.freebsd.org/cgi/man.cgi?query=catgets&sektion=3> |
| | catopen(3) man page | <http://www.freebsd.org/cgi/man.cgi?query=catopen&sektion=3> |

| Discriminant Set | |
|---|---|
| Operating System | |
| Language | |
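The positive code example in the rule leaves the environment check, nlsPathIsSafe(), unspecified. The sketch below fills that gap under an assumed policy: NLSPATH is accepted only when it is unset or points into a single trusted directory (here /usr/share/nls/, an arbitrary choice for illustration); real NLSPATH values are colon-separated templates, so a production check would have to validate each entry. The rest follows the rule's solution guidance: catalog text is printed only as data, never as a format string, and copies into fixed-size buffers are bounded.

```c
#include <nl_types.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed policy: accept NLSPATH only if it is unset (system default
   search path) or begins with a single trusted directory.  The trusted
   prefix is an assumption made for this sketch. */
static int nls_path_is_safe(void)
{
    static const char trusted_prefix[] = "/usr/share/nls/";
    const char *nlspath = getenv("NLSPATH");

    if (nlspath == NULL)
        return 1;
    return strncmp(nlspath, trusted_prefix, sizeof(trusted_prefix) - 1) == 0;
}

int main(void)
{
    char buffer[256];

    /* Verify that an expected secure path will be searched. */
    if (!nls_path_is_safe())
        exit(EXIT_FAILURE);

    nl_catd catd = catopen("MyCatalog", 0);

    /* catgets() falls back to the supplied default text if the catalog
       could not be opened. */
    char *text = catgets(catd, 2, 10, "Default text.");

    /* Bound the copy, guarantee null termination, and print the text as
       data ("%s"), never as a format string. */
    snprintf(buffer, sizeof(buffer), "%s", text);
    printf("%s\n", buffer);

    if (catd != (nl_catd)-1)
        catclose(catd);
    return 0;
}
```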
Many static analysis approaches hold promise but have yet to be directly applied to security. Some of the more noteworthy ones include ESP (a large-scale property verification approach) [Das, Lerner, and Seigle 2002], model checkers such as SLAM and BLAST (which use predicate abstraction to examine program safety properties) [Ball and Rajamani 2001; Henzinger et al. 2003], and FindBugs (a lightweight checker with a good reputation for unearthing common errors in Java programs) [Hovemeyer and Pugh 2004]. Academic work on static analysis continues apace, and research results are published with some regularity at conferences such as USENIX Security, IEEE Security and Privacy (Oakland), ISOC Network and Distributed System Security, and Programming Language Design and Implementation (PLDI). Although it often takes years for results to make a commercial impact, solid technology transfer paths have been established, and the pipeline looks good. Expect great progress in static analysis during the next several years.