Validating Web Input
18.7.1 Problem
After extracting the parameters supplied to a script, it's a good idea to check them to be sure they're valid.
18.7.2 Solution
Web input processing is one form of data import, so after you've extracted the input parameters, you can validate them using the techniques discussed in Chapter 10.
18.7.3 Discussion
One phase of form processing is to extract the input that comes back when the user submits the form. It's also possible to receive input in the form of parameters at the end of a URL. But no matter the input source, if you're going to store it in your database, it's a good idea to check it to be sure it's valid.
When a client sends input to you over the Web, you don't really know what they're sending. If you present a form for users to fill out, most of the time they'll probably be nice and enter the kinds of values you expect. But a malicious user can save the form to a file, modify the file to allow form options you don't intend, reload the file into a browser window, and submit the modified form. Your form-processing script won't know the difference. If you write it only to process the kinds of values that well-intentioned users will submit, the script may misbehave or crash when presented with unexpected inputor perhaps even do bad things to your database. (Recipe 18.8 discusses what kinds of bad things can happen.) For this reason, it's prudent to perform some validity checking on web input before using it to construct database queries.
Preliminary checking is a good idea even for non-malicious users. If you require a field to be filled in and the user forgets to provide a value, you'll need to remind the user to supply one. This can involve a simple "Is the parameter present?" check, or it may be more involved. Typical types of validation operations include the following:
- Checking content format, such as making sure a value looks like an integer or a date. This may involve some reformatting for acceptability to MySQL (for example, changing a date from MM/DD/YY to ISO format).
- Determining whether or not a value is a member of a legal set of values. Perhaps the value must be listed in the definition for an ENUM or SET column, or must be present in a lookup table.
- Filtering out extraneous characters such as spaces or dashes from telephone numbers or credit card numbers.
Some of these operations have little to do with MySQL, except in the sense that you want values to be appropriate to the types of the columns you'll store them in or perform matches against. For example, if you're going to store a value in an INT column, you can make sure it's an integer first, using a test like this (shown here using Perl):
$val =~ /^d+$/ or die "Hey! '" . escapeHTML ($val) . "' is not an integer! ";
For other types of validation, MySQL is intimately involved. If a field value is to be stored into an ENUM column, you can make sure the value is one of the legal enumeration values by checking the column definition with SHOW COLUMNS.
Having described some of the kinds of web input validation you might want to carry out, I won't further discuss them here. These and other forms of validation testing are described in Chapter 10. That chapter is oriented largely toward bulk input validation, but the techniques discussed there apply to web programming as well, because processing form input or URL parameters is, in essence, performing a data import operation.