Processing File Uploads

18.9.1 Problem

You want to allow files to be uploaded to your web server and stored in your database.

18.9.2 Solution

Present the user with a web form that includes a file field. When the user submits the form, extract the file and store it in MySQL.

18.9.3 Discussion

One special kind of web input is an uploaded file. A file is sent as part of a POST request, but it's handled differently than other POST parameters, because a file is represented by several pieces of information such as its contents, its MIME type, its original filename on the client, and its name in temporary storage on the web server host.
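To make those pieces of information concrete, here is a minimal Python sketch that picks apart a hand-written multipart/form-data request body using the standard email package. The boundary string, field values, and the describe_parts( ) helper are assumptions made up for illustration. The temporary-storage name does not appear here because the web server assigns it after receiving the request.

```python
from email import message_from_bytes

BOUNDARY = "example-boundary"   # hypothetical boundary string

# Hand-written request body with one text field and one file field.
BODY = (
    b"--example-boundary\r\n"
    b'Content-Disposition: form-data; name="image_name"\r\n'
    b"\r\n"
    b"logo\r\n"
    b"--example-boundary\r\n"
    b'Content-Disposition: form-data; name="upload_file"; filename="logo.gif"\r\n'
    b"Content-Type: image/gif\r\n"
    b"\r\n"
    b"GIF89a-not-a-real-image\r\n"
    b"--example-boundary--\r\n"
)

def describe_parts(body, boundary):
    """Return one dict per form field found in a multipart/form-data body."""
    # The email parser needs the Content-Type header that the browser
    # sends in the HTTP request headers, so prepend it to the body.
    header = ("Content-Type: multipart/form-data; boundary=%s\r\n\r\n"
              % boundary).encode("ascii")
    message = message_from_bytes(header + body)
    parts = []
    for part in message.get_payload():
        parts.append({
            "field": part.get_param("name", header="content-disposition"),
            "filename": part.get_filename(),       # None for ordinary fields
            "mime_type": part.get_content_type(),  # text/plain if not given
            "contents": part.get_payload(decode=True),
        })
    return parts

for info in describe_parts(BODY, BOUNDARY):
    print(info)
```

Only the file field carries a filename and an explicit MIME type, which is why upload handling differs from handling ordinary parameters.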

To handle file uploads, you must send a special kind of form to the user; this is true no matter what API you use to create the form. However, when the user submits the form, the operations that check for and process an uploaded file are API-specific.

To create a form that allows files to be uploaded, the opening <form> tag should specify the POST method and must also include an enctype (encoding type) attribute with a value of multipart/form-data:

<form method="POST" enctype="multipart/form-data" action="script_name">

If you don't specify this kind of encoding, the form will be submitted using the default encoding type (application/x-www-form-urlencoded) and file uploads will not work properly.
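The difference between the two encodings is easier to see side by side. The following Python sketch encodes the same form data both ways. The function names and the boundary string are hypothetical, and the multipart encoder is simplified: it does not escape special characters in field names or filenames.

```python
from urllib.parse import urlencode

def encode_urlencoded(fields):
    """Encode simple name/value pairs using the default form encoding."""
    return urlencode(fields).encode("ascii")

def encode_multipart(fields, file_field, filename, mime_type, data, boundary):
    """Encode name/value pairs plus one file as multipart/form-data."""
    delimiter = b"--" + boundary.encode("ascii")
    chunks = []
    for name, value in fields.items():
        chunks += [
            delimiter,
            ('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"),
            b"",
            value.encode("utf-8"),
        ]
    chunks += [
        delimiter,
        ('Content-Disposition: form-data; name="%s"; filename="%s"'
         % (file_field, filename)).encode("utf-8"),
        ("Content-Type: %s" % mime_type).encode("ascii"),
        b"",
        data,                  # raw file bytes, sent unmodified
        delimiter + b"--",     # closing delimiter
        b"",
    ]
    return b"\r\n".join(chunks)

fields = {"image_name": "my logo"}
print(encode_urlencoded(fields))
print(encode_multipart(fields, "upload_file", "logo.gif", "image/gif",
                       b"GIF89a\x00\x01", "example-boundary"))
```

The default encoding has no place to put a filename or a per-field MIME type, whereas each multipart section has its own headers.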

To include a file upload field in the form, use an <input> element of type file. For example, to present a 60-character file field named upload_file, the element looks like this:

<input type="file" name="upload_file" size="60">

The browser displays this field as a text input box into which the user can enter the name manually. It also presents a Browse button for selecting the file via the standard file-browsing system dialog. When the user chooses a file and submits the form, the browser encodes the file contents for inclusion into the resulting POST request. At that point, the web server receives the request and invokes your script to process it. The specifics vary for particular APIs, but file uploads generally work like this:

This section discusses how to create forms that include a file upload field. It also demonstrates how to handle uploads using a Perl script, post_image.pl. The script is somewhat similar to the store_image.pl script for loading images from the command line (Recipe 17.7). post_image.pl differs in that it allows you to store images over the Web by uploading them, and it stores images only in MySQL, whereas store_image.pl stores them in both MySQL and the filesystem.

This section also discusses how to obtain file upload information using PHP and Python. It does not repeat the entire image-posting scenario shown for Perl, but the recipes distribution contains equivalent implementations of post_image.pl for PHP and Python.

18.9.4 Perl

You can specify multipart encoding for a form several ways using the CGI.pm module. The following statements are all equivalent:

print start_form (-action => url ( ), -enctype => "multipart/form-data");
print start_form (-action => url ( ), -enctype => MULTIPART ( ));
print start_multipart_form (-action => url ( ));

The first statement specifies the encoding type literally. The second uses the CGI.pm MULTIPART( ) function, which is easier than trying to remember the literal encoding value. The third statement is easiest of all, because start_multipart_form( ) supplies the enctype parameter automatically. (Like start_form( ), start_multipart_form( ) uses a default request method of POST, so you need not include a method argument.)

Here's a simple form that includes a text field for assigning a name to an image, a file field for selecting the image file, and a submit button:

print start_multipart_form (-action => url ( )),
      "Image name:", br ( ),
      textfield (-name =>"image_name", -size => 60),
      br ( ), "Image file:", br ( ),
      filefield (-name =>"upload_file", -size => 60),
      br ( ), br ( ),
      submit (-name => "choice", -value => "Submit"),
      end_form ( );

When the user submits an uploaded file, begin processing it by extracting the parameter value for the file field:

$file = param ("upload_file");

The value for a file upload parameter is special in CGI.pm because you can use it two ways. You can treat it as an open file handle to read the file's contents, or pass it to uploadInfo( ) to obtain a reference to a hash that provides information about the file such as its MIME type. The following listing shows how post_image.pl presents the form and processes a submitted form. When first invoked, post_image.pl generates a form with an upload field. For the initial invocation, no file will have been uploaded, so the script does nothing else. If the user submitted an image file, the script gets the image name, reads the file contents, determines its MIME type, and stores a new record in the image table. For illustrative purposes, post_image.pl also displays all the information that the uploadInfo( ) function makes available about the uploaded file.

#! /usr/bin/perl -w
# post_image.pl - allow user to upload image files via POST requests

use strict;
use lib qw(/usr/local/apache/lib/perl);
use CGI qw(:standard escapeHTML);
use Cookbook;

print header ( ), start_html (-title => "Post Image", -bgcolor => "white");

# Use multipart encoding because the form contains a file upload field

print start_multipart_form (-action => url ( )),
      "Image name:", br ( ),
      textfield (-name =>"image_name", -size => 60),
      br ( ), "Image file:", br ( ),
      filefield (-name =>"upload_file", -size => 60),
      br ( ), br ( ),
      submit (-name => "choice", -value => "Submit"),
      end_form ( );

# Get a handle to the image file and the name to assign to the image

my $image_file = param ("upload_file");
my $image_name = param ("image_name");

# Must have either no parameters (in which case the script was just
# invoked for the first time) or both parameters (in which case the form
# was filled in). If only one was filled in, the user did not fill in the
# form completely.

my $param_count = 0;
++$param_count if defined ($image_file) && $image_file ne "";
++$param_count if defined ($image_name) && $image_name ne "";

if ($param_count == 0)      # initial invocation
{
  print p ("No file was uploaded.");
}
elsif ($param_count == 1)   # incomplete form
{
  print p ("Please fill in BOTH fields and resubmit the form.");
}
else                        # a file was uploaded
{
my ($size, $data);

  # If an image file was uploaded, print some information about it,
  # then save it in the database.

  # Get reference to hash containing information about file
  # and display the information in "key=x, value=y" format
  my $info_ref = uploadInfo ($image_file);
  print p ("Information about uploaded file:");
  foreach my $key (sort (keys (%{$info_ref})))
  {
    # Use print, not printf: the values may contain % characters
    print p ("key=" . escapeHTML ($key)
             . ", value=" . escapeHTML ($info_ref->{$key}));
  }
  $size = (stat ($image_file))[7];   # get file size from file handle
  print p ("File size: " . $size);

  binmode ($image_file);   # helpful for binary data
  if (sysread ($image_file, $data, $size) != $size)
  {
    print p ("File contents could not be read.");
  }
  else
  {
    print p ("File contents were read without error.");

    # Get MIME type, use generic default if not present
    my $mime_type = $info_ref->{'Content-Type'};
    $mime_type = "application/octet-stream" unless defined ($mime_type);

    # Save image in database table. (Use REPLACE to kick out any
    # old image with same name.)
    my $dbh = Cookbook::connect ( );
    $dbh->do ("REPLACE INTO image (name,type,data) VALUES(?,?,?)",
              undef,
              $image_name, $mime_type, $data);
    $dbh->disconnect ( );
  }
}

print end_html ( );

exit (0);
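The decision logic of post_image.pl (no fields, one field, or both fields) and its REPLACE statement can be tried without a web server. The following Python sketch is only an approximation: it uses an in-memory SQLite database as a stand-in for MySQL, assumes that the name column is the primary key of the image table, and the store_image( ) helper and its arguments are hypothetical.

```python
import sqlite3

def store_image(conn, image_name, upload):
    """Mirror the three-way check in post_image.pl.

    upload is None or a (mime_type, data) tuple; both are hypothetical
    stand-ins for the values that CGI.pm provides.
    """
    have_name = bool(image_name)
    have_file = upload is not None
    if not have_name and not have_file:
        return "No file was uploaded."            # initial invocation
    if have_name != have_file:
        return "Please fill in BOTH fields and resubmit the form."
    mime_type, data = upload
    if not mime_type:
        mime_type = "application/octet-stream"     # generic default
    # REPLACE removes any existing row with the same key before inserting.
    conn.execute("REPLACE INTO image (name, type, data) VALUES (?, ?, ?)",
                 (image_name, mime_type, data))
    conn.commit()
    return "Image stored."

conn = sqlite3.connect(":memory:")   # SQLite stand-in for the MySQL connection
conn.execute("CREATE TABLE image (name TEXT PRIMARY KEY, type TEXT, data BLOB)")
print(store_image(conn, "", None))
print(store_image(conn, "logo", None))
print(store_image(conn, "logo", ("image/gif", b"GIF89a-old")))
print(store_image(conn, "logo", (None, b"GIF89a-new")))
print(conn.execute("SELECT name, type, length(data) FROM image").fetchall())
```

Storing the same name twice leaves a single row, which is the effect the script relies on when it uses REPLACE rather than INSERT.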

18.9.5 PHP

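To write an upload form in PHP, include a file field. If you wish, you may also include a hidden field preceding the file field that has a name of MAX_FILE_SIZE and a value of the largest file size you're willing to accept (the limit and the action value shown here are placeholders):

<form method="POST" enctype="multipart/form-data" action="script_name">
<input type="hidden" name="MAX_FILE_SIZE" value="4000000" />
Image name:<br />
<input type="text" name="image_name" size="60" />
<br />
Image file:<br />
<input type="file" name="upload_file" size="60" />
<br /><br />
<input type="submit" name="choice" value="Submit" />
</form>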

Be aware that MAX_FILE_SIZE is advisory only, because it can be subverted easily. To specify a value that cannot be exceeded, use the upload_max_filesize configuration setting in the PHP initialization file. There is also a file_uploads setting that controls whether or not file uploads are allowed at all.
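The same principle applies in any language: a limit is trustworthy only if the server enforces it while reading the data. The following Python sketch shows one way to do that for an arbitrary input stream. The limit value, the UploadTooLarge exception, and the read_limited( ) function are hypothetical names chosen for illustration, not part of PHP or of the recipes distribution.

```python
import io

MAX_UPLOAD_BYTES = 1024 * 1024    # hypothetical server-side limit (1 MB)

class UploadTooLarge(Exception):
    """Raised when a stream delivers more data than the server accepts."""

def read_limited(stream, limit=MAX_UPLOAD_BYTES, chunk_size=64 * 1024):
    """Read an upload stream, refusing to buffer more than limit bytes."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > limit:
            # Stop reading; any size the client claimed is irrelevant.
            raise UploadTooLarge("upload exceeds %d bytes" % limit)
        chunks.append(chunk)
    return b"".join(chunks)

small = read_limited(io.BytesIO(b"x" * 100), limit=1000)
print(len(small))
try:
    read_limited(io.BytesIO(b"x" * 2000), limit=1000)
except UploadTooLarge as err:
    print("rejected:", err)
```

Because the check counts bytes actually received, a client that omits or falsifies a hidden form field gains nothing.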

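When the user submits the form, where you find the file upload information depends on your version of PHP:

As of PHP 4.1, it is in the $_FILES array, which has one entry per uploaded file. Each entry is itself an array with the elements "tmp_name", "name", "size", and "type". For a file field named upload_file, for example, $_FILES["upload_file"]["tmp_name"] is the name of the temporary file on the server.

In earlier PHP 4 releases, it is in the $HTTP_POST_FILES array, which has the same structure.

In PHP 3, it is in the $HTTP_POST_VARS array as separate elements. For a field named upload_file, the elements upload_file, upload_file_name, upload_file_size, and upload_file_type hold the temporary filename, the original filename, the size, and the MIME type.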

$_FILES is a superglobal array (global in any scope). $HTTP_POST_FILES and $HTTP_POST_VARS must be declared with the global keyword if used in a non-global scope, such as within a function.

To avoid having to fool around figuring out which array contains file upload information, it makes sense to write a utility routine that does all the work. The following function, get_upload_info( ), takes an argument corresponding to the name of a file upload field. Then it examines the $_FILES, $HTTP_POST_FILES, and $HTTP_POST_VARS arrays as necessary and returns an associative array of information about the file, or an unset value if the information is not available. For a successful call, the array element keys are "tmp_name", "name", "size", and "type" (that is, the keys are the same as those in the entries within the $_FILES or $HTTP_POST_FILES arrays.)

function get_upload_info ($name)
{
global $HTTP_POST_FILES, $HTTP_POST_VARS;

  unset ($unset);
  # Look for information in PHP 4.1 $_FILES array first.
  # Check the tmp_name member to make sure there is a file. (The entry
  # in $_FILES might be present even if no file was uploaded.)
  if (isset ($_FILES))
  {
    if (isset ($_FILES[$name])
        && $_FILES[$name]["tmp_name"] != ""
        && $_FILES[$name]["tmp_name"] != "none")
      return ($_FILES[$name]);
    return (@$unset);
  }
  # Look for information in PHP 4 $HTTP_POST_FILES array next.
  if (isset ($HTTP_POST_FILES))
  {
    if (isset ($HTTP_POST_FILES[$name])
        && $HTTP_POST_FILES[$name]["tmp_name"] != ""
        && $HTTP_POST_FILES[$name]["tmp_name"] != "none")
      return ($HTTP_POST_FILES[$name]);
    return (@$unset);
  }
  # Look for PHP 3 style upload variables.
  # Check the _name member, because $HTTP_POST_VARS[$name] might not
  # actually be a file field.
  if (isset ($HTTP_POST_VARS[$name])
      && isset ($HTTP_POST_VARS[$name . "_name"]))
  {
    # Map PHP 3 elements to PHP 4-style element names
    $info = array ( );
    $info["name"] = $HTTP_POST_VARS[$name . "_name"];
    $info["tmp_name"] = $HTTP_POST_VARS[$name];
    $info["size"] = $HTTP_POST_VARS[$name . "_size"];
    $info["type"] = $HTTP_POST_VARS[$name . "_type"];
    return ($info);
  }
  return (@$unset);
}

See the post_image.php script for details about how to use this function to get image information and store it in MySQL.

The upload_tmp_dir PHP configuration setting controls where uploaded files are saved. This is /tmp by default on many systems, but you may want to reconfigure PHP to use a different directory that's owned by the web server user ID and thus more private.
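The idea of keeping uploads out of a shared directory is not specific to PHP. As a rough illustration, the following Python sketch writes uploaded data into a directory that only the creating user ID can access. The save_upload_privately( ) function and the "uploads-" prefix are hypothetical. The permission behavior described in the comments is what the tempfile module documents for Unix systems.

```python
import os
import stat
import tempfile

def save_upload_privately(data, suffix=""):
    """Write uploaded bytes into a freshly created private directory."""
    # mkdtemp() creates a directory that only the creating user ID can
    # read, write, or search.
    private_dir = tempfile.mkdtemp(prefix="uploads-")
    # mkstemp() creates and opens the file in one step, so another local
    # user cannot create a file of the same name first.
    fd, path = tempfile.mkstemp(dir=private_dir, suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path

path = save_upload_privately(b"GIF89a-not-a-real-image", suffix=".gif")
mode = stat.S_IMODE(os.stat(os.path.dirname(path)).st_mode)
print(path, oct(mode))
```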

18.9.6 Python

A simple upload form in Python can be written like this:

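print ("<form method=\"POST\" enctype=\"multipart/form-data\" action=\"%s\">"
       % (os.environ["SCRIPT_NAME"]))
print ("Image name:<br />")
print ("<input type=\"text\" name=\"image_name\" size=\"60\" />")
print ("<br />")
print ("Image file:<br />")
print ("<input type=\"file\" name=\"upload_file\" size=\"60\" />")
print ("<br /><br />")
print ("<input type=\"submit\" name=\"choice\" value=\"Submit\" />")
print ("</form>")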

When the user submits the form, its contents can be obtained using the FieldStorage( ) method of the cgi module. (See Recipe 18.6.) The resulting object contains an element for each input parameter. For a file upload field, you get this information as follows:

form = cgi.FieldStorage ( )
if "upload_file" in form and form["upload_file"].filename != "":
    image_file = form["upload_file"]
else:
    image_file = None

According to most of the documentation that I have read, the file attribute of an object that corresponds to a file field should be true if a file has been uploaded. Unfortunately, the file attribute seems to be true even when the user submits the form but leaves the file field blank. It may even be the case that the type attribute is set when no file actually was uploaded (for example, to application/octet-stream). In my experience, a more reliable way to determine whether a file really was uploaded is to test the filename attribute:

form = cgi.FieldStorage ( )
if "upload_file" in form and form["upload_file"].filename:
    print ("<p>A file was uploaded</p>")
else:
    print ("<p>A file was not uploaded</p>")

Assuming that a file was uploaded, access the parameter's value attribute to read the file and obtain its contents:

data = form["upload_file"].value
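The two steps just described (test the filename, then read the contents) can be tried without a web server. The following Python sketch parses hand-built request bodies with the standard email package rather than the cgi module. The make_body( ) and get_uploaded_file( ) helpers and the boundary string are hypothetical, and the blank-field body is an assumption about what browsers typically send when the file field is left empty.

```python
from email import message_from_bytes

BOUNDARY = "example-boundary"   # hypothetical boundary string

def make_body(filename, data):
    """Build a request body containing only the upload_file field."""
    return (b"--example-boundary\r\n"
            b'Content-Disposition: form-data; name="upload_file"; filename="'
            + filename + b'"\r\n'
            b"Content-Type: application/octet-stream\r\n"
            b"\r\n" + data + b"\r\n"
            b"--example-boundary--\r\n")

def get_uploaded_file(body, boundary, field):
    """Return (filename, contents) for field, or None if no file was sent."""
    header = ("Content-Type: multipart/form-data; boundary=%s\r\n\r\n"
              % boundary).encode("ascii")
    for part in message_from_bytes(header + body).get_payload():
        if part.get_param("name", header="content-disposition") == field:
            filename = part.get_filename()
            if not filename:
                # The field is present and has a MIME type, but the
                # empty filename shows that no file was selected.
                return None
            return filename, part.get_payload(decode=True)
    return None

blank = make_body(b"", b"")
filled = make_body(b"logo.gif", b"GIF89a-not-a-real-image")
print(get_uploaded_file(blank, BOUNDARY, "upload_file"))
print(get_uploaded_file(filled, BOUNDARY, "upload_file"))
```

In the blank case the field and its MIME type are both present, so only the filename distinguishes it from a real upload.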

See the post_image.py script for details about how to use these techniques to get image information and store it in MySQL.
