Writing Library Files
2.4.1 Problem
You notice that you're writing similar code for common operations in several programs.
2.4.2 Solution
Put functions to perform those operations in a library file. Then you write the code only once.
2.4.3 Discussion
This section describes how to put code for common operations in library files. Encapsulation (or modularization) isn't really a "recipe" so much as a programming technique. Its principal benefit is that you don't have to repeat code in each program you write; instead, you just call a function that's in the library. For example, by putting the code for connecting to the cookbook database into a library function, you need not write out all the parameters associated with making that connection. Simply invoke the function from your program and you're connected.
Connection establishment isn't the only operation you can encapsulate, of course. Later on in the book, other utility functions are developed and placed in library files. All such files, including those shown in this section, can be found under the lib directory of the recipes distribution. As you write your own programs, you'll probably identify several operations that you perform often and that are good candidates for inclusion in a library. The techniques demonstrated in this section will help you write your own library files.
Library files have other benefits besides making it easier to write programs. They can help portability. For example, if you write connection parameters into each program that connects to the MySQL server, you have to change each program if you move them to another machine where you use different parameters. If instead you write your programs to connect to the database by calling a library function, you localize the changes that need to be made: it's necessary to modify only the affected library function, not all the programs that use it.
Code encapsulation also can improve security in some ways. If you make a private library file readable only to yourself, only scripts run by you can execute routines in the file. Or suppose you have some scripts located in your web server's document tree. A properly configured server will execute the scripts and send their output to remote clients. But if the server becomes misconfigured somehow, the result can be that your scripts get sent to clients as plain text, thus displaying your MySQL username and password. (And you'll probably realize it too late. Oops.) If the code for establishing a connection to the MySQL server is placed in a library file that's located outside the document tree, those parameters won't be exposed to clients. (Be aware, though, that if you install a library file to be readable by your web server, you don't have much security should you share the web server with other developers. Any of those developers can write a web script to read and display your library file, because by default the script will run with the permissions of the web server and thus will have access to the library.)
The recipes that follow demonstrate how to write, for each API, a library file that contains a routine for connecting to the cookbook database on the MySQL server. The Perl, PHP, and Python routines are written to return the appropriate type of value (a database handle, a connection identifier, or connection object), or to exit with an error message if the connection cannot be established. (The error-checking techniques used by these routines are those discussed in Recipe 2.3.) The Java connection routine demonstrates a different approach. It returns a connection object if it succeeds and otherwise throws an exception that the caller can deal with. To assist in handling such exceptions, the library also includes utility functions that return or print an error message that includes the error information returned by MySQL.
Libraries are of no use by themselves; the way that each one is used is illustrated by a short "test harness" program. You can use any of these harness programs as the basis for creating new programs of your own: Make a copy of the file and add your own code between the connect and disconnect calls.
Library file writing involves not only the question of what to put in the file, but also subsidiary issues such as where to install the file so it can be accessed by your programs and (on multiuser systems such as Unix) how to set its access privileges so its contents aren't exposed to people who shouldn't see it. Writing the library file and setting up your language processor to be able to find it are API-specific issues; they're dealt with in the language-specific sections to follow. By contrast, questions about file ownership and access mode are more general issues about which you'll need to make some decisions no matter which language you use (at least if you're using Unix):
- If a library file is private and contains code to be used only by you, the file can be placed under your own account and made accessible only to you. Assuming a library file mylib is already owned by you, you can make it private like this:
% chmod 600 mylib
- If the library file is to be used only by your web server, you can install it in a server library directory and make the file owned by and accessible only to the server user ID. You may need to be root to do this. For example, if the web server runs as wwwusr, these commands make the file private to that user:
# chown wwwusr mylib # chmod 600 mylib
- If the library file is public, you can place it in a location that your programming language searches automatically when it looks for libraries. (Most language processors search for libraries in some default set of directories.) You may need to be root to install files in one of these directories. Then you can make the file world readable:
# chmod 444 mylib
The example programs in this section assume that you'll install library files somewhere other than the directories the language processors search by default, as an excuse to demonstrate how to modify each language's search algorithm to look in a directory of your choosing. Many of the programs written in this book execute in a web context, so the library file installation directories used for the examples are the perl, php, python, and java directories under /usr/local/apache/lib. If you want to put the files somewhere else, just adjust the pathnames in the programs appropriately, or else take advantage of the facility that many programming languages provide for specifying where to look for library files by means of an environment or configuration variable. For our API languages, these variables are listed in the following table:
Language |
Variable name |
Variable type |
---|---|---|
Perl |
PERL5LIB |
Environment variable |
PHP |
include_path |
Configuration variable |
Python |
PYTHONPATH |
Environment variable |
Java |
CLASSPATH |
Environment variable |
In each case, the variable value is a directory or set of directories. For example, if under Unix I put Perl library files in the /u/paul/lib/perl directory, I can set the PERL5LIB environment variable for tcsh like this in my .login file:
setenv PERL5LIB /u/paul/lib/perl
Under Windows, if I put Perl library files in D:libperl, I can set PERL5LIB as follows in AUTOEXEC.BAT:
SET PERL5LIB=D:libperl
In each case, the variable setting tells Perl to look in the specified directory for library files, in addition to whatever other directories it would search by default. The other environment variables (PYTHONPATH and CLASSPATH) are specified using the same syntax. For more information on setting environment variables, see Recipe 1.9.
For PHP, the search path is defined by the value of the include_path variable in the PHP initialization file (typically named php.ini or php3.ini). On my system, the file's pathname is /usr/local/lib/php.ini; under Windows, the file is likely to be found in the Windows system directory or under the main PHP installation directory. The value of include_path is defined with a line like this:
include_path = "value"
The value is specified using the same syntax as for environment variables that name directories. That is, it's a list of directory names, with the names separated by colons under Unix and semicolons under Windows. For example, if you want PHP to look for include files in the current directory and in the lib/php directory under the web server root directory /usr/local/apache, include_path should be set like this under Unix:
include_path = ".:/usr/local/apache/lib/php"
If you modify the initialization file and PHP is running as an Apache module, you'll need to restart Apache to make the change take effect.
Now let's construct a library for each API. Each section here demonstrates how to write the library file itself, then discusses how to use the library from within programs.
2.4.4 Perl
In Perl, library files are called modules, and typically have an extension of .pm ("Perl module"). Here's a sample module file, Cookbook.pm, that implements a module named Cookbook. (It's conventional for the basename of a Perl module file to be the same as the identifier on the package line in the file.)
package Cookbook; # Cookbook.pm - library file with utility routine for connecting to MySQL use strict; use DBI; # Establish a connection to the cookbook database, returning a database # handle. Dies with a message if the connection cannot be established. sub connect { my $db_name = "cookbook"; my $host_name = "localhost"; my $user_name = "cbuser"; my $password = "cbpass"; my $dsn = "DBI:mysql:host=$host_name;database=$db_name"; return (DBI->connect ($dsn, $user_name, $password, { PrintError => 0, RaiseError => 1})); } 1; # return true
The module encapsulates the code for establishing a connection to the MySQL server into a function connect( ), and the package identifier establishes a Cookbook namespace for the module, so you invoke the connect( ) function using the module name:
$dbh = Cookbook::connect ( );
The final line of the module file is a statement that trivially evaluates to true. This is needed because Perl assumes something is wrong with a module and exits after reading it if the module doesn't return a true value.
Perl locates module files by searching through the directories named in its @INC array. This array contains a default list of directories. To find out what they are on your system, invoke Perl as follows at the command line:
% perl -V
The last part of the output from the command shows the directories listed in the @INC array. If you install a module file in one of those directories, your scripts will find it automatically. If you install the module somewhere else, you'll need to tell your scripts where to find it by including a use lib statement. For example, if you install the Cookbook.pm module file in /usr/local/apache/lib/perl, you can write a test harness script harness.pl that uses the module as follows:
#! /usr/bin/perl -w # harness.pl - test harness for Cookbook.pm library use strict; use lib qw(/usr/local/apache/lib/perl); use Cookbook; my $dbh = Cookbook::connect ( ); print "Connected "; $dbh->disconnect ( ); print "Disconnected "; exit (0);
Note that harness.pl does not have a use DBI statement. It's not necessary, because the Cookbook module itself imports the DBI module, so any script that uses Cookbook also gets DBI.
Another way to specify where Perl should look for module files (in addition to the directories that it searches by default) is to set the PERL5LIB environment variable. If you do that, the advantage is that your scripts won't need the use lib statement. (The corresponding disadvantage is that every user who runs scripts that use the Cookbook module will have to set PERL5LIB.)
2.4.5 PHP
PHP provides an include statement that allows the contents of a file to be read into and included as part of the current script. This provides a natural mechanism for creating libraries: put the library code into an include file, install it in one of the directories in PHP's search path, and include it into scripts that need it. For example, if you create an include file named Cookbook.php, any script that needs it can use a statement like this:
include "Cookbook.php";
The contents of PHP include files are written like regular scripts. We can write such a file, Cookbook.php, to contain a function, cookbook_connect( ), as follows:
Although most PHP examples throughout this book don't show the and ?> tags, I've shown them as part of Cookbook.php here to emphasize that include files must enclose all PHP code within those tags. The PHP interpreter doesn't make any assumptions about the contents of an include file when it begins parsing it, because you might include a file that contains nothing but HTML. Therefore, you must use and ?> to specify explicitly which parts of the include file should be considered as PHP code rather than as HTML, just as you do in the main script.
Assuming that Cookbook.php is installed in a directory that's named in PHP's search path (as defined by the include_path variable in the PHP initialization file), it can be used from a test harness script, harness.php. The entire script looks like this:
If you don't have permission to modify the PHP initialization file, you can access an include file by specifying its full pathname. For example:
include "/usr/local/apache/lib/php/Cookbook.php";
PHP also provides a require statement that is like include except that PHP reads the file even if the require occurs inside a control structure that never executes (such as an if block for which the condition is never true). PHP 4 adds include_once and require_once statements. These are like include and require except that if the file has already been read, its contents are not processed again. This is useful for avoiding multiple-declaration problems that can easily occur in situations where library files include other library files.
A way to simulate single-inclusion behavior under PHP 3 is to associate a unique symbol with a library and process its contents only if the symbol is not already defined. For example, a library file, MyLibrary.php, might be structured like this:
2.4.6 Python
Python libraries are written as modules and referenced from scripts using import or from statements. To put the code for connecting to MySQL into a function, we can write a module file Cookbook.py:
# Cookbook.py - library file with utility routine for connecting to MySQL import sys import MySQLdb # Establish a connection to the cookbook database, returning a connection # object. Dies with a message if the connection cannot be established. def connect ( ): host_name = "localhost" db_name = "cookbook" user_name = "cbuser" password = "cbpass" try: conn = MySQLdb.connect (db = db_name, host = host_name, user = user_name, passwd = password) return conn except MySQLdb.Error, e: print "Cannot connect to server" print "Error code:", e.args[0] print "Error message:", e.args[1] sys.exit (1)
The filename basename determines the module name, so the module is called Cookbook. Module methods are accessed through the module name, thus you would invoke the connect( ) method of the Cookbook module like this:
conn = Cookbook.connect ( );
The Python interpreter searches for modules in directories named in the sys.path variable. Just as with Perl's @INC array, sys.path is initialized to a default set of directories. You can find out what those directories are on your system by running Python interactively and entering a couple of commands:
% python >>> import sys >>> sys.path
If you put Cookbook.py in one of the default directories, you can reference it from a script using an import statement and Python will find it automatically:
import Cookbook
If you install Cookbook.py somewhere else, you can add the directory where it's installed to the value of sys.path. Do this by importing the sys module and invoking sys.path.insert( ). The following test harness script, harness.py, shows how to do this, assuming the Cookbook module is installed in the /usr/local/apache/lib/python directory:
#! /usr/bin/python # harness.py - test harness for Cookbook.py library # Import sys module and add directory to search path import sys sys.path.insert (0, "/usr/local/apache/lib/python") import MySQLdb import Cookbook conn = Cookbook.connect ( ) print "Connected" conn.close ( ) print "Disconnected" sys.exit (0)
Another way to tell Python where to find module files is to set the PYTHONPATH environment variable. If you set that variable to include your module directory, scripts that you run need not modify sys.path.
It's also possible to import individual symbols from a module using a from statement:
from Cookbook import connect
This makes the connect( ) routine available to the script without the need for the module name, so you'd use it like this:
conn = connect ( )
2.4.7 Java
Java library files are similar to Java programs in most ways:
- The class line in the source file indicates a class name.
- The file should have the same name as the class (with a .java extension).
- You compile the .java file to produce a .class file.
However, unlike regular program files, Java library files have no main( ) function. In addition, the file should begin with a package identifier that specifies the location of the class within the Java namespace. A common convention is to begin package identifiers with the reverse domain of the code author; this helps make identifiers unique and avoid conflict with classes written by other authors.[2] In my case, the domain is kitebird.com, so if I want to write a library file and place it under mcb within my domain's namespace, the library should begin with a package statement like this:
[2] Domain names proceed right to left from more general to more specific within the domain namespace, whereas the Java class namespace proceeds left to right from general to specific. Thus, to use a domain as the prefix for a package name within the Java class namespace, it's necessary to reverse it.
package com.kitebird.mcb;
Java packages developed for this book will be placed within the com.kitebird.mcb namespace to ensure their naming uniqueness.
The following library file, Cookbook.java, defines a Cookbook class that implements a connect( ) method for connecting to the cookbook database. connect( ) returns a Connection object if it succeeds, and throws an exception otherwise. To help the caller deal with failures, the Cookbook class also defines getErrorMessage( ) and printErrorMessage( ), utility routines that return the error message as a string or print it to System.err.
// Cookbook.java - library file with utility routine for connecting to MySQL package com.kitebird.mcb; import java.sql.*; public class Cookbook { // Establish a connection to the cookbook database, returning // a connection object. Throws an exception if the connection // cannot be established. public static Connection connect ( ) throws Exception { String url = "jdbc:mysql://localhost/cookbook"; String user = "cbuser"; String password = "cbpass"; Class.forName ("com.mysql.jdbc.Driver").newInstance ( ); return (DriverManager.getConnection (url, user, password)); } // Return an error message as a string public static String getErrorMessage (Exception e) { StringBuffer s = new StringBuffer ( ); if (e instanceof SQLException) // JDBC-specific exception? { // print general message plus any database-specific message s.append ("Error message: " + e.getMessage ( ) + " "); s.append ("Error code: " + ((SQLException) e).getErrorCode ( ) + " "); } else { s.append (e + " "); } return (s.toString ( )); } // Get the error message and print it to System.err public static void printErrorMessage (Exception e) { System.err.println (Cookbook.getErrorMessage (e)); } }
The routines within the class are declared using the static keyword, which makes them class methods rather than instance methods. That's because the class is used directly rather than by creating an object from it and invoking the methods through the object.
To use the Cookbook.java file, compile it to produce Cookbook.class, then install the class file in a directory that corresponds to the package identifier. This means that Cookbook.class should be installed in a directory named com/kitebird/mcb (or comkitebirdmcb under Windows) that is located under some directory named in your CLASSPATH setting. For example, if CLASSPATH includes /usr/local/apache/lib/java under Unix, you could install Cookbook.class in the /usr/local/apache/lib/java/com/kitebird/mcb directory. (See Recipe 2.2 for more information about the CLASSPATH variable.)
To use the Cookbook class from within a Java program, you must first import it, then invoke the Cookbook.connect( ) method. The following test harness program, Harness.java, shows how to do this:
// Harness.java - test harness for Cookbook library class import java.sql.*; import com.kitebird.mcb.Cookbook; public class Harness { public static void main (String[ ] args) { Connection conn = null; try { conn = Cookbook.connect ( ); System.out.println ("Connected"); } catch (Exception e) { Cookbook.printErrorMessage (e); System.exit (1); } finally { if (conn != null) { try { conn.close ( ); System.out.println ("Disconnected"); } catch (Exception e) { String err = Cookbook.getErrorMessage (e); System.out.println (err); } } } } }
Harness.java also shows how to use the error message routines from the Cookbook class when a MySQL-related exception occurs. printErrorMessage( ) takes the exception object and uses it to print an error message to System.err. getErrorMessage( ) returns the error message as a string. You can display the message yourself, write it to a log file, or whatever.
2 5 Issuing Queries and Retrieving Results
|