Python/PostgreSQL Interface Architecture

Client 3Query Processing

To execute a SQL command with Python's DB-API, you must first create a cursor. Don't confuse this cursor with a cursor created by PostgreSQL's DECLARE CURSOR command; they have some similarities, but they are certainly not the same thing, as you will see in this section.

You create a cursor object by calling a connection's cursor() function[1]. For example, if you have a connection named connect, you would create a cursor like this:

[1] It is possible, but extremely unlikely, that a call to connect.cursor() can throw a pgOperationalError exception. In fact, the only way that can happen is if somebody is messing around with the internals of a connection object; and we would never do that, would we?

cur = connect.cursor()

Notice that the cursor() function expects no arguments. You can create multiple cursor objects from the same connection; they operate independently, except that a commit() or rollback() executed on the connection will affect all cursors open on that connection.

The next client application (client3.py) shows the steps required to create a cursor, execute a command, and print the results (see Listing 17.3).

Listing 17.3. client3.pymain()

1 #!/usr/bin/python 2 # 3 # File: client3.py 4 5 import pgdb 6 import string 7 8 ############################################################ 9 def main( ): 10 try: 11 connection = pgdb.connect( database = "movies", 12 user = "bruce", 13 password = "cows" ) 14 15 except Exception, e: 16 print str( e ) 17 exit 18 19 cur = connection.cursor() 20 21 try: 22 cur.execute( "SELECT * FROM customers" ) 23 process_results( cur ) 24 25 except StandardError, e: 26 print str( e ) 27 28 cur.close() 29 connection.close() 30 exit

Listing 17.3 shows the main() procedure from client3.py. It starts by calling pgdb.connect() to establish a connection to the movies database. Lines 15 through 17 take care of any exceptions thrown by pgdb.connect(). I'll take a shortcut here by defining a single exception handler that can catch proper DB-API exceptions as well as the (apparently) erroneous exception thrown by the PostgreSQL interface.

At line 19, main() creates a new cursor object by calling connection.cursor(). It is very unlikely that this call to cursor() will fail, so we won't bother catching any exceptions. If cursor() does fail, Python will print a stack trace and an error message and abort the application.

Next, main() uses the cursor.execute() function to execute a simple SELECT command. If something goes wrong with this command, execute() will throw an exception. The text of the error message is encapsulated in the exception parameter (specifically, e.args). If the command completes without error, main() calls the process_result() function (see Listing 17.4) to display the result set.

After you have finished with the cursor object, you should close it by calling cur.close(). This is not strictly required because Python closes this object for you during garbage collection, but it's usually a good idea.

You should also close the connection object when you are done with it. Even though you can ignore the cursor.close() function, you should get into the habit of closing connection objects. In fact, before you call connection.close(), you should call connection.commit(). Why? Because the PostgreSQL DB-API interface does not run in "auto-commit" mode. When you first call pgdb.connect() to establish a connection, the connect() function silently executes a BEGIN command for you. That means that all commands that you execute belong to a single multistatement transaction until you either connection.commit() or connection.rollback(). If you fail to commit before you close a connection, any changes made in the most recent transaction are rolled back. Watch out for this-it will bite you if you aren't careful.

Now, let's look at the process_results() function (see Listing 17.4). This function is responsible for formatting and displaying the result of the SELECT command. We don't actually do any of the grunt work in process_results(); instead, I've factored the details into three helper functions.

Listing 17.4. client3.pyprocess_results()

32 ############################################################ 33 def process_results( cur ): 34 35 widths = [] 36 rows = cur.fetchall() 37 cols = cur.description 38 39 compute_widths( cur, widths, rows, cols ) 40 print_headers( cur, widths, cols ) 41 print_values( cur, widths, rows )

process_results() starts by defining an (empty) array that holds the display width for each column in the result set. I'll pass this array to the helper functions, so I'll define it here.

Next, process_results() calls the cursor.fetchall() function to retrieve all rows from the result set. The cursor.fetchall() function returns a sequence of sequences[2]. Each member of this sequence represents a single row. So, to get to the second column in the third row, you would use the following:

[2] If you're not familiar with Python, think of a "sequence of sequences" as "an array of arrays" or maybe as a "list of lists." They are not completely analogous, but close enough to understand that fetchall() returns a collection of collections.

print rows[2][1] # sequence indexes start at 0, not 1

Besides cursor.fetchall(), there are two other functions that return all or part of a result set. The cursor.fetchone() function fetches the next row in a result set. fetchone() returns a sequence or returns None if you have exhausted the result set. The cursor.fetchmany( [size=n] ) function returns the next n rows in the result set. If you omit the size parameter, fetchmany() will assume that n=5. If there are fewer than n rows remaining in the result set, fetchmany() will return all remaining rows. If the result set has been exhausted, fetchmany() will return None. Like fetchall(), fetchmany() returns a sequence of one or more sequences.

Notice that there is no way to go backward in the result set. You can't re-fetch a row after you have gone past it, nor can you "rewind" the result set to the beginning. If you need to move around in the result set, use fetchall() or declare a PostgreSQL cursor (not a DB-API cursor) and execute the FETCH commands yourself.

After you have retrieved all the rows in the result set, process_results() nabs the column metadata from cursor.description. Notice that cursor.description is a public data member, not a function. cursor.description is a list of seven-element lists. Table 17.2 shows the meaning of each sublist.

Table 17.2. cursor.description Metadata Values

Element

Meaning

0

Column name

1

Data type

2

Maximum display size

3

Server size (in bytes)

4

Precision (not used)

5

Scale (not used)

6

Null allowed? (not used)

Currently, the PyGreSQL DB-API implementation does not use the last three elements in the table (precision, scale, and null allowed?); they are always set to None. The data type member does not conform to the DB-API specification, but it's probably more useful that way. Data types are reported by their PostgreSQL names (char, oid, float4, and so on). The display size and server size elements are set to -1 for any variable-sized columns.

We will be using the column names a little later, so we store them in the local variable cols.

Now that you have access to the data (rows) and the metadata (cols), process_results() calls each of the helper functions in the right order. compute_widths() computes the width of each column name, storing the result in the widths array (see Listing 17.5). Next, print_headers() prints column headings. Finally, print_values() prints the entire result set.

Listing 17.5. client3.pycompute_widths()

43 ############################################################ 44 def compute_widths( cur, widths, rows, cols ): 45 46 c = 0 47 48 for col in cols: 49 widths.append( len( col[0] )) 50 c = c + 1 51 52 r = 0 53 54 for row in rows: 55 c = 0 56 57 for col in row: 58 if( len( str( col )) > widths[c] ): 59 widths[c] = len( str( col )) 60 c = c + 1 61 r = r + 1

The compute_widths() function computes the width of each column in the result set.

It starts by walking through the list of column names and appending the length of each name to the widths[] array. Remember, the caller (process_results()) gave you a complete metadata array in the cols parameter. Element 0 of each metadata list is the column name.

Next, compute_widths() has to find the widest value in each column of the result set. The caller gave it a list of all the rows in the result set in the rows parameter. As it processes each column in each row of the result set, compute_widths() increases the corresponding element in the widths[] array to its maximum required width.

Notice (in lines 58 and 59) that this function converts each data value into string form before calling the len() function. The result set can contain integer values, string values, float values, and so on. You can't invoke the len() function on a numeric value so you must convert them into string form first.

You can view the actual Python data types using the type() function:

>>> cur.execute( "SELECT * FROM pg_class" ) >>> c = 0 >>> for col in cur.fetchone(): ... print cur.description[c][0], ' ', col, ' ', type(col) ... c = c+1 ... relname pg_type reltype 71L relowner 1 relam 0L relfilenode 1247L relpages 2 reltuples 143.0 reltoastrelid 0L reltoastidxid 0L relhasindex 1 relisshared 0 relkind r relnatts 17 relchecks 0 reltriggers 0 relukeys 0 relfkeys 0 relrefs 0 relhasoids 1 relhaspkey 0 relhasrules 0 relhassubclass 0 relacl None

Listing 17.6 shows the print_headers() function.

Listing 17.6. client3.pyprint_headers()

63 ############################################################ 64 def print_headers( cur, widths, cols ): 65 66 c = 0; 67 68 for col in cols: 69 print string.center( col[0], widths[c] ), 70 c = c + 1 71 print 72 73 c = 0; 74 75 for col in cur.description: 76 print '-' * widths[c], 77 c = c + 1 78 print

print_headers() centers each column name within the width calculated by compute_widths(). You may have noticed that I have a dangling comma at the end of line 69 (and again at the end of line 76). Those aren't typosa dangling comma suppresses the new-line character that print would otherwise emit. I want all the column names to appear on the same line, so I'll suppress all new-lines until you get to line 71 (or 78 in the case of the second loop).

Following the column names, print_headers() prints a line of separator characters (hyphens). When you apply the multiply operator (*) to a string, as in line 76, the result is a string of repeated characters. I'll create the separator strings my "multiplying" a dash by the width of each column.

Listing 17.7 shows the remaining code in client3.py. The print_values() function loops through each row and column in the result set (rows). At line 89, convert each value to string form, left-justify it within the proper column, and print it.

Listing 17.7. client3.pyprint_values() and mainline

80 ############################################################ 81 def print_values( cur, widths, rows ): 82 83 r = 0 84 85 for row in rows: 86 c = 0 87 88 for col in row: 89 print string.ljust( str(col), widths[c] ), 90 c = c + 1 91 r = r + 1 92 print 93 94 95 ############################################################ 96 97 main()

The mainline code (that is, the entry point for your client application) is at line 97we just call the main() function and exit when main() returns.

Now, run this application:

$ chmod a+x client3.py $ ./client3.py id customer_name phone birth_date -- -------------------- -------- ---------- 1 Jones, Henry 555-1212 1970-10-10 2 Rubin, William 555-2211 1972-07-10 3 Panky, Henry 555-1221 1968-01-21 4 Wonderland, Alice N. 555-1122 1969-03-05 7 Grumby, Jonas None 1984-02-21

At this point, you know how to connect to a PostgreSQL server from Python, how to intercept errors, and how to process SELECT commands. In the next section, we'll develop an interactive command processor using Python and the Tkinter GUI module.

Категории