Python Cookbook
Recipe 7.11. Storing a BLOB in a PostgreSQL Database
Credit: Luther Blissett Problem
You need to store a BLOB in a PostgreSQL database. Solution
PostgreSQL 7.2 and later supports large objects, and the psycopg module supplies a Binary escaping function: import psycopg, cPickle # Connect to a DB, e.g., the test DB on your localhost, and get a cursor connection = psycopg.connect("dbname=test") cursor = connection.cursor( ) # Make a new table for experimentation cursor.execute("CREATE TABLE justatest (name TEXT, ablob BYTEA)") try: # Prepare some BLOBs to insert in the table names = 'aramis', 'athos', 'porthos' data = { } for name in names: datum = list(name) datum.sort( ) data[name] = cPickle.dumps(datum, 2) # Perform the insertions sql = "INSERT INTO justatest VALUES(%s, %s)" for name in names: cursor.execute(sql, (name, psycopg.Binary(data[name])) ) # Recover the data so you can check back sql = "SELECT name, ablob FROM justatest ORDER BY name" cursor.execute(sql) for name, blob in cursor.fetchall( ): print name, cPickle.loads(blob), cPickle.loads(data[name]) finally: # Done. Remove the table and close the connection. cursor.execute("DROP TABLE justatest") connection.close( ) Discussion
PostgreSQL supports binary data (BYTEA and variations thereof), but you should be careful when communicating such data via SQL. Specifically, when you use a normal INSERT SQL statement and need to have binary strings among the VALUES you're inserting, you have to escape some characters in the binary string according to PostgreSQL's own rules. Fortunately, you don't have to figure out those rules for yourself: PostgreSQL supplies functions that do all the needed escaping, and psycopg exposes such a function to your Python programs as the Binary function. This recipe shows a typical case: the BYTEAs you're inserting come from cPickle.dumps, so they may represent almost arbitrary Python objects (although, in this case, we're just using them for a few lists of characters). The recipe is purely demonstrative and works by creating a table and dropping it at the end (using a TRy/finally statement to ensure finalization is performed even if the program should terminate because of an uncaught exception). Earlier PostgreSQL releases limited to a few kilobytes the amount of data you could store in a normal field of the database. To store really large objects, you had to use roundabout techniques to load the data into the database (such as PostgreSQL's nonstandard SQL function LO_IMPORT to load a data file as an object, which requires superuser privileges and data files that reside on the machine running the Postgre-SQL Server) and store a field of type OID in the table to be used later for indirect recovery of the data. Fortunately, none of these techniques are necessary anymore: since Release 7.1 (the current release at the time of writing is 8.0), PostgreSQL embodies the results of project TOAST, which removed the limitations on field-storage size and therefore the need for peculiar indirection. Module psycopg supplies the handy Binary function to let you escape any binary string of bytes into a form acceptable for placeholder substitution in INSERT and UPDATE SQL statements. See Also
Recipe 7.10 and Recipe 7.12 for MySQL-oriented and SQLite-oriented solutions to the same problem; PostgresSQL's home page (http://www.postgresql.org/); the Python/PostgreSQL module (http://initd.org/software/psycopg). |