Hack 5. Solve a Crossword Puzzle Using SQL

You can use SQL to solve the kinds of pattern-matching riddles that are typical of crossword puzzles. But first you have to load in a dictionary.

Suppose you have a table called words that contains a few thousand words. If you know some characters in some positions, you can use the underscore (_) wildcard. For example, say you are looking for an eight-letter word with the following pattern:

Second letter: a

Fourth letter: l

Seventh letter: o

An underscore means "any character" in LIKE:

mysql> SELECT * FROM words WHERE word LIKE '_a_l_ _o_'; +------+----------+ | id | word | +------+----------+ | 3823 | ballroom | | 3826 | ballyhoo | | 7255 | Carleton | | 7480 | cauldron | +------+----------+ 4 rows in set (0.04 sec)

The ANSI standard allows % and _ as the two wildcards. % is used to represent a string of any length and _ represents any single character. In Access, you use * and ?, respectively.

Here's how to find words in your dictionary that have the same three letters at the beginning and at the end:

mysql> SELECT word FROM words -> WHERE word LIKE CONCAT('%',SUBSTR(word,1,3)) -> AND LENGTH(word) > 3; +---------------+ | word | +---------------+ | Ababa | | antiformant | | booboo | | Einstein | | entertainment | | Giorgio | | Ionicization | | murmur | | Oshkosh | | redeclared | | restores | | restructures | | Tsunematsu | | underground | +---------------+ 14 rows in set (0.09 sec)

 

1.5.1. SQL Server Variation

SELECT word FROM words WHERE (word LIKE '%' + SUBSTRING(word,1,3)) AND LEN(word) > 3

 

1.5.2. Access Variation

SELECT word FROM words WHERE (word LIKE '*' + LEFT(word,3)) AND LEN(word) > 3

 

1.5.3. PostgreSQL Variation

PostgreSQL will accept the ANSI standard syntax:

SELECT word FROM words WHERE word LIKE '%' || SUBSTR(word,1,3) AND LENGTH(word) > 3

 

1.5.4. Filling a Table with Words

To perform word searches you need to build the words table. You can create it with a statement such as CREATE TABLE WORDS (word VARCHAR(255)). If you start with a plain-text file and you want to put it into the database, you have many options. Perhaps the simplest is to "top and tail" each line to make it into an INSERT statement. You need to go from this:

Aarhus Aaron Ababa aback O'Brien

to this:

INSERT INTO words VALUES ('Aarhus') INSERT INTO words VALUES ('Aaron') INSERT INTO words VALUES ('Ababa') INSERT INTO words VALUES ('aback') INSERT INTO words VALUES ('O''Brien')

Notice that the single quote must be "escaped." The name O'Brien becomes O''Brien. The following Perl one-liner will take care of that (you could pipe it into your SQL command-line utility [Hack #1] if you want):

$ perl pe "s/'/''/g;s/.*/INSERT INTO words VALUES ('$&');/" words

This command assumes words is a text file containing a list of words, such as /usr/share/dict/words found on most Linux, Unix, and Mac OS X systems. Various word lists are available from http://wordlist.sourceforge.net.

Another approach is to use a spreadsheet such as Excel to manipulate the data, as shown in Figure 1-2.

Figure 1-2. Using Excel to preprocess SQL

The first column, A, contains the original data from a text file. You can enter this data using the copy and paste tools or by selecting Fileimages/U2192.jpg border=0>Open. Column B uses the SUBSTITUTE function to escape the single quotes:

=SUBSTITUTE(A1,"'","''")

Column C uses the append operator, &, to construct the required SQL INSERT statement:

="INSERT INTO words VALUES ('" & B1 & "');"

When you've copied both formulas down the whole word list, you can copy and paste column C into your SQL command prompt or into a .sql file for later use.

Категории