Hack 44. Plot Statistics Against Shapes

Easily render demographic maps in SVG from shapes and CSV or Excel files.

In late 2003, we helped a bit with tech support on the Matt Gonzalez mayoral election campaign in San Francisco. Among the many different groups of volunteers were a couple of academic GIS specialists. Adorning the walls of the Decision Support Room were fascinating maps showing statistics plotted against voting precincts: voter turnout, voter registration rates, and first- and second-choice votes in the previous round of the election and in previous elections. These maps were used to identify and target wards where "Get Out the Vote" and other canvassing initiatives would have the best effect on our candidate's chances.

The maps were produced by the high priesthood of GIS: crafted in ArcInfo and Illustrator, and exported to PDF for viewing on the Web. That workflow made real-time analysis of exit-poll data difficult. Well, we thought, the data is in the public domain, and we can make analysis maps with free software for ourselves. Now so can you, using the Perl module SVG::Shapefile, available from CPAN (http://search.cpan.org/).

A demo that accompanies the module and shows what it can do is available at http://locative.us/indymapper/.

4.11.1. The Web Interface

Indymapper allows you to take your own shapefiles and spreadsheets (in CSV or Excel format, or the .dbf format that comes with shapefiles) and make maps out of your data via a simple web interface.

Shapefiles (.shp) are a format created by ESRI for storing map data. They're one of the most common kinds of map file you'll find.

We built a web service at http://locative.us/indymapper/ where you can upload your own data and generate SVG maps with PNG screenshots. Indymapper was designed to become a part of Indyvoter, the political-social networking software at http://indyvoter.org/. Figure 4-15 shows the core of the Indymapper map-making interface.

Figure 4-15. Plotting stats against shapes on the Web

You can upload your own datafiles or select from a few sample mapkits, which consist of a shape, a data table, and various views on that table. Once you've selected or created a mapkit, the program looks at the columns in the shapefile's .dbf metadata and the column names from the spreadsheet. It presents you with several lists, with which you can match up the ID of each shape to a key column in your spreadsheet and plot the values of a second column against them. You can plot a range of values (e.g., a continuous numeric sequence that will be shaded between two colors) or a group of values, where you choose a different color for each distinct value in your data.
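The "group of values" mode described above boils down to a lookup from each distinct value to a color. The following is a minimal sketch of that idea in plain Perl; the helper name `assign_palette`, the sample values, and the wrap-around behavior are assumptions made for illustration, not Indymapper's actual code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: give each distinct value in a column its own color,
# taken in order of first appearance from a caller-supplied palette of
# RGB triples.
sub assign_palette {
    my ($values, $palette) = @_;
    my %color_for;
    my $next = 0;
    for my $value (@$values) {
        next if exists $color_for{$value};
        # Wrap around if there are more distinct values than colors.
        $color_for{$value} = $palette->[ $next++ % @$palette ];
    }
    return \%color_for;
}

# Made-up sample column: the leading candidate in each precinct.
my @leaders = ('Candidate A', 'Candidate B', 'Candidate A', 'Tie');
my @palette = ([0, 128, 0], [0, 0, 200], [160, 160, 160]);

my $color_for = assign_palette(\@leaders, \@palette);
for my $value (sort keys %$color_for) {
    printf "%s => rgb(%d,%d,%d)\n", $value, @{ $color_for->{$value} };
}
```

A range of values, by contrast, needs interpolation rather than a lookup, because every number between the minimum and maximum has to map to some shade.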

A choropleth map is a map in which each area is assigned an appropriate shading (value, texture, intensity). Choropleth shading works best for numbers with a consistent scale: the values should be ratios (e.g., "Number of Voters per 100,000 Inhabitants"), not absolute values (e.g., "Number of Voters"). Look carefully at choropleth maps you see in the press; what they appear to show can easily be illusory! Not all data should be uploaded to Indymapper as it stands; if you can normalize it, for example, by turning "Polling place incidents" into "Polling place incidents per 1,000 registered voters," then do so. Absolute values are best represented on a proportional symbol map, in which an icon shrinks or grows according to the mapped value.
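The normalization step is simple arithmetic and is worth doing before you upload anything. Here is a small sketch, assuming you already have the incident count and the number of registered voters for each precinct; the helper name `rate_per` and the sample figures are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: convert an absolute count into a rate per $per units
# of a base population. Returns undef when the base is missing or zero, so
# the caller can leave that shape unshaded instead of dividing by zero.
sub rate_per {
    my ($count, $base, $per) = @_;
    return unless defined $count && defined $base && $base > 0;
    return $count / $base * $per;
}

# Made-up sample rows: precinct ID => [incidents, registered voters].
my %precincts = (
    '2001' => [ 3, 1500 ],
    '2002' => [ 3,  600 ],
    '2003' => [ 0,    0 ],
);

for my $id (sort keys %precincts) {
    my ($incidents, $registered) = @{ $precincts{$id} };
    my $rate = rate_per($incidents, $registered, 1000);
    printf "%s: %s\n", $id,
        defined $rate ? sprintf('%.1f per 1,000', $rate) : 'no data';
}
```

In this sample, two precincts report the same three incidents, but the smaller one has a rate two and a half times higher. That difference is exactly what a map of absolute counts would hide.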

The rest of this hack should give you a clearer idea of what SVG::Shapefile and Indymapper can do, and illustrate how to "roll your own" map with code available from the Perl CPAN repository.

4.11.2. Rolling Your Own

Installing modules from CPAN is easy. As your system's root user, type:

    #> perl -MCPAN -e shell
    cpan> install SVG::Shapefile
    cpan> install DBD::Excel

For more in-depth instructions on installing modules from CPAN, see [Hack #97].

You can use SVG::Shapefile to plot a range of colors: one representing a low value, the other a high value, with a blend to and from white between them. You can also use it to plot specific values using a color palette that you supply. The first technique is the more useful here.

For this hack, we've provided a sample shapefile, voting precincts in San Francisco as of 2003, and an Excel file that has data that maps to it. You should be able to use any freely available political shape you find, if you can find or create statistics that share a key column with it. SVG::Shapefile reads .csv and .xls files using the Perl DBI interface, so you can hook it up directly to an SQL database if you want.

4.11.3. The Code

This script assumes you have the zipped-up shapefile and the Excel file in the same directory you run it from. Get them from http://mappinghacks.com/data/PRECINCTS.zip and http://mappinghacks.com/data/SF_runoff.xls. Type unzip PRECINCTS.zip to extract the contents of the shapefile, and then run this script:

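    #!/usr/bin/perl
    use strict;
    # Only needed if SVG::Shapefile is installed outside the default
    # module path; adjust or remove for your own system.
    use lib qw(/home/jo/indymapper);
    use SVG::Shapefile;

    my $svg = SVG::Shapefile->new(
        ShapeFile   => 'PRECINCTS.shp',
        PolygonID   => 'PRECINCT',        # ID field in the shapefile
        DataFile    => 'SF_runoff.xls',
        KeyColumn   => 'PRECINCTS',       # spreadsheet column matching the IDs
        ValueColumn => 'TURNOUT',         # spreadsheet column to plot
        Colors      => [[0,255,0],[255,0,0]],   # low = green, high = red
    );
    $svg->render('map.svg');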

Reading large shapefiles and converting them to SVG geometry can take some time, so don't worry if the script waits for quite a few seconds before returning.

The PolygonID option specifies the identifier in the shapefile to use as the ID for each shape in the SVG. If it's not supplied, the default OBJECTID from the shapefile will be used. The values for the PolygonID column must match the values in the KeyColumn from the Excel or CSV file.
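Mismatched keys are the most likely reason for a map with blank shapes, so it can help to compare the two ID lists before rendering. The sketch below assumes you have already pulled the shape IDs and the spreadsheet keys into two Perl arrays by whatever means you prefer; the helper name `compare_keys` and the sample IDs are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: compare the IDs found in the shapefile with the keys
# found in the data table, and report the ones that fail to line up.
sub compare_keys {
    my ($shape_ids, $data_keys) = @_;
    my %in_shapes = map { $_ => 1 } @$shape_ids;
    my %in_data   = map { $_ => 1 } @$data_keys;
    my @no_data  = sort grep { !$in_data{$_} }   keys %in_shapes;
    my @no_shape = sort grep { !$in_shapes{$_} } keys %in_data;
    return (\@no_data, \@no_shape);
}

# Made-up sample IDs; note the stray leading zero in the spreadsheet keys.
my @shape_ids = qw(2001 2002 2003);
my @data_keys = qw(2001 2003 02002);

my ($no_data, $no_shape) = compare_keys(\@shape_ids, \@data_keys);
print "Shapes without data:  @$no_data\n";
print "Rows without a shape: @$no_shape\n";
```

The comparison is done on strings, so differences in padding or formatting (a common side effect of spreadsheet export) show up as mismatches rather than being silently ignored.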

If you're making maps from your own shapes and aren't sure which values identify them, ogrinfo -al shapefile.shp is a quick way to see all their metadata.

The KeyColumn and ValueColumn options (as named in the script above) allow you to choose a key that corresponds to each shape and a value that you want to color the shapes with. The Colors option allows you to specify two colors as RGB values: the first color marks the lowest value in the range, and the second color marks the highest. Each color fades to white toward the middle of the range.
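To make the shading concrete, here is one way such a two-color ramp through white could be computed. This is a sketch based on the description above, not the module's actual implementation; the function names `blend` and `ramp_color`, the linear interpolation, and the placement of white at the exact midpoint are all assumptions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Linear blend between two RGB triples; $t runs from 0 (all $from)
# to 1 (all $to). Channels are rounded to the nearest integer.
sub blend {
    my ($from, $to, $t) = @_;
    return [ map { int($from->[$_] + ($to->[$_] - $from->[$_]) * $t + 0.5) } 0 .. 2 ];
}

# Hypothetical ramp: $low at the minimum, white at the midpoint,
# $high at the maximum. Returns an SVG-style "rgb(r,g,b)" string.
sub ramp_color {
    my ($value, $min, $max, $low, $high) = @_;
    my @white = (255, 255, 255);
    # Position of the value within the range, clamped to 0..1.
    my $t = $max > $min ? ($value - $min) / ($max - $min) : 0.5;
    $t = 0 if $t < 0;
    $t = 1 if $t > 1;
    my $rgb = $t < 0.5
        ? blend($low, \@white, $t * 2)
        : blend(\@white, $high, ($t - 0.5) * 2);
    return sprintf 'rgb(%d,%d,%d)', @$rgb;
}

# Same colors as the script above: green for low turnout, red for high.
my ($low, $high) = ([0, 255, 0], [255, 0, 0]);
for my $turnout (20, 35, 50, 65, 80) {    # made-up turnout percentages
    printf "%d%% => %s\n", $turnout, ramp_color($turnout, 20, 80, $low, $high);
}
```

With a ramp like this, shapes near the middle of the range come out pale, so the strongly colored areas are the ones that differ most from a typical precinct.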

The results of running this script are illustrated in Figure 4-16.

Figure 4-16. Map showing voter turnout in San Francisco (note the central "progressive crescent" of political engagement)

You don't have to be a GIS geek to make your own political maps (though you probably still have to be a spreadsheet geek). Hopefully, once you're done with this book, it'll be a no-brainer.
