Hack 92. Map Wardriving (and other!) Data with MapServer
Use shapelib and a little script to convert wardriving data to shapefiles, and then map them!
I collect a lot of wardriving data from just about every commute, cruise, and highway adventure I take. Naturally, I want to plot my findings and publish them on the Web. MapServer provides a great way to do this, but the wardriving data first needs to be converted to something that MapServer can display. We can do this with a quick script.
8.7.1. The Shapefile of Things to Come
Shapelib is a library for handling ESRI Shapefiles, written by Frank Warmerdam. It includes utility programs that allow us to create and dump the contents of shapefiles and their matching database files from the command line. The program is available at http://shapelib.maptools.org/. There is a binary version for Windows and an RPM for Shapelib available at http://freegis.org/, which also features other neat GIS-related software and links.
If you decide to build from source, the instructions suggest editing the Makefile to suit your environment, but I was able to decompress the file and successfully compile the library without manual configuration under FreeBSD, Mac OS X, and Fedora Core 2. Type make to compile the library and the associated binaries. This will create the utility programs shpcreate, shpadd, dbfcreate, dbfadd, shpdump, and dbfdump.
To create a shapefile, you need to generate .shp, .shx, and .dbf files. For each item in your input data, you need to add the latitude and longitude to the .shp and .shx files, and add any associated attribute data to the .dbf file. For wardriving data, attributes might take the form of the ESSID, or name, of the networks that you find, or the time that you recorded the network's location. It's very important to line up the data properly, as the order of the entries is what links a single record across the .dbf and .shp/.shx files.
In practice, you can accomplish this by using the shpcreate command to specify the filename to create (without extension) and the type of shapefile to create. Use dbfcreate to create the .dbf file and specify the attribute(s) that you want to associate with this shapefile:
$ shpcreate myshape point
$ dbfcreate myshape -s name 24
This example creates the files myshape.shp, myshape.shx, and myshape.dbf. The .dbf file contains one attribute, a string field called name that is 24 characters wide.
Then walk through your wardriving datafile and add points to the shapefile and attributes to the .dbf for each network you found:
$ shpadd myshape -122.09089 37.45894
$ dbfadd myshape "NoCat"
$ shpadd myshape -122.15489 37.49283
$ dbfadd myshape "QueenFleeWee"
You can check the shapefile with the shpdump and dbfdump commands that come with Shapelib:
$ shpdump myshape
Shapefile Type: Point   # of Shapes: 2

File Bounds: ( -122.155, 37.459,0,0)
         to  ( -122.091, 37.493,0,0)

Shape:0 (Point)  nVertices=1, nParts=0
  Bounds:( -122.091, 37.459, 0, 0)
      to ( -122.091, 37.459, 0, 0)
     ( -122.091, 37.459, 0, 0)

Shape:1 (Point)  nVertices=1, nParts=0
  Bounds:( -122.155, 37.493, 0, 0)
      to ( -122.155, 37.493, 0, 0)
     ( -122.155, 37.493, 0, 0)

$ dbfdump myshape
name
NoCat
QueenFleeWee
MapServer more or less requires you to provide the geographic extents of the area you want to map, in the order minimum x, minimum y, maximum x, maximum y. If you were creating a map file from the previous example, you would take your extents from the File Bounds section, and your extent line would look like this:
EXTENT -122.155 37.459 -122.091 37.493
You can also check a shapefile's extents with ogrinfo, a utility included with the GDAL library. The command ogrinfo -al -summary myshape will show the metadata from all layers in a shapefile. See [Hack #68] for more on the OGR and GDAL utilities.
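If you would rather not copy the numbers by hand, the bounds can be pulled out of the shpdump output with a little awk. This is only a sketch: the function name extent_from_bounds is made up for illustration, and it assumes the File Bounds layout shown in the shpdump listing (two corner tuples joined by "to"), which may differ between Shapelib versions.

```shell
# extent_from_bounds (hypothetical helper): read shpdump output on stdin
# and print a MapServer EXTENT line in the order minx miny maxx maxy.
extent_from_bounds() {
    awk '
        /File Bounds:/ { buf = ""; grab = 1; sub(/.*File Bounds:/, "") }
        grab {
            # Collect text until both corner tuples have been seen.
            buf = buf " " $0
            tmp = buf
            gsub(/[(),]/, " ", tmp)
            n = split(tmp, v, " ")
            # v holds: minx miny minz minm "to" maxx maxy ...
            if (n >= 7) {
                printf "EXTENT %s %s %s %s\n", v[1], v[2], v[6], v[7]
                exit
            }
        }
    '
}

# Usage (needs Shapelib installed):
#   shpdump myshape | extent_from_bounds
```

The output can be pasted straight into a map file. If your version of shpdump prints its bounds differently, adjust the field positions accordingly.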
8.7.2. From Wardriving to Shapefiles
Netstumbler, the network scanning software I use, supports a logging format called WiScan. WiScan is a tab-delimited ASCII text format with a simple header, which our script will ignore. You can experiment on your own with handling the other parameters in the file:
# $Format: wi-scan summary with extensions
# Latitude Longitude ( SSID ) Type ( BSSID ) Time (GMT) [ SNR Sig Noise ] # ( Name ) Flags Channelbits BcnIntvl
# $DateGMT: 2004-09-29
N 37.418873 W 121.064453 ( belkin54g ) BSS ( 00:11:50:0a:91:85 ) 22:10:23 (GMT) [ 22 22 0 ] # ( ) 0011 0040 0
N 37.371231 W 122.044531 ( NETGEAR ) BSS ( 00:09:5b:e9:5d:54 ) 22:10:03 (GMT) [ 25 25 0 ] # ( ) 0001 0040 0
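Before writing any conversion code, it helps to see how awk's default whitespace splitting numbers the pieces of a WiScan line. Here is a small sketch for experimenting; show_fields is a name invented here, not part of NetStumbler or Shapelib.

```shell
# show_fields (hypothetical helper): for every non-comment, non-blank line
# on stdin, print each whitespace-separated field as "number=value".
show_fields() {
    awk '!/^#/ && NF {
        for (i = 1; i <= NF; i++) printf "%d=%s\n", i, $i
        print "--"    # separator between input lines
    }'
}

# Usage:
#   show_fields < my_wiscan.txt
```

With this splitting, the latitude lands in field 2, the longitude in field 4, and a one-word SSID in field 6. An SSID that contains spaces is spread over several fields, which is worth keeping in mind when you pick fields by number.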
Here is an extremely simplistic script to create a shapefile from a WiScan file. The shpadd program uses negative numbers for West longitude and South latitude. Unfortunately, the WiScan format specifies GPS coordinates with N and W, instead of + and -. This script ignores the directional letters completely and assumes northern latitude and western longitude, so it will only work in my part of the world:
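#!/bin/sh
# Create an empty point shapefile and a .dbf with one 24-character string field.
shpcreate wiscan point
dbfcreate wiscan -s name 24
# Turn each data line into "shpadd ... ; dbfadd ..." and hand the commands to sh.
grep -v "^#" "$1" | awk '{print "-" $4 " " $2 " ; dbfadd wiscan \"" $6 $7 $8 $9}' | awk -F\) '{print "shpadd wiscan " $1 "\""}' | sh

$ sh wiscan my_wiscan.txt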
The script creates the .shp, .shx, and .dbf files and then it uses grep and awk to create a series of shell commands that it then pipes to the shell, which executes the commands to add the location of each wireless node to the shapefile wiscan.
You can see what the script is doing by copying this fragment into the file demo:
grep -v "^#" "$1" | awk '{print "-" $4 " " $2 " ; dbfadd wiscan \"" $6 $7 $8 $9}' | awk -F\) '{print "shpadd wiscan " $1 "\""}'

$ sh demo my_wiscan.txt
The first line runs grep against the file you enter on the command line and removes all the comments from the file (comments are lines that start with #). It then sends the cleaned-up file to the awk command:
awk '{print "-" $4 " " $2 " ; dbfadd wiscan \"" $6 $7 $8 $9}'
awk treats the text within single quotes as a command. The braces { } tell awk to apply the rest of the command to each line in the file. awk then prints a minus sign followed by the fourth field in the line, which is the longitude. This is where the longitude is hardcoded as West longitude. It then prints a space and then the second field, the latitude. Then it prints the string ; dbfadd wiscan " (note the opening double quote), followed by fields 6 through 9 run together:
-122.15488 37.49284 ; dbfadd wiscan "QueenFleeWee)BSS(
That is getting close to what we want, but we're not quite there:
awk -F\) '{print "shpadd wiscan " $1 "\""}'
This line takes the output of the previous line from the script and creates our final command:
shpadd wiscan -122.15488 37.49284 ; dbfadd wiscan "QueenFleeWee"
It does this by printing shpadd wiscan, followed by the first "field" that was passed in, and then a closing double quote. The default field separator is whitespace, but you can use -F to set the field separator. In this case, we set it to ), so we get everything up to the )BSS(.
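If you wardrive outside the northwestern quarter of the globe, the same idea can be extended to honor the hemisphere letters. The following is a sketch rather than a drop-in replacement: wiscan_to_commands is a hypothetical name, it assumes the field layout of the sample file shown earlier, and it keeps the shapefile base name wiscan. Because SSIDs are broadcast by strangers and the generated text is fed to a shell, the sketch also replaces any character outside a conservative set with an underscore.

```shell
# wiscan_to_commands (hypothetical helper): read WiScan text on stdin and
# print one "shpadd ... ; dbfadd ..." command line per network on stdout.
wiscan_to_commands() {
    awk '
        /^#/ || NF < 6 { next }    # skip comments, blank and truncated lines
        {
            # South latitudes and West longitudes become negative numbers.
            latsign = ($1 == "S") ? "-" : ""
            lonsign = ($3 == "W") ? "-" : ""

            # The SSID is whatever sits between the first "( " and " )".
            ssid = $0
            sub(/^[^(]*\( ?/, "", ssid)
            sub(/ ?\).*$/, "", ssid)

            # SSIDs are untrusted input that will reach a shell, so replace
            # anything outside a conservative character set with "_".
            gsub(/[^A-Za-z0-9_ .:-]/, "_", ssid)

            printf "shpadd wiscan %s%s %s%s ; dbfadd wiscan \"%s\"\n", lonsign, $4, latsign, $2, ssid
        }
    '
}

# Usage: preview the commands first, then pipe them to sh once
# shpcreate and dbfcreate have been run:
#   wiscan_to_commands < my_wiscan.txt
#   wiscan_to_commands < my_wiscan.txt | sh
```

Unlike the field-number approach, this version keeps SSIDs that contain spaces in one piece, although an SSID containing a ) will still be cut short at that character.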
8.7.3. Map This Shapefile!
Throw the shapefiles into your data directory [Hack #91], and then add a layer to an existing map file:
layer
  name wiscan
  type point
  status default
  data "wiscan.shp"
  labelitem "name"
  class
    label
      size tiny
      color 255 0 0
    end
    color 0 0 255
  end
end
This layer contains only simple parameters, enough to get the layer to display. Note the labelitem entry and the label block inside the class. These control the label display. In our case, the label text is the SSID, which MapServer reads from the name field of the .dbf file. The label will be red (R = 255, G = 0, B = 0), and the dot marking the station will be blue (R = 0, G = 0, B = 255).
A thorough reference to map-file syntax can be found in the MapServer documentation (http://mapserver.gis.umn.edu).
8.7.4. Hacking the Hack: Adding Aerial Photographs
Once you've set up MapServer and have a few shapefiles going, you might want to use aerial photographs in your maps. There is an easy way to do this, but there are a couple of pitfalls. Once you have the right things in place, however, it just works.
The U.S. government produces aerial and satellite Earth imagery as Digital Orthophoto Quadrangles (DOQs) for various purposes. To make these images, photos of a given area are taken from satellites or from airplanes carrying cameras. The images are then homogenized and rectified so that each of them is georeferenced correctly. Some states, counties, and universities distribute this data, usually in the GeoTIFF format.
8.7.5. GeoTIFF
Aerial photographs usually come as GeoTIFFs, which are simply standard TIFF bitmap images with an additional file called a world file, which has a .wld or .tfw file extension. This world file is a text file that contains the georeferencing information used to align and position the image properly. [Hack #33] explains how world files work and how they can be generated.
For this hack, we'll use the aerial photographs from Pittsburgh, Pennsylvania, where I live. I'll also build upon the existing map file from [Hack #91]. All of the spatial data comes from the same place, PASDA (ftp://pasda.cac.psu.edu). The specific aerial photographs are located in ftp://pasda.cac.psu.edu/pub/pasda/doq, and we'll use wget to fetch them. Make sure your current working directory is your data directory, fetch the files, and then unzip them there:
$ wget 'ftp://pasda.cac.psu.edu/pub/pasda/doq/pittsburgh*'
Be warned, each of the eight files is 40 MB in size and extracts to a 45 MB TIFF. It might be useful to convert the images into JPEGs or another high-compression format. You can do this with gdal_translate. ([Hack #68] should get you started, and for more information, check GDAL's documentation, which is available at http://www.gdal.org.)
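In the same generate-then-pipe spirit as the wardriving script, the conversion can be sketched as a helper that prints one gdal_translate command per image. The function name jpeg_commands and the quality setting of 85 are assumptions made for illustration; -of JPEG selects the output format, and the JPEG driver's WORLDFILE=YES creation option writes a world file next to each JPEG so the georeferencing is not lost. File names containing double quotes are not handled.

```shell
# jpeg_commands (hypothetical helper): print a gdal_translate command for
# each .tif file named on the command line; nothing is executed here.
jpeg_commands() {
    for tif in "$@"; do
        printf 'gdal_translate -of JPEG -co QUALITY=85 -co WORLDFILE=YES "%s" "%s"\n' \
            "$tif" "${tif%.tif}.jpg"
    done
}

# Usage: review the commands, then pipe them to sh to run them
# (needs GDAL installed):
#   jpeg_commands *.tif
#   jpeg_commands *.tif | sh
```

If you do convert the images, remember to point any later steps at the .jpg files instead of the .tif files.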
8.7.6. Create a Tile Index
Initially, you could add each of these GeoTIFFs as its own unique layer. To make things easier and treat all of the aerial photographs as one layer, we can use gdaltindex to create a tile index, a shapefile containing the file locations of the images and the spatial locations where each of the images lies. Run as follows; gdaltindex will create the shapefile called aerial and add all of the images.
gdaltindex aerial.shp *.tif
8.7.7. Create a Map File
Now that you have created this tile index, you can reference it in your map file through tileindex and tileitem. tileindex points to the shapefile we created, while tileitem specifies the column containing the file locations. Again, MapServer's web site explains all of this in more detail.
Throw this in your map file as the first layer. This will place the aerial photographs on the bottom and draw the shapefiles on top:
layer
  name aerial_photos
  type raster
  status default
  tileindex "aerial"
  tileitem "location"
end
8.7.8. Projection Issues
In a sensible world, our shapefiles and GeoTIFFs would use the same coordinate system, and we would be done at this stage. Unfortunately, even though our data comes from the same place, the wardriving points are in latitude/longitude degrees while the DOQs are in UTM meters, so the two are being plotted thousands of miles apart (or more).
Fortunately, MapServer has the enormously powerful ability to reproject both raster and vector layers for us on the fly. First, we define a projection section at the top level of our map file as follows:
projection
  "+proj=utm"
  "+zone=17"
end
Then we can add a similar projection section within each of our latitude/longitude layers, like so:
projection
  "+proj=latlong"
end
The combination of the two will prompt MapServer to dynamically reproject those layers into UTM Zone 17, as needed. Meanwhile, the DOQs, which were already in a UTM projection, will remain unchanged, and everything should magically line up. Of course, before that happens, we will first have to adjust the default extents of our map file, since our coordinate system has changed from degrees to eastings and northings in meters. In this endeavor, the proj utility from the PROJ.4 toolkit is invaluable for converting lat/long extents to UTM and back. The token effort required to reproject the map extents by hand is more than handsomely repaid by the utter ease with which we can subsequently drop in new data sets from any source projection and have them just work.
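The extent conversion can be sketched with two small helpers wrapped around proj, which reads "longitude latitude" pairs on standard input and writes "easting northing" pairs. Both function names are invented for this example, and the proj parameters are simply the ones from the map file's projection section. The coordinates in the usage comment are placeholders, not the real Pittsburgh bounds. Projecting only two corners gives an approximate box, which is usually good enough for a default extent.

```shell
# extent_corners (hypothetical helper): given MINX MINY MAXX MAXY, print the
# lower-left and upper-right corners, one "x y" pair per line.
extent_corners() {
    printf '%s %s\n%s %s\n' "$1" "$2" "$3" "$4"
}

# corners_to_extent (hypothetical helper): read two "x y" lines
# (lower-left, then upper-right) and print a MapServer EXTENT line.
corners_to_extent() {
    awk 'NR == 1 { minx = $1; miny = $2 }
         NR == 2 { printf "EXTENT %s %s %s %s\n", minx, miny, $1, $2 }'
}

# Usage with placeholder lat/long bounds (needs PROJ.4 installed):
#   extent_corners -80.1 40.3 -79.8 40.5 | proj +proj=utm +zone=17 | corners_to_extent
```

The resulting EXTENT line, now in meters, replaces the degree-based one at the top of the map file.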
Hacking MapServer map files is an art, but this is a good start!
Drew from Zhrodague