Hack 88. Load Your Waypoints into a Spatial Database
The powerful PostGIS database lets you make complex spatial queries about where you've been.
Spreadsheets provide power and flexibility in numerical analysis, but once you've used them for a while, you run into limitations and you start looking for a tool on the next rung of the ladder of power. Database systems have been the desktop power user's tool of choice for nearly 20 years.
In [Hack #11]
, we learned how to use a spreadsheet for geospatial analysis. And in [Hack #87], we learned how to set up a PostGIS spatial database and import existing spatial data sets. In this hack, we explore the world of "what if" with a database using GPS waypoints.
In this hack, we'll create a database schema to support waypoints and provide a small script to load waypoints into the database. We'll finish up by exploring some examples that illustrate the power of using a database for geospatial analysis.
8.3.1. Consider Your Database Design
If you ask three database developers about a database design, or schema, you'll get between zero and MAXINT opinions. For the purposes of this hack, let's just assume that you want the simplest useful schema: a table to store waypoints. To be even more simplistic, let's create a table for waypoints that were originally in GPX format. As discussed in [Hack #51], GPX is a simple XML format to store tracklog and waypoint information. Within a GPX file, a waypoint looks like this:
OREILY Old O'Reilly offices CRTD 19:34 31-AUG-00 Waypoint
8.3.2. Create Database and Tables
We need to write some SQL code that will create a waypoint table to match the attribute names used by the GPX format. The one hitch is that GPX uses for a description field, which is an SQL-reserved word, so we'll use descr instead. The GPX format defines additional elements that can exist in the format, but since my Garmin GPS doesn't support those elements, we can ignore them for now.
You can create a new table in an existing database, or create a new database. Recapping from [Hack #87], you can create a new database and add the spatial extensions in PostGIS with these commands:
$ createdb gpswork $ createlang plpgsql gpswork $ psql -d gpswork -f /usr/share/pgsql/postgis.sql $ psql -d gpswork -f /usr/share/pgsql/spatial_ref_sys.sql
Your copies of postgis.sql and spatial_ref_sys.sql might be in another directory.
These SQL statements will create the table to match the GPX format:
create table waypoint ( waypoint_id serial NOT NULL, name varchar(32), cmt varchar(255), descr varchar(255), sym varchar(255)); select AddGeometryColumn('gpswork', 'waypoint', 'location', 4326, 'POINT', 2);
The AddGeometryColumn function also inserts our column into the table geometry_columns. This is crucial to the workings of PostGIS.
If you put these statements into the file create_waypoint.sql, you can create the table from the command line:
$ psql -d gpswork -f create_waypoint.sql
The -d parameter specifies the database, and -f indicates the file containing SQL commands to execute.
8.3.3. Importing Waypoints
We need to create a series of SQL insert statements for our waypoints and tracklogs. I use GPSBabel [Hack #51] to read waypoints from my GPS and write them in GPX format.
The following Perl code will read a GPX file of waypoints and write out a set of SQL insert statements. This code uses the XML::Simple module. Install it from CPAN by typing perl -MCPAN -e "install XML::Simple" from a shell:
#!/usr/bin/perl use XML::Simple; my $gpx = XMLin($ARGV[0] , NormalizeSpace=>1 ); foreach my $wpt (keys %{$gpx->{wpt}}) { $p = $gpx->{wpt}->{$wpt}; print qq( insert into waypoint (name, cmt, descr, sym, location) values ( '$wpt', '$p->{cmt}', '$p->{desc}', '$p->{sym}', GeometryFromText('POINT($p->{lon} $p->{lat})', 4326) ); ); }
|
Save this code into a file (for example, parse_gpx_way.pl) and run it like this:
$ ./parse_gpx_way.pl gpx_waypoint_file.gpx > import.sql
The output of the script, directed into the file import.sql, will now contain a series of SQL insert statements that look like this.
INSERT INTO waypoint (name, cmt, descr, sym, location) VALUES ( 'OREILY', 'CRTD 19:34 31-AUG-00', 'CRTD 19:34 31-AUG-00', 'Waypoint', GeometryFromText('POINT(-122.818544 38.403423)', 4326) );
This is mostly straight SQL. The interesting bit is the location field: GeometryFromText('POINT(-122.818544 38.403423)', 4326). PostGIS stores spatial information in a special "geometry" type.
The 4326 is an OGC Spatial Reference System ID, or SRID, and refers to a record in the spatial_ref_sys table that includes the information about the datum, spheroid, and projection that PostGIS will use when working with this record. [Hack #100] explains the inner workings of the SRID in more detail.
|
We can combine the conversion of waypoints with loading the database using the Unix tool chain:
./parse_gpx_way.pl gpx_waypoints_file.gpx | psql -d gpswork
Congratulations! You now have waypoints loaded into a spatial database. Let's start PostGIS with the command psql gpswork and take a look at what we can do:
gpswork=# select name, cmt, descr from waypoint order by name; name | cmt | descr --------+----------------------+---------------------- 100 | 100 MILES | 100 MILES 588 | CRTD 14:58 06-SEP-04 | CRTD 14:58 06-SEP-04 ABSPOT | CRTD 10:15 08-SEP-01 | CRTD 10:15 08-SEP-01 AC1 | CRTD 15:04 16-AUG-00 | CRTD 15:04 16-AUG-00 ...
8.3.4. Calculating Distances in PostGIS
Let's say we want to calculate the distance from O'Reilly to the toll plaza of the Golden Gate Bridge. This query does it:
select distance_spheroid( GeometryFromText('POINT(-122.8412 38.4112)', 4326), GeometryFromText('POINT(-122.4750 37.8073)', 4326), 'SPHEROID["WGS_1984",6378137,298.257223563]' ) / 1609.344
It returns the absurdly overprecise answer: 46.1856112886502 miles.
PostGIS is incredibly powerful and flexible. Sometimes that means it is harder to do the easy things. In this case, calculating lat/long distances is a tad cumbersome. The distance_spheroid() function returns the distance between two points over an ellipsoidal model of Earth. You need to give the function two points as well as a description in "Well Known Text" format of the shape of the earth, shown here:
SPHEROID["WGS_1984",6378137,298.257223563]
This says that we want to use the spheroid called WGS_1984, which assumes an Earth radius of 6,378,137 meters (3,963 miles) and a flattening factor of about 298 meters.
Here is a query to return the distance between any two waypoints. ORA2 is the waypoint of the new O'Reilly campus, and TOLPLZ is the toll plaza for the Golden Gate Bridge. The query returns the distance in meters, so we divide the answer by 1609.344, which is the number of meters in a mile, to return the result in miles:
gpswork-# select distance_spheroid( w1.location, w2.location, 'SPHEROID["WGS_1984",6378137,298.257223563]' ) / 1609.344 from waypoint w1, waypoint w2 where w1.name = 'ORA2' and w2.name = 'TOLPLZ'; ?column? ------------------ 46.1867561160993 (1 row)
Here's a brief taste of what we can do with waypoints in PostGIS. This query calculates the distance from my house to all of my waypoints:
select distinct w1.name, w2.name, round(distance_spheroid( w1.location, w2.location, 'SPHEROID["WGS_1984",6378137,298.257223563]' ) / 1609.344) as my_dist from waypoint w1, waypoint w2 where w1.name = 'HOME' order by w2.name
The results should look like this (with 496 waypoints you probably don't care about omitted):
name | name | my_dist ------+--------+--------- HOME | 100 | 1032 HOME | 588 | 178 HOME | ABSPOT | 80
8.3.5. Not Quite a Cross Tab Query
Here are two queries that together calculate the distances between multiple waypoints. In this case, they calculate the distance from the first three waypoints to the last two waypoints. These are determined by the limit clauses within the sub-select queries:
create temporary table foo as select w1.name, w2.name as name2, w1.location as l1, w2.location as l2 from waypoint w1, waypoint w2 where w1.name in (select name from waypoint order by name limit 3 ) and w2.name in (select name from waypoint order by name desc limit 2); select name as from , name2 as to, distance_spheroid(l1, l2, 'SPHEROID["WGS_1984",6378137,298.257223563]' )/1607 as dist_miles from foo; from | to | dist_miles --------+-------+------------------ ABSPOT | ZOOSF | 127.989654055968 ABSPOT | ZOO | 1007.02197434096 100 | ZOOSF | 1029.38780724023 100 | ZOO | 93.6502054006563 AC1 | ZOOSF | 799.97009495917 AC1 | ZOO | 273.898428769347 (6 rows)
This is not quite the same as the distance grid created in [Hack #12], but it is the same idea.
You could also replace the sub-selects to specify particular queries. This example extracts the distances between three named waypoints and the last two waypoints by name in the table:
create temporary table foo as select w1.name, w2.name as name2, w1.location as l1, w2.location as l2 from waypoint w1, waypoint w2 where w1.name in ( select name from waypoint where name in ('HOME', 'TOLPLZ', 'MOM') order by name) and w2.name in (select name from waypoint order by name desc limit 2); select name as from, name2 as to, round(distance_spheroid(l1, l2, 'SPHEROID["WGS_1984",6378137,298.257223563]' )/1607) as dist_miles from foo; from | to | dist_miles --------+-------+------------ MOM | ZOOSF | 19 MOM | ZOO | 952 HOME | ZOOSF | 49 HOME | ZOO | 966 TOLPLZ | ZOOSF | 5 TOLPLZ | ZOO | 956
The idea behind these samples is to provide a hint of the power of doing spatial analysis with PostGIS. Many open source GIS viewing and publishing applications support PostGIS as a backend, so it's also an ideal vehicle for sharing your data on the geospatial web; we'll show how to do this with GeoServer and MapServer in the rest of this chapter.