This post is a tutorial on how to create a cartogram using ScapeToad v1.1. It also describes how to work with a few common GIS file formats. By the end you will have created a cartogram that shows the per-state population of the United States and learned a bit about the DBase and shape file formats. Along the way some simple Python programming will be required. All of the data files used for this tutorial, as well as the Python script, can be found on GitHub.

However, before we start it might be useful to get an idea of how cartograms help to visualize geographic information. Mark Newman’s pages are particularly good for understanding the importance of this data visualization method. Have a look at the 2008 U.S. Presidential Election Results, and also at World Mapper.

To begin the tutorial we will need a shape file that describes the state-by-state geometry of the United States. This can be downloaded from the Census Bureau’s website. On the download page, select “States (and equivalent)”, click “submit”, and then from the 2010 box select the “all in one national file” option. Clicking the download button will give you a zip file with the relevant information in it. Explore the other options to see what additional shape files are available.

Now that you have the zip file downloaded, unpack it. Assuming the zip file was named “tl_2010_us_state10.zip” you should have a single directory with five files in it. Each of the five files has the same base name as the directory itself, but each has its own file extension. For our purposes here we care about the shape file and the DBase file, which have extensions “shp” and “dbf” respectively.
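
If you would rather unpack the archive from Python, the step above can be sketched with the standard-library zipfile module. The function name unpack_shapefile_zip and the destination directory are illustrative assumptions, not something the tutorial or the Census Bureau prescribes.

```python
import os
import zipfile


def unpack_shapefile_zip(zip_path, dest_dir):
    """Extract zip_path into dest_dir and return the sorted file extensions found."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()
    extensions = set()
    for name in names:
        # Directory entries in a zip end with "/"; skip them.
        if name.endswith("/"):
            continue
        extensions.add(os.path.splitext(name)[1].lower())
    return sorted(extensions)
```

Calling unpack_shapefile_zip("tl_2010_us_state10.zip", "tl_2010_us_state10") and checking that ".shp" and ".dbf" appear in the result is a quick way to confirm the download contains the two files this tutorial relies on.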

The shape file itself contains geometric information, and can be thought of as a list of geometric entities, where each item corresponds to a particular state’s geometry. Wikipedia has a write-up worth reading. The detailed technical specification for the file format is here. Arc Explorer and Shape Viewer are two free (as in beer) programs for viewing shape files.
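
To get a feel for the format without a viewer, you can decode the fixed 100-byte header that starts every .shp file. The following is a minimal sketch based on the published header layout (file code 9994, file length in 16-bit words, version, shape type, bounding box). The function name parse_shp_header and the shape of the returned dictionary are my own choices for illustration.

```python
import struct


def parse_shp_header(header):
    """Decode the fixed 100-byte header at the start of a .shp file."""
    if len(header) < 100:
        raise ValueError("a shapefile header is 100 bytes long")
    # The file code and file length are big-endian; the rest is little-endian.
    file_code, = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("not a shapefile: file code is %d" % file_code)
    length_in_words, = struct.unpack(">i", header[24:28])
    version, shape_type = struct.unpack("<ii", header[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    return {
        "file_bytes": length_in_words * 2,  # the length is stored in 16-bit words
        "version": version,
        "shape_type": shape_type,  # 5 means Polygon in the specification
        "bbox": (xmin, ymin, xmax, ymax),
    }
```

To try it on the downloaded data, open the .shp file in binary mode, read the first 100 bytes, and pass them to parse_shp_header.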

The DBase file is a table of properties where, by convention, each row in the table contains the attributes of the item in the shape file with the same index. For example, the 10th shape in the shape file is presumed to have attributes given by the 10th row in the DBase file.
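
The index convention can be illustrated in a few lines of Python. This is only a sketch of the idea: pair_shapes_with_rows is a hypothetical helper, and the shapes and rows passed to it stand in for whatever a real reader would return.

```python
def pair_shapes_with_rows(shapes, rows):
    """Pair the i-th shape with the i-th attribute row."""
    if len(shapes) != len(rows):
        raise ValueError(
            "%d shapes but %d attribute rows" % (len(shapes), len(rows))
        )
    # Nothing but position links the two lists, so order is everything.
    return list(zip(shapes, rows))
```

Because position is the only link between the two files, reordering or dropping rows in the DBase file silently attaches attributes to the wrong shapes. That is why the script later in this post takes care to write rows in the original order.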

In order to create a cartogram with Scape Toad we will have to supply an appropriate DBase file. In this case our DBase file will contain two columns. The first will be the state’s two letter postal abbreviation and the second will be its population. Scape Toad will ignore the first column, but will allow us to create a cartogram using the data in the second column.

Note that DBase files can be opened with Excel for viewing, and that there is also a Python library for manipulating them.
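
If neither Excel nor the library is at hand, the column layout of a DBase file can be read with nothing but the standard library. The sketch below assumes the classic layout: a 32-byte file header holding the record count and header size, followed by 32-byte field descriptors that end with a 0x0D byte. The function name read_dbf_schema is an illustrative choice, and the code makes no attempt to handle memo fields or other extensions.

```python
import struct


def read_dbf_schema(data):
    """Return (record_count, fields) from the raw bytes of a .dbf file.

    Each field is a (name, type_code, length, decimals) tuple.
    """
    # Bytes 4-7 hold the record count, bytes 8-9 the header size.
    record_count, header_size = struct.unpack("<IH", data[4:10])
    fields = []
    offset = 32  # field descriptors start right after the 32-byte file header
    while offset < header_size and data[offset:offset + 1] != b"\r":
        descriptor = data[offset:offset + 32]
        name = descriptor[0:11].split(b"\x00")[0].decode("ascii")
        type_code = descriptor[11:12].decode("ascii")
        length, decimals = struct.unpack("<BB", descriptor[16:18])
        fields.append((name, type_code, length, decimals))
        offset += 32
    return record_count, fields
```

Reading the whole file in binary mode and passing the bytes to read_dbf_schema shows how many rows the table has and which columns it carries, which is handy for confirming column names such as STUSPS10 before writing a script against them.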

At this point you should have the following software installed:

  • Python (to create DBase files) (optional if you got the dbf files from GitHub)
  • dbfpy (to create DBase files) (optional if you got the dbf files from GitHub)
  • Scape Toad (to view shape files and create cartograms)
  • Shape Viewer (to view shape files – slightly better UI than Scape Toad)
  • Excel (to view DBase files) (optional)

Next we’ll create a DBase file that contains the U.S. population data using the following Python script.

#!/usr/bin/env python

from dbfpy import dbf

POP = {
    "CA" : 37691912, "TX" : 25145561, "NY" : 19465197, "FL" : 19057542,
    "IL" : 12869257, "PA" : 12742886, "OH" : 11544951, "MI" : 9876187,
    "GA" : 9815210, "NC" : 9656401, "NJ" : 8821155, "VA" : 8096604,
    "WA" : 6830038, "MA" : 6587536, "IN" : 6516922, "AZ" : 6482505,
    "TN" : 6403353, "MO" : 6010688, "MD" : 5828289, "WI" : 5711767,
    "MN" : 5344861, "CO" : 5116769, "AL" : 4802740, "SC" : 4679230,
    "LA" : 4574836, "KY" : 4369356, "OR" : 3871859, "OK" : 3791508,
    "PR" : 3706690, "CT" : 3580709, "IA" : 3062309, "MS" : 2978512,
    "AR" : 2937979, "KS" : 2871238, "UT" : 2817222, "NV" : 2723322,
    "NM" : 2082224, "WV" : 1855364, "NE" : 1842641, "ID" : 1584985,
    "HI" : 1374810, "ME" : 1328188, "NH" : 1318194, "RI" : 1051302,
    "MT" : 998199, "DE" : 907135, "SD" : 824082, "AK" : 722718,
    "ND" : 683932, "VT" : 626431, "DC" : 617996, "WY" : 568158,
    }

# The backup dbf. We'll need it because we need to
# preserve the state by state order of the rows
# in the new file.
olddb = dbf.Dbf("tl_2010_us_state10-orig.dbf")

# Our new DB file.
newdb = dbf.Dbf("tl_2010_us_state10.dbf", new=True)
newdb.addField(
    ("STATE", "C", 15),
    ("POPULATION", "N", 25, 0),
    )

for old_rec in olddb:
    # STUSPS10 is the key for the two letter state abbreviation
    # in the old file.
    abbrev = old_rec['STUSPS10']

    # Create a new record in our new db file
    # and assign the columns
    new_rec = newdb.newRecord()
    new_rec['STATE'] = abbrev

    if abbrev in POP:
        new_rec['POPULATION'] = POP[abbrev]
    else:
        # Print a message if we cannot find the population
        # for a given record, and fall back to zero.
        print("BAD POP KEY: %s" % abbrev)
        new_rec['POPULATION'] = 0

    new_rec.store()

olddb.close()
newdb.close()

Run the script in the directory where the shape and DBase files are located. Before running it, rename the file “tl_2010_us_state10.dbf” to “tl_2010_us_state10-orig.dbf”. The script reads the renamed original to determine the order in which to write records, and it writes the new file under the original name, because the DBase file used with a particular shape file must have the same base name as the shape file itself. Edit the script to account for any differences in file names.
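
The rename can also be done from Python. Here is a minimal sketch, assuming the file names used in this post. The helper name backup_original_dbf is hypothetical, and the guard against an existing backup is my own addition: without it, a second run could replace the real original with the generated file.

```python
import os


def backup_original_dbf(directory, base_name="tl_2010_us_state10"):
    """Rename <base_name>.dbf to <base_name>-orig.dbf, refusing to clobber a backup."""
    original = os.path.join(directory, base_name + ".dbf")
    backup = os.path.join(directory, base_name + "-orig.dbf")
    if os.path.exists(backup):
        # A backup is already in place; renaming again would destroy it.
        raise RuntimeError("backup already exists: " + backup)
    os.rename(original, backup)
    return backup
```

Call backup_original_dbf(".") once from the data directory before the first run of the script.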

Alternatively, you can skip running the script and download the appropriate DBase file from my GitHub page.

At this point, if you have Excel you might also want to open both the original and new DBase files and see for yourself what is in them.
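
One thing worth looking for in the new file is rows whose population ended up as 0 because their abbreviation had no entry in the script's population table. A small sketch of that check follows. The helper name rows_without_population is hypothetical, and it assumes each row can be indexed by the column names STATE and POPULATION, for example a list of dictionaries.

```python
def rows_without_population(rows):
    """Return the STATE values of rows whose POPULATION is 0."""
    return [row["STATE"] for row in rows if row["POPULATION"] == 0]
```

Any abbreviation this returns corresponds to a shape that will be treated as having no population when the cartogram is computed.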

Now we can fire up Scape Toad. When it comes up, click the “add layer” button in the toolbar. Navigate to the shape file and select it. If the shape file came in correctly you should see something like this on your screen.

Scape Toad Screenshot

Note that the DBF file you created must have the exact same base name as the shape file and that they must both be in the same directory. Otherwise we won’t be able to create a cartogram.
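
This requirement is easy to check before launching the cartogram wizard. The sketch below looks for a .dbf file next to a given .shp file. The name find_companion_dbf is an illustrative assumption, and the check only looks for a lower-case “.dbf” extension.

```python
import os


def find_companion_dbf(shp_path):
    """Return the path of the .dbf that sits next to shp_path, or None if absent."""
    base, extension = os.path.splitext(shp_path)
    if extension.lower() != ".shp":
        raise ValueError("expected a .shp file, got: " + shp_path)
    # Same directory and same base name, differing only in the extension.
    candidate = base + ".dbf"
    return candidate if os.path.isfile(candidate) else None
```

If find_companion_dbf("tl_2010_us_state10.shp") returns None, the DBase file is either missing, misnamed, or in a different directory.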

Next click the “Create cartogram” icon in the toolbar. Click “next”, “next”, and then ensure that POPULATION is selected in the drop down menu. Click “next” again. And again. And then “compute”. Now wait…

After the computation is finished you should see a cartogram that looks something like this on your screen.

Scape Toad Cartogram Screenshot

Unfortunately Scape Toad has no zoom feature, so to get a close-up look at the cartogram you’ll want to export it as a shape file and bring it up in Shape Viewer. The catch is that there you will lose the legend and be left with just the distorted shapes. C’est la vie.

If you have gotten this far then congratulations! You have succeeded in creating a simple cartogram that shows how the population of the United States is spread across its geography.
