How to geocode thousands of addresses and make a Tableau custom polygon + point map, with a little help from FME

I recently had an opportunity to test the new spatial data connector in Tableau. It is a highly anticipated addition to the long list of connectors, and it allows you to work with ESRI Shapefiles, KML, MapInfo, and GeoJSON files directly in Tableau. The connector can interpret polygon and point entities (no lines as of yet) and is a big step towards making map creation in Tableau a much better experience. This long post is split into 2 parts: the first describes data preparation (which I did in FME), and the second covers handling multiple shapefiles in Tableau to make the map work just the way we want it to.

PART 1 – data preparation

A little background about the project. I want to plot about 30,000 nonprofit organizations on a map of Washington State and classify them by the groups they belong to. The data I started with contained each nonprofit's ID, name, and address (including ZIP code), plus a code that can be related to the desired grouping.

What I wanted to end up with was a table with all orgs along with their lats and longs and corresponding county, congressional district and legislative district. FME to the rescue!

If you haven’t heard of FME yet, it is a program developed by Safe Software. FME stands for Feature Manipulation Engine and it lets you create visual, drag-and-drop workflows to reshape or translate your data. It is similar to Alteryx but focused on spatial data (and a hell of a lot more affordable than Alteryx). FME can be used to transform any type of data but it really shines when applied to tough spatial problems. Our project is nowhere close to being a tough job for FME, so we’ll just be scratching the surface of what it can do.

I created my FME workflow to geocode all locations, assign county, legislative district, and congressional district values, clean up the result, and output it to a shapefile.

The workflow has 3 main sections:

Geolocation

  1. Reads in the spreadsheet with org names and addresses
  2. Reads in a ZIP code latitude/longitude lookup table
  3. Runs addresses through the Bing geocoding transformer
  4. Passes records that Bing failed to geolocate (usually PO Box addresses) to a ZIP code lookup table
  5. Creates geometric points from lats and longs
  6. Removes unnecessary fields.
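Outside FME, the geocode-then-fall-back logic in steps 3 and 4 can be sketched in a few lines of Python. This is only a minimal illustration, not the FME workflow itself: `locate`, `fake_geocoder`, and the sample coordinates are all hypothetical, and a real geocoding service would replace the stand-in function.

```python
from typing import Callable, Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


def locate(
    address: str,
    zip_code: str,
    geocode: Callable[[str], Optional[Coordinates]],
    zip_centroids: Dict[str, Coordinates],
) -> Tuple[Optional[Coordinates], str]:
    """Return coordinates plus a label saying where they came from."""
    coords = geocode(address)
    if coords is not None:
        return coords, "geocoder"
    # The geocoder failed (for example on a PO Box), so try the ZIP lookup table.
    coords = zip_centroids.get(zip_code)
    if coords is not None:
        return coords, "zip_lookup"
    # Keep the record anyway so it can still be counted later.
    return None, "unlocated"


def fake_geocoder(address: str) -> Optional[Coordinates]:
    """Hypothetical stand-in for a real service: fails on PO Box addresses."""
    if address.upper().startswith("PO BOX"):
        return None
    return (47.6, -122.3)  # made-up coordinates for illustration


# Made-up lookup table with a single ZIP code.
zip_centroids = {"98101": (47.61, -122.33)}
```

Keeping the source label alongside the coordinates makes it easy to see later how many points were placed by the geocoder and how many only by their ZIP code.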

Administrative area assignment:

  1. Reads in shapefiles with boundaries of US counties, US congressional districts, and Washington state legislative districts
  2. Extracts WA state entities from the US-wide shapefiles
  3. Renames the “NAME” attribute in each shapefile to the proper name of an administrative area (county, etc.)
  4. Performs an overlay of geolocated points on areas to assign each point the attributes of the areas it is contained in.
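The overlay in step 4 boils down to a point-in-polygon test. Here is a minimal Python sketch of that idea using the classic ray-casting method. It is an assumption-laden illustration rather than what FME does internally: it treats each area as a single ring of (lon, lat) vertices with no holes, and the names `point_in_polygon` and `assign_area` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Ring = List[Tuple[float, float]]  # polygon outline as (lon, lat) vertices


def point_in_polygon(lon: float, lat: float, ring: Ring) -> bool:
    """Ray casting: count how many edges a ray going east from the point crosses."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Only edges that straddle the point's latitude can be crossed.
        if (yi > lat) != (yj > lat):
            x_cross = (xj - xi) * (lat - yi) / (yj - yi) + xi
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def assign_area(lon: float, lat: float, areas: Dict[str, Ring]) -> Optional[str]:
    """Return the name of the first area containing the point, or None."""
    for name, ring in areas.items():
        if point_in_polygon(lon, lat, ring):
            return name
    return None


# Toy example: one square "county" in made-up coordinates.
areas = {"Square County": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]}
```

With this toy data, `assign_area(1.0, 1.0, areas)` returns `"Square County"`, while a point outside the square gets `None`, which is the case to watch for when points fall beyond the state boundary.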

Output:

  1. Combines geolocated points and points for which locations could not be found (points without coordinates will not show on the map but we still want them to be counted)
  2. Cleans up the file by removing extraneous fields
  3. Outputs the final table of points with area assignments to a shapefile.

An additional note about geocoding options in FME.

The software comes with prebuilt access to 13 geolocation services. Some are paid but about half offer at least limited free geocoding. Below I list the free ones with their transaction limits:

Bing (125,000 annually)

FreeGeoIP.net (15,000 per hour)

Google (2,500 per day)

Here (15,000 per month)

IPInfo.io (1,000 per day)

Mapzen (30,000 per day)

OpenCage Data (2,500 per day)

 

Although it’s not implemented in my workflow, FME has Recorder and Player transformers that let you save partial results of your workflow to a file and replay them later in the same or another workflow. This is especially helpful when you geocode a lot of addresses and don’t want to use up your quota by rerunning the full workflow every time you make a change and want to test it.
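The same quota-saving idea can be sketched outside FME as a simple file cache. This is not how Recorder and Player work internally, just an analogous pattern in Python; `geocode_with_cache`, `counting_geocoder`, and the cache file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Optional


def geocode_with_cache(
    addresses: List[str],
    geocode: Callable[[str], Optional[List[float]]],
    cache_path: Path,
) -> Dict[str, Optional[List[float]]]:
    """Geocode addresses, calling the service only for ones not yet cached."""
    cache: Dict[str, Optional[List[float]]] = {}
    if cache_path.exists():
        cache = json.loads(cache_path.read_text())
    for address in addresses:
        # Failed lookups are cached too (as null), so they are not retried.
        if address not in cache:
            cache[address] = geocode(address)
    cache_path.write_text(json.dumps(cache))
    return {address: cache[address] for address in addresses}


calls: List[str] = []


def counting_geocoder(address: str) -> Optional[List[float]]:
    """Hypothetical stand-in that records every call it receives."""
    calls.append(address)
    return [47.6, -122.3]  # made-up coordinates


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "geocode_cache.json"
    first = geocode_with_cache(["123 Main St"], counting_geocoder, cache_file)
    # Second run: only the new address reaches the geocoder.
    second = geocode_with_cache(
        ["123 Main St", "456 Pine St"], counting_geocoder, cache_file
    )
```

After both runs, `calls` holds each address exactly once, so repeated test runs do not eat into the service quota.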

PART 2 – Tableau

The hard part is done, now let’s bring it all together in Tableau.

I joined all 4 shapefiles: the geocoded points on the left and the boundaries of counties, congressional districts, and legislative districts on the right. The join clause in all 3 joins is the name of the admin area; remember that we have admin area assignments in our point shapefile. I used full outer joins to make sure I could display boundaries of admin areas even if there are no points within them. Conversely, if some points fall outside the Washington state boundary, I want to know about it.
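To see why a full outer join matters here, consider a small Python sketch of one such join on the admin area name. It is only an illustration of the join semantics, not of how Tableau implements them; `full_outer_join` and the sample records are hypothetical, and it assumes each area name appears once in the boundary table.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def full_outer_join(
    points: List[Record], areas: List[Record], key: str
) -> List[Dict[str, Optional[Record]]]:
    """Pair each point with its area, keeping unmatched rows from both sides."""
    areas_by_name = {area[key]: area for area in areas}
    matched = set()
    rows: List[Dict[str, Optional[Record]]] = []
    for point in points:
        area = areas_by_name.get(point.get(key))
        if area is not None:
            matched.add(area[key])
        # A point with no matching area is kept, with area set to None.
        rows.append({"point": point, "area": area})
    for name, area in areas_by_name.items():
        # An area with no points is kept too, so its boundary can still be drawn.
        if name not in matched:
            rows.append({"point": None, "area": area})
    return rows


points = [
    {"name": "Org A", "county": "King"},
    {"name": "Org B", "county": None},  # fell outside every boundary
]
counties = [{"county": "King"}, {"county": "Ferry"}]
joined = full_outer_join(points, counties, "county")
```

The result has three rows: Org A matched to King, Org B with no area, and Ferry with no point. An inner join would have kept only the first.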

Connecting the list of points as a shapefile, as opposed to text or Excel, is intentional. It avoids the significant data preparation that would otherwise be needed to create a combined polygon and point map. Tableau’s Alan Eldridge wrote an excellent post about it; see the links in the resources section at the end of this post.

To give users the ability to switch between different administrative areas, we need a parameter and 2 calculated fields:

 

Admin boundaries based on user selection.

Level of detail based on user selection.
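The logic behind these two calculated fields can be sketched in Python. This is only an analogy for a parameter-driven choice, not the Tableau formulas themselves: the option labels and field names below are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical mapping: parameter value -> (geometry field, name field).
FIELDS_BY_SELECTION: Dict[str, Tuple[str, str]] = {
    "County": ("county_geometry", "county"),
    "Congressional District": ("congressional_geometry", "congressional_district"),
    "Legislative District": ("legislative_geometry", "legislative_district"),
}


def admin_boundary(row: Dict[str, str], selection: str) -> str:
    """Pick the boundary geometry that matches the user's selection."""
    geometry_field, _ = FIELDS_BY_SELECTION[selection]
    return row[geometry_field]


def level(row: Dict[str, str], selection: str) -> str:
    """Pick the area name that matches the user's selection."""
    _, name_field = FIELDS_BY_SELECTION[selection]
    return row[name_field]


# One joined row with placeholder geometry values.
row = {
    "county_geometry": "<county polygon>",
    "congressional_geometry": "<congressional polygon>",
    "legislative_geometry": "<legislative polygon>",
    "county": "King",
    "congressional_district": "7",
    "legislative_district": "43",
}
```

Both functions read the same selection, so the boundary drawn and the name used for the level of detail always stay in step.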

PAUSE

Let’s pause here for a moment. You have probably been wondering why we are bringing county and congressional district boundaries in; doesn’t Tableau have them built in? Yes, but we also need legislative districts, which are not built in. Tableau interprets built-in geographic fields (State, County, etc.) as strings and the Geometry field from a shapefile as type geometry. It would not let us mix the two types in the Level calculated field. Hence the need to keep everything consistent with shapefiles.

 

 

END OF PAUSE

To make the map, drag the Geometry field from the nonprofit points shapefile (2015-501c3.shp) to the canvas; Tableau will draw a map of all points. Put the Names field (same data source) on Detail to separate the points into individual marks, which allows a tooltip to display for each individual point.

Duplicate the Longitude (generated) field on Columns to create a dual-axis map, and put the Level and Detail calculated fields on the Detail mark.

Lastly, we need to create nonprofit groups based on the Ntee Cd field and place the group on colour.
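The grouping step can be sketched in Python as well. The post does not say which grouping was used, so this is an assumption: it maps the first letter of the NTEE code to the ten broad NTEE categories. The function name `ntee_group` is hypothetical.

```python
from typing import Dict, Optional

# Assumed grouping: NTEE major-group letters -> ten broad NTEE categories.
BROAD_CATEGORIES = [
    ("A", "Arts, Culture, and Humanities"),
    ("B", "Education"),
    ("CD", "Environment and Animals"),
    ("EFGH", "Health"),
    ("IJKLMNOP", "Human Services"),
    ("Q", "International, Foreign Affairs"),
    ("RSTUVW", "Public, Societal Benefit"),
    ("X", "Religion Related"),
    ("Y", "Mutual/Membership Benefit"),
    ("Z", "Unknown, Unclassified"),
]

# Expand the letter ranges into a one-letter lookup table.
GROUP_BY_LETTER: Dict[str, str] = {
    letter: label for letters, label in BROAD_CATEGORIES for letter in letters
}


def ntee_group(code: Optional[str]) -> str:
    """Return the broad category for an NTEE code such as 'B20'."""
    if not code:
        return "Unknown, Unclassified"
    letter = code.strip()[:1].upper()
    return GROUP_BY_LETTER.get(letter, "Unknown, Unclassified")
```

For example, `ntee_group("B20")` gives `"Education"`, and a missing code falls into the unknown group rather than being dropped.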

A few minor adjustments and we have an interactive polygon/point map with the ability to switch between different administrative boundary options!

 

Summary

With Tableau 10.2 it is easy to create polygon/point maps, including multiple custom geographies. If needed, data preprocessing is a cinch with a tool like FME. I am just scratching the surface of what FME can do, but I have really come to enjoy working with it since I discovered the tool a couple of months ago. As I explore mapping in Tableau and discover FME, more blog posts are bound to come soon.

Additional Resources and References

Points and Polygons (Alan Eldridge, June 23, 2014)

Tackle your geospatial analysis with ease in Tableau 10.2 (Kent Marten, February 14, 2017)

Points and Polygons in Tableau 10.2 (Alan Eldridge, January 18, 2017)

Using GEOMETRY Fields in Calculations (Alan Eldridge, February 21, 2017)

 
