JMP can build rich visualizations from geographic data. In most cases, however, it needs latitude and longitude coordinates for each item in order to map it. Many data sets lack this information, which limits the visualizations you can construct unless you look up coordinates for each item individually. For example, JMP cannot map the following data, even though each row clearly identifies a location.
Gross Metropolitan Product (millions of dollars)
New York, NY
Los Angeles, CA
This data is pulled from a large set of gross metropolitan products for major US metro areas from the US Bureau of Economic Analysis and is used throughout this post.
To use the mapping capabilities of JMP on this data, you must first convert the location names or addresses into latitude and longitude coordinates. This process is called geocoding, and it can be tedious to do manually.
This summer, a JMP intern wrote an add-in that automates the geocoding process by parsing the location text to find relevant information -- like country and city name -- and attempting to match that information against either built-in tables or the online services of OpenStreetMap.org.
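The parse-then-match idea can be sketched in a few lines of Python. This is only an illustration, not the add-in's actual code: the names `LOCAL_TABLE`, `parse_location`, and `geocode` are hypothetical, the table holds just two entries with approximate coordinates, and the online fallback is left out.

```python
# Tiny stand-in for a built-in lookup table (hypothetical; the real add-in's
# tables are not shown in this post). Coordinates are approximate.
LOCAL_TABLE = {
    ("new york", "NY"): (40.71, -74.01),
    ("los angeles", "CA"): (34.05, -118.24),
}

def parse_location(text):
    """Split 'City, ST' into a normalized (city, state) pair; state may be None."""
    parts = [p.strip() for p in text.split(",")]
    city = " ".join(parts[0].split()).lower()  # collapse repeated spaces
    state = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return city, state

def geocode(text, table=LOCAL_TABLE):
    """Return (lat, lon) from the local table, or None when nothing matches."""
    return table.get(parse_location(text))

for name in ["New York, NY", "Los Angeles, CA", "Springfield"]:
    print(name, "->", geocode(name))
```

A name with no match returns `None`, which is the point where a fuller implementation might fall back to an online service.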
As an example of the geocoder's use, we took the full data set from above and used the JMP Geocoder add-in to find latitude and longitude coordinates for the cities mentioned. Now a bubble plot for Gross Metropolitan Product can be created using the derived latitude and longitude columns:
In some situations, the location text is ambiguous, and the geocoder can return the wrong location. However, after this add-in geocodes a data table, it will present you with a confirmation window showing a map of the points it found and options to fix them if they were incorrectly located. For example, let's make a new data table with the following information:
These 10 cities were pulled from a list of cities in the state of Georgia, but nothing in the table identifies the state. Because the geocoder sorts candidate matches in order of decreasing population, it will find the more populous city first when a name is ambiguous. Here, for example, Houston, Texas, would be found before Houston, Georgia, because no state was specified.
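To make the population ordering concrete, here is a minimal Python sketch of how an ambiguous name might be resolved. The function `best_match` and the `CANDIDATES` list are hypothetical, and the population figures are rough placeholders chosen only to produce the right ordering.

```python
# Candidate places that share a name. Populations are rough placeholders,
# not authoritative figures.
CANDIDATES = [
    {"name": "Houston", "state": "GA", "population": 150_000},
    {"name": "Houston", "state": "TX", "population": 2_300_000},
]

def best_match(name, state=None, candidates=CANDIDATES):
    """Return the most populous candidate matching name (and state, if given)."""
    matches = [
        c for c in candidates
        if c["name"].lower() == name.lower()
        and (state is None or c["state"] == state)
    ]
    # Decreasing population: the largest place wins when the name is ambiguous.
    matches.sort(key=lambda c: c["population"], reverse=True)
    return matches[0] if matches else None

print(best_match("Houston")["state"])        # no state given, so the larger city wins
print(best_match("Houston", "GA")["state"])  # a state removes the ambiguity
```

In this sketch, supplying a state filters the candidates before they are ranked, which is why a table that includes state names is less likely to need manual fixes.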
After entering these names into a data table, let's run the geocoder with its settings set to search the United States, put the coordinates in the current table, and use the local geocoder. These settings are selected by default in the initial dialog, shown below:
Press the "Geocode" button, and a progress bar should briefly appear, followed by a confirmation window like the following:
Notice that, despite our search for cities in Georgia, the first results are scattered throughout the US. The add-in provides tools to fix this type of mistake. First click on a mistaken point, Houston in the example below, and click "Fix/Delete Point." The window will expand with a series of options:
These options allow you to change a point that the geocoder originally located incorrectly. In this example, let's use the "Try Again" button to choose the next matching location, which in this case is Houston, Georgia, the location we wanted. The same operation can be applied to the rest of the incorrect points on this map. Sometimes -- as with Winston, Oregon, on this map -- you have to try several times to get the result you want. This is because the geocoder weighs population and name type in its decision: it will find a town name before a county or region name, and a more populous place before a less populous one. If you go through and fix each of the points, you will get a map resembling the following, with all points within Georgia, just as we wanted.
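The ranking and the "Try Again" behavior described above can be approximated with a short Python sketch. Everything here is an assumption for illustration: the `TYPE_RANK` weights, the `Picker` class, the wrap-around after the last candidate, and the candidate list with its placeholder populations are not taken from the add-in itself.

```python
# Lower rank means the name type is preferred. The post only says towns come
# before counties or regions, so these exact ranks are an assumption.
TYPE_RANK = {"town": 0, "county": 1, "region": 1}

def rank_candidates(candidates):
    """Order candidates by name type first, then by decreasing population."""
    return sorted(candidates, key=lambda c: (TYPE_RANK[c["type"]], -c["population"]))

class Picker:
    """Steps through ranked candidates, mimicking repeated 'Try Again' clicks."""

    def __init__(self, candidates):
        if not candidates:
            raise ValueError("at least one candidate is required")
        self._ranked = rank_candidates(candidates)
        self._index = 0

    def current(self):
        return self._ranked[self._index]

    def try_again(self):
        # Move to the next-best candidate; wrapping around is an assumption.
        self._index = (self._index + 1) % len(self._ranked)
        return self.current()

# Hypothetical candidates for "Winston"; populations are placeholders.
WINSTON = [
    {"name": "Winston", "state": "AL", "type": "county", "population": 24_000},
    {"name": "Winston", "state": "GA", "type": "town", "population": 1_000},
    {"name": "Winston", "state": "OR", "type": "town", "population": 5_000},
]

picker = Picker(WINSTON)
print(picker.current()["state"])    # first guess: the most populous town
print(picker.try_again()["state"])  # next guess after one "Try Again"
```

Because the county is ranked after both towns despite its larger placeholder population, this sketch shows why a less populous town can still be offered before a county of the same name.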