bernd_heinen
A geocoding function

I used the new technique of REST interfaces to provide a new geocoding add-in. At its core is the call to the respective REST interface of MapQuest, a company providing lots of routing services and applications. One of MapQuest’s services is geocoding. For every address provided, MapQuest sends back the latitude and longitude of the location, as well as administrative information such as the country and state.

Since data tables are the usual way to collect information in JMP, analysis starts with the user assigning roles to the columns of the data table via a user dialog. The Geocoding 2020 add-in works the same way. After the user has completed the dialog, the program collects the relevant data from the data table, translates it into the structure needed by the REST API and then sends it to MapQuest. So, the rough structure of the program is: 1) complete the user dialog, 2) collect data, 3) prepare data for interface, 4) send interface request, 5) interpret results, 6) write JMP data table(s). The third step, data preparation, is defined as a function and is available as MapQuest_geocode.jsl in the JSL cookbook. You can embed it in your own geo application or use it to get ideas of how to structure data for REST calls.

Baseline functionality

You, or any user of your program, need a valid API key from MapQuest.

JMP data comes in the form of data table columns, while REST interfaces most likely use JSON structures. As a result, something has to translate between the two formats. JSL also has its own structures for data collections: lists and associative arrays. The purpose of this function is to encapsulate all non-native JSL commands. The necessary data is passed in as ordinary JSL variables in character, numeric or list format, and the data returned from the call comes back as JSL lists as well. So, embedding this function in a JSL script doesn’t require any knowledge beyond standard JSL.
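
To make the translation step concrete, here is a minimal sketch of going from JSL lists to JSON text and back, using the built-in As JSON Expr() and Parse JSON() functions. The key names ("locations", "country", "city") are assumptions chosen for illustration; they are not necessarily the field names MapQuest expects, and this is not the code inside the add-in.

```jsl
// Address elements as they might be read from data table columns.
countries = {"Deutschland", "Deutschland"};
cities = {"Wuppertal", "Solingen"};

// Build one associative array per address and collect them in a list.
// The key names are illustrative assumptions.
locations = {};
For( i = 1, i <= N Items( countries ), i++,
	loc = Associative Array();
	loc["country"] = countries[i];
	loc["city"] = cities[i];
	// Wrap in a list so the associative array is appended as one item.
	Insert Into( locations, Eval List( {loc} ) );
);

payload = Associative Array();
payload["locations"] = locations;

// JSL structure -> JSON text, the format a REST interface typically expects.
json_text = As JSON Expr( payload );
Show( json_text );

// JSON text -> JSL structure (associative arrays and lists).
parsed = Parse JSON( json_text );
Show( parsed["locations"][2]["city"] );
```

Associative arrays map naturally to JSON objects and lists to JSON arrays, which is why these two structures are enough for the round trip.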

The interface

JSL functions are called with a statement like:

result = function();

If the function needs or accepts parameters, they are listed inside the parentheses. This function, GC_geocode, can be called with empty parentheses or with a full list of parameters. Calling it with empty parentheses opens a parameter definition window with an outline for input and output, as shown in these two graphs:

bernd_heinen_0-1596908357756.png

bernd_heinen_1-1596908405312.png

 

 

Parameters in a function call are always positional parameters. When the function is called with parameters, ALL mandatory parameters need to be supplied, but they may be empty.
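
As a small illustration of what "positional, but possibly empty" means, consider the following sketch. The function describe_address is hypothetical and exists only for this example; it is not part of the add-in.

```jsl
// Hypothetical function with three positional parameters.
describe_address = Function( {format, addresses, countries},
	{Default Local},
	If( format >= 1,
		"single strings: " || Char( N Items( addresses ) ),
		"separate elements: " || Char( N Items( countries ) )
	)
);

// Every position is filled. A parameter that is not needed for the
// chosen format still gets an empty list as a placeholder.
Show( describe_address( 1, {"Alte Freiheit 24a, 42103 Wuppertal, Deutschland"}, {} ) );
Show( describe_address( 0, {}, {"Deutschland", "Deutschland"} ) );
```

Because the meaning of an argument depends only on its position, leaving out the empty list would shift the later arguments into the wrong slots.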

In my blog post about the Geocoding 2020 add-in, I’ve used a few of the local tourist sites around my hometown as examples of how to use the add-in. Let’s take two of those to see how this interface works:

bernd_heinen_2-1596908505638.png

 

Alte Freiheit 24a, 42103 Wuppertal, Deutschland

Wuppertaler Schwebebahn, a 13-km-long suspension railway, the oldest electric elevated railway with hanging cars in the world, inaugurated 1901.

bernd_heinen_3-1596908542634.png

 

Schloßplatz 2, 42659 Solingen, Deutschland

Schloss Burg, a reconstructed castle, originating from the 12th century.

 

First, you need an access key from MapQuest; let’s assume it is 1234ABCD. One way of using this interface is to pass the address strings as they are. In this case, the format indicator (position 2) needs to be 1 or larger. The function call then looks like this:

coordinates = GC_geocode ("1234ABCD", 1, {"Alte Freiheit 24a, 42103 Wuppertal, Deutschland",
"Schloßplatz 2, 42659 Solingen, Deutschland"}, {}, {}, {}, {}, {}, {});

If the addresses are in separate elements rather than in one string, the format indicator needs to be 0 (zero), meaning your call would look like:

coordinates = GC_geocode ("1234ABCD", 0, {}, {"Deutschland", "Deutschland"}, {}, {"Wuppertal", "Solingen"}, {42103, 42659},
{"Alte Freiheit 24a", "Schloßplatz 2"}, {});

Of course, you don’t need to put the values in as parameters; you can also supply a variable, e.g., a list variable with the list of country names.
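
A sketch of that variant, with the lists read from data table columns, might look like the following. It assumes that GC_geocode is already defined in the session (for example, because MapQuest_geocode.jsl has been run beforehand) and that you replace the placeholder key with a valid one. The table and column names are made up for this example.

```jsl
api_key = "1234ABCD";  // placeholder key from the text, not a working key

// Example table; table and column names are assumptions.
dt = New Table( "Sites",
	New Column( "Country", Character, Set Values( {"Deutschland", "Deutschland"} ) ),
	New Column( "City", Character, Set Values( {"Wuppertal", "Solingen"} ) ),
	New Column( "Postal Code", Numeric, Set Values( [42103, 42659] ) ),
	New Column( "Street", Character, Set Values( {"Alte Freiheit 24a", "Schloßplatz 2"} ) )
);

// Character columns return a list; numeric columns return a matrix,
// which As List() converts to a list.
countries = Column( dt, "Country" ) << Get Values;
cities = Column( dt, "City" ) << Get Values;
zips = As List( Column( dt, "Postal Code" ) << Get Values );
streets = Column( dt, "Street" ) << Get Values;

// Same positions as in the literal call above; positions 5 and 9 stay empty.
coordinates = GC_geocode( api_key, 0, {}, countries, {}, cities, zips, streets, {} );
```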

In either case, the result is the same list of data:

{200, "HTTP/1.1 200 OK", "", {0, 0}, {"", ""}, {51.25603, 51.13768}, {7.14865, 7.15245}, {"exact", "exact"}, {"DE", "DE"},
{"Nordrhein-Westfalen", "Nordrhein-Westfalen"}, {"", ""}, {"42103", "42659"}, {"Wuppertal", "Solingen"},
{"Alte Freiheit 24A", "Schloßplatz 2"}, 0}

Since both addresses could be found and tagged, the response code for each address is 0 (element 4), which is why element 5, where error messages would otherwise appear, contains only empty strings. The rest can easily be understood, given the interface declaration above.
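
As a rough sketch of how the returned list could be taken apart and written to a data table, the following uses the example result from above as a literal. Reading elements 6 and 7 as latitude and longitude is my interpretation of the values (about 51 and about 7 for this region); the output table and column names are assumptions.

```jsl
// Result list copied from the example above.
coordinates = {200, "HTTP/1.1 200 OK", "", {0, 0}, {"", ""}, {51.25603, 51.13768},
	{7.14865, 7.15245}, {"exact", "exact"}, {"DE", "DE"},
	{"Nordrhein-Westfalen", "Nordrhein-Westfalen"}, {"", ""}, {"42103", "42659"},
	{"Wuppertal", "Solingen"}, {"Alte Freiheit 24A", "Schloßplatz 2"}, 0};

http_status = coordinates[1];
codes = coordinates[4];      // per-address response codes
messages = coordinates[5];   // per-address error messages
lat = coordinates[6];        // assumed to be latitude
lon = coordinates[7];        // assumed to be longitude
cities = coordinates[13];

If( http_status == 200,
	// Numeric lists are converted to matrices for the numeric columns.
	dt_out = New Table( "Geocoded",
		New Column( "City", Character, Set Values( cities ) ),
		New Column( "Latitude", Numeric, Set Values( Matrix( lat ) ) ),
		New Column( "Longitude", Numeric, Set Values( Matrix( lon ) ) ),
		New Column( "Response Code", Numeric, Set Values( Matrix( codes ) ) ),
		New Column( "Message", Character, Set Values( messages ) )
	),
	Print( "Request failed: " || Char( coordinates[2] ) )
);
```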

Technical details

The basic functionality is simple. We send an address to MapQuest and get latitude and longitude back. For this simple task, MapQuest offers different input formats and process modes. The address information can be supplied as one text string with comma separated parts, like “Deutschland, Schloßplatz 2, 42659 Solingen”, which is the address of a famous castle near my hometown. Alternatively, the address can be given as single elements: country = “Deutschland”, Street = “Schloßplatz 2”, city = “Solingen”, postal code = “42659”.

I assume that, in most cases, more than one address will be looked up, so the function expects address information as lists: either one list of strings with complete addresses or several lists with address elements. MapQuest has two different URLs for the two address formats. The single-string interface operates in batch mode, i.e., one call can resolve multiple addresses. The interface for separate address elements accepts only one address at a time. For performance reasons, when many addresses need to be located (currently more than 100), the function combines the separate elements into one string per address and uses batch mode.
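
The combining step can be sketched as follows. This is a simplified illustration, not the code inside the add-in: the helper combine_addresses is hypothetical, it assumes that all four lists have the same length, that postal codes are numeric and that streets are strings (possibly empty).

```jsl
// Hypothetical helper: build one comma-separated string per address.
combine_addresses = Function( {streets, zips, cities, countries},
	{Default Local},
	combined = {};
	For( i = 1, i <= N Items( countries ), i++,
		parts = {};
		// Skip the street when it is empty.
		If( streets[i] != "", Insert Into( parts, streets[i] ) );
		// Char() turns the numeric postal code into text.
		Insert Into( parts, Char( zips[i] ) || " " || cities[i] );
		Insert Into( parts, countries[i] );
		Insert Into( combined, Concat Items( parts, ", " ) );
	);
	combined;
);

streets = {"Alte Freiheit 24a", "Schloßplatz 2"};
zips = {42103, 42659};
cities = {"Wuppertal", "Solingen"};
countries = {"Deutschland", "Deutschland"};

addresses = combine_addresses( streets, zips, cities, countries );

// The text gives "more than 100" as the current switch to batch mode.
batch_threshold = 100;
use_batch = N Items( countries ) > batch_threshold;
Show( addresses, use_batch );
```

For the two example sites, this produces the same address strings that were passed directly in the first function call above, and use_batch is 0 because there are only two addresses.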

Each address, regardless of the format supplied, needs to include its country. Note that the function does not check whether this required minimum information is given. All other elements are optional. The more precisely you describe the address, the better the quality of the result. If the whole address is provided as a single string, the order of the address elements within that string doesn’t matter.
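
If you want to guard against missing countries in your own script before calling the function, a pre-check along these lines is one option. It is only a sketch for the separate-elements format; the variable names are made up, and the second entry is left empty on purpose.

```jsl
// Example input; the second address has no country.
countries = {"Deutschland", "", "Deutschland"};

// Collect the positions of all addresses without a country.
missing_country = {};
For( i = 1, i <= N Items( countries ), i++,
	If( Trim( countries[i] ) == "", Insert Into( missing_country, i ) )
);

Show( missing_country );
If( N Items( missing_country ) > 0,
	Print( "Addresses without a country: " || Char( missing_country ) )
);
```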

If you are reading the locations from a data table and that data table has excluded rows, you must provide the list of excluded rows (excludedrows = As List( dt << Get Excluded Rows )) as parameter 12. Otherwise, the lists returned may not be in sync with your data table. In the message list, these observations are marked as “excluded.” If the message list contains more “excluded” observations than there are excluded rows in the data table, the additional observations failed to provide the minimum location information.
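
A minimal sketch of collecting that list, using a small made-up table with one excluded row, could look like this. The call to GC_geocode itself is left out, because parameters 10 and 11 are not described in this post.

```jsl
// Made-up example table with three rows.
dt = New Table( "Sites",
	New Column( "City", Character, Set Values( {"Wuppertal", "Solingen", "Remscheid"} ) )
);

// Exclude row 2 for the demonstration.
dt << Select Rows( {2} );
dt << Exclude;
dt << Clear Select;

// Get Excluded Rows returns a matrix of row numbers; As List() converts it.
excludedrows = As List( dt << Get Excluded Rows );
Show( excludedrows );
```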

MapQuest is an American company, and I found consistent results for all the U.S. addresses I tested. With addresses in other regions of the world, that was not always the case. Location information such as state, district or city is often returned in the local language but sometimes the results are in English.

I hope this piece of code can help you with your project or just provide some insight into the application of JSL. I’m open to any questions or suggestions, but since it is part of the Geocoding 2020 add-in, I will only make changes that are consistent with its intended use in that add-in.

Last Modified: Sep 30, 2020 10:54 AM