Merging city and state info into a table with 9 digit zip codes
Oct 5, 2015 10:49 AM
In my last post, I detailed how to obtain 9 digit zip codes and their X and Y coordinates from a table available from the SAS support site. I passed this information along to the customer in question and he was able to merge in the 9 digit zip information into his table. His next question regarded obtaining the city and state information corresponding to these 9 digit zips.
I did a little digging online and found a graphic that showed the 5 digit zip prefixes for various states followed a predictable pattern.
Next, I found a table on Wikipedia that contained state names, cities and zip code prefixes and figured out how to join it to the 9 digit zip table I created in my last post.
In JMP, you can do the following to add City and State info to any table containing 5 digit zips:
Click File > Save As Data Table. In the dialog that pops up, leave everything as selected and click OK. A bunch of tables will open in JMP.
Click Tables > Concatenate, and remove anything currently showing in the selection list. Now choose all the open tables that contain zip code info. Their names start with "List of Zip Code prefixes". Check "Create source column" in the Concatenate dialog and click OK. You will now have a big table with multiple columns of data. We need to stack them to get them all in one column.
Click Tables > Stack, select Column 1 through Column 10 and choose them as the Stack Columns. Click OK. Now we have a table with columns Source Code, Label, and Data. Data contains the 3 digit prefix of the zips, the state and location. But it still needs some cleanup.
Select the Data column header, then click Cols > Utilities > Text to Columns. For the delimiter, enter a space. (Note: the Utilities menu and Text to Columns are only available in JMP 12. If you are working in an earlier version, you have a few options to try to work around this: you can download Jeff Perkinson's TexttoCols addin from the JMP file exchange, drag it into JMP to install, and it will add an item to your add-ins menu, or you can save your JMP table in Excel format, then use the Text to Columns feature in Excel also splitting on spaces, then re-import the Excel file into JMP.)
Double click on the Data 1 column header to rename it to Zip prefix
Double click on the Data 2 column header to rename it to State. With the State column selected, click control-F (or Mac equivalent) and check the Restrict to Selected Column box. Find *, replace with nothing. Find Not, replace with nothing.
Select columns Data 3 through Data 8 with shift click. Click Cols > Utilities > Combine Columns. (Again, only found in JMP 12, though you can use a formula column for concatenation in an earlier version). In the Combine Columns dialog, specify City as column Name and replace the "," in the Delimiter box with a space. Click OK.
You now have a column of the city names put back together with spaces. We want to get the "Not in use" string back in the rows where it should be, so do find and replace again, with that column selected, replacing "in use" with "Not in use" as it was originally before we removed the Not in the state column. You can check the table-basically all the rows with no state listed should have Not in use instead of a city name under the city column.
Now in your zip code data table, you will need to parse the zip code variable to get the first 3 digits to match with our newly imported table.
Open your zip code table, then double click on the first empty column after the existing columns. This will create a new column called Column 7. Double click to bring up the Column Properties dialog. Change data type to Character and rename the column to Three digit prefix. From the Column Properties pulldown in that dialog, choose Formula.
In the Formula Editor window that opens, choose ZIP under the columns list under Table Columns, then Character > Substr under the column chooser under Functions. In the Start box, type 1, and in the Length box, type 3. Click OK.
If you now do Tables > Join and match by the two 3 digit prefix variables, you should have the street, city, state, and zip variables all together. I used these settings to join. Note the matching columns, and Include non-matches checkboxes for both tables.
I used the steps above to add city and state info into my ZIP4 data table containing the 9 digit zip codes I imported from the SAS support site and graphed the X and Y coordinates: