If you were to ask me where my obsession with road trips began, I would have to say it probably started shortly after my wife and I drove I-40 from Blacksburg, VA, to Albuquerque, NM, to start my first job out of graduate school. Perhaps there was something in seeing the changing landscape and visiting new places I found exciting. Though, to be honest, it probably is more likely due to a compulsion to “finish the route” and drive the rest of I-40 on both ends. I’ll compromise and say it was a bit of both.
Initially, my road trip fetish focused primarily on traveling all the major US Interstates (those that ended in a 0 or a 5), seeing as how I had nearly finished one already. However, as I perused various travel books, I found that the road trip community tended to dismiss the interstates as avenues for commerce and all-around "busyness," instead preferring the quieter, slightly slower pace of America’s other major roadways: the US Highway system. Mapping out several of these highways triggered the excitement of new places and landscapes. My new dream was to travel on every major US Highway, that is, all those with two-digit designations that were still active, the only exception being Hwy 101 in California.
Since starting my position recently as a Research Statistician Tester at JMP, I’ve learned more about the software than I ever thought possible. I was surprised at how quickly I learned JMP Scripting Language (though I’m certainly still learning JSL), which opened even more possibilities for me. But, once I discovered that the drag-and-drop Graph Builder had the ability to plot maps?! Well, you can probably guess where this is going (if the title wasn’t a dead giveaway).
After overcoming my initial excitement, there was a moment of skepticism. Could you actually plot out a road trip with Graph Builder? I knew you could plot regions, but what about lines and roads? Well, I’m happy to say that, after spending perhaps more time than I care to admit, the answer is YES! It ain’t no Google Maps or AAA Trip Tik, but I think it’s definitely worth a look.
So the first question to answer was what kind of road trip was this going to be? Well, my dream was to travel every US Highway, so why not make a trip that did just that! Specifically, the goal would be as follows:
Find a path through the US Highway system that connects every active primary US Highway.
While my inner completionist would balk at the possibility of not traveling every single mile of pavement that made up a particular highway, some compromise had to be made, and this would still serve to be a great challenge in any case. But first, some ground rules:
Seems straightforward enough, right? The first step was to compile a database of highway intersections, showing which highways connected to which other highways and the locations of these intersections. I spent a great deal of time trying to find such a database ready-made, but no such luck. So I decided to make my own in a data table.
It’s worth mentioning that often times highways will intersect one another multiple times or even run concurrent with one another. I only recorded the first intersection that occurred when following a highway from north or east to south or west, imbuing my network with an inherent directional bias. In fact, that’s exactly what I was building: a network of US Highway “nodes” based on their intersection “edges.” Once this network was complete, the goal would then be to find a path through the network that connected each node with no node being repeated in the path (if possible).
For those familiar with graph theory, this might sound like I’m working with the famous “traveling salesman problem.” And you would almost be right. The traveling salesman is actually a special case of what is known as a Hamiltonian path, which is a path that connects every single node with no repetitions (not to be confused with an Eulerian path, which is a path that uses every single edge in the network without repetition and is associated with the famous Koenigsburg bridge problem). What makes the traveling salesman a special case is the fact that the edges have weights (say, time or distance), and so there is the added condition that the path must minimize the total weight. Fun fact: The Hamiltonian path gets its name from William Rowan Hamilton, who was working on finding a path connecting all the vertices of a dodecahedron. At one point, he pretended that the vertices were famous cities and he was searching for a “world tour.” I have a feeling we would have gotten along quite well.
This Hamiltonian path is exactly what I was looking for! Since it was a well-known object, there should be an efficient algorithm to find it, right? Well, not so fast. You see, the problem of finding a Hamiltonian path through an arbitrary graph is what is called an NP-hard problem. This is just a fancy term to say that it’s very easy to verify if a path is Hamiltonian, but quite difficult to find such a path for all but a few exceptions. In other words, there is no general fast algorithm to find a Hamiltonian path for every situation. So, once again, I set out on my own to see if I could develop an algorithm for my case.
I first started by converting my network to an associative array in JSL and then looked into a simple algorithm for finding a path in a small network written by a colleague. This algorithm utilizes a Depth-First Search, which involves starting at a node in the network and following the first connection you come across for every node thereafter until you run out of connections, at which point you go back to another non-connecting node and keep going. While this algorithm didn’t guarantee a Hamiltonian path, I found I could still use it to try and get close. The algorithm I derived adapting this approach is as follows:
Near-Hamiltonian Path Finder
The idea behind this algorithm is that adding nodes with the longest potential path should, in theory, get you much closer to an overall path throughout the network by reducing the risk of having a path terminate early.
Now that I can get an initial path, the next step was to determine how to “graft in” any missing nodes that got left out of the initial run. I came up with three possible methods:
In searching for my Hamiltonian path, I actually ended up using each of these approaches at least once. In the next post, you can see the fruits of my labor.