Using JMP to Count Cars

This blog post was written by a blogger who is no longer at SAS

I'm a rising senior at the North Carolina School of Science and Mathematics. During this summer, I worked as a technical summer student at SAS.

While JMP 8 doesn’t have image manipulation support, it does allow the user to create a custom DLL that can be invoked from JSL. So I created a DLL that allowed some functions from ImageMagick, an open source image library, to be called from JSL.

Using this DLL, I tried to count cars in photos from some of the North Carolina Department of Transportation's Webcams. Here's an image from the one near Exit 289 off Interstate 40, in the Raleigh area:

Unmodified image from DOT webcam

Unfortunately, the more distant parts of the road are hard to see because everything blurs together, so I cropped that part out and didn't analyze it.

To make the later calculations easier, I also converted the images to grayscale, so I only had to work with a single intensity for each pixel, rather than three color channels.

The image then looked like this:

Image from DOT webcam, cropped and converted to grayscale
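As a rough illustration of the grayscale step (this is not the original DLL or JSL code), the Python sketch below collapses three color channels into one intensity per pixel. The luminance weights are the common Rec. 601 values and are an assumption here; the post does not say which conversion ImageMagick applied. The function name `to_grayscale` is made up for this example.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of single 0-255 intensities.

    The weights are the widely used Rec. 601 luma coefficients (an assumption;
    the original conversion may have used a different formula).
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]


# A tiny 2x2 "image": white, black, pure red, pure blue.
image = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
gray = to_grayscale(image)
```

After this step every later calculation only has to deal with one number per pixel.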

In order to count the cars, I needed to know what parts of the pictures were cars. Since the Webcams update every 3 minutes, I had a lot of other pictures to compare each one against. So for each picture I analyzed, I took the 10 pictures before it and the 10 pictures after it and averaged the 20 of them together. The result was pretty close to a picture of the road with no cars, as you can see:

Average of surrounding pictures
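The averaging described above can be sketched in a few lines of Python. This is only an illustration, assuming each picture is a list of rows of grayscale intensities and that all pictures have the same size; `average_background` is a hypothetical name, and using fewer neighbors near the start or end of the sequence is a choice made for this sketch, not something the post specifies.

```python
def average_background(frames, index, window=10):
    """Average the `window` frames before and after frames[index], pixel by pixel.

    Near the ends of the sequence fewer neighbors exist, so fewer are averaged.
    """
    before = frames[max(0, index - window):index]
    after = frames[index + 1:index + 1 + window]
    neighbors = before + after
    if not neighbors:
        raise ValueError("need at least one neighboring frame")
    height, width = len(neighbors[0]), len(neighbors[0][0])
    return [
        [sum(f[y][x] for f in neighbors) / len(neighbors) for x in range(width)]
        for y in range(height)
    ]


# Five tiny 1x2 "frames"; a bright "car" appears only in the middle one.
frames = [[[100, 100]], [[100, 100]], [[100, 220]], [[100, 100]], [[100, 100]]]
background = average_background(frames, index=2, window=2)
```

Because a given car is only in one or two of the neighboring pictures, it mostly washes out of the average, which is why the result looks like an empty road.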

This picture told me what parts of the image I was looking at were not cars. By subtracting it from the image I was analyzing, I got a nice picture of what parts were cars:

Difference between current picture and average of surrounding pictures
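A minimal sketch of the subtraction step, again in Python rather than the original JSL. It takes the absolute difference, which is an assumption: the post only says "subtracting", but an absolute difference lets cars that are darker than the road show up as well as lighter ones. `subtract_background` is a hypothetical name.

```python
def subtract_background(frame, background):
    """Per-pixel absolute difference between a frame and the averaged background."""
    return [
        [abs(p - b) for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


frame = [[100, 220], [90, 100]]
background = [[100.0, 100.0], [100.0, 100.0]]
diff = subtract_background(frame, background)
```

Pixels that match the empty road come out near 0 (dark), and pixels covered by a car come out noticeably brighter.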

At this point, it is pretty easy, even from a script, to tell which parts of the picture are cars. The gray regions are cars; the dark regions are not. There were some regions that were dark gray, where it wasn't clear whether they should be considered cars or not. I found that what seemed to work best for them was to consider the areas where the color intensity was less than 60 to be background, and the areas where the intensity was 60 or more to be cars. With this sort of filter applied, the picture looks like this:

Difference picture with filter applied
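The filter can be sketched as a simple threshold. The cutoff of 60 comes from the text above; the function name `car_mask` and the choice of 255 for "car" pixels are illustrative assumptions.

```python
THRESHOLD = 60  # from the post: below 60 is background, 60 or more is car


def car_mask(diff, threshold=THRESHOLD):
    """Turn a difference image into a black-and-white mask (255 = car, 0 = background)."""
    return [[255 if p >= threshold else 0 for p in row] for row in diff]


diff = [[0, 59, 60], [120, 10, 61]]
mask = car_mask(diff)

# The (x, y) coordinates of every pixel classified as part of a car.
car_pixels = [(x, y) for y, row in enumerate(mask) for x, p in enumerate(row) if p]
```

The list of coordinates is the kind of input a clustering step would then work on.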

I then had a collection of (x, y) coordinates of pixels that composed several cars. Since the pixels of a single car should be closer to each other than to those of other cars, I tried using JMP's cluster analysis tool to divide these points into clusters. Ideally, each cluster would represent one car. Here's that same picture with each cluster given its own color:

Picture with clusters given unique colors

As you can see, it seems to have done a pretty good job on this picture. But since JMP's cluster analysis tool needs a number of clusters before it can do the analysis, it isn't the best tool for automated counting. That's why so many individual pixels seem to be their own clusters; JMP is splitting them up into the default number of clusters, 20, when there are only 8 cars in the picture.

Because JMP's cluster analysis tool isn't the best tool for this job, I ended up using a different approach. Once you account for the fact that more distant cars look smaller, most cars are of similar sizes. Therefore, back at the black-and-white step, I could count white pixels (making sure to account for the distances of the cars from the camera) to get an approximation of the number of cars in the picture.

Around this time I also decided to switch Webcams. The problem with this one was that it moved around a lot, which prevented the averaging from working. Here are four consecutive pictures from it:

Picture from DOT webcam

Picture from DOT webcam

Picture from DOT webcam

Picture from DOT webcam

These four are a bit more extreme than most, but you get the idea. After some looking, I found a different Webcam that didn't move around. Here's a picture from it:

Picture from different DOT webcam

Since the size of a car varies with the y-coordinate in a predictable, linear way, it is possible to adjust for it using relatively simple calculations. Unfortunately, doing this requires identifying four points (two on each side of the road), and communicating those four points to a script is rather tricky.
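To make the idea of a distance-adjusted pixel count concrete, here is a hedged Python sketch. It assumes that the number of pixels a typical car covers changes linearly from the top row of the image to the bottom row; the two calibration values (`pixels_per_car_top`, `pixels_per_car_bottom`) are hypothetical and would have to be measured for a real camera. This is not the calculation from the post, which would also need the four road-edge points mentioned above.

```python
def estimate_car_count(mask, pixels_per_car_top, pixels_per_car_bottom):
    """Estimate the number of cars from a black-and-white mask.

    Each white pixel counts as 1 / (expected pixels per car at its row), where
    the expected size is interpolated linearly between the top and bottom rows.
    """
    height = len(mask)
    total = 0.0
    for y, row in enumerate(mask):
        t = y / (height - 1) if height > 1 else 0.0
        expected = pixels_per_car_top + t * (pixels_per_car_bottom - pixels_per_car_top)
        white = sum(1 for p in row if p)
        total += white / expected
    return total


# A distant car covering 2 pixels (top row) and a near car covering 4 (bottom row).
mask = [[255, 255, 0, 0], [0, 0, 0, 0], [255, 255, 255, 255]]
estimate = estimate_car_count(mask, pixels_per_car_top=2, pixels_per_car_bottom=4)
```

If "size" is read as width or length rather than pixel count, the expected area would grow with the square of that linear term instead.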

There is an easier way. JMP has a modeling platform. If you count cars in some of the pictures by hand and give that data to JMP, it can create a model that will predict the number of cars in the other pictures, given the location and distribution of pixels in them.

I divided each image into 5-pixel-tall horizontal stripes, as is approximately shown in the (cropped) copy below of the image above.

Image split into 5-pixel-tall stripes

After subtracting the average picture from each image, I counted the number of pixels with intensity higher than 60 in each stripe of each picture. I could have fit a model right then; but since I was trying to explain the number of cars based on the number of pixels in each stripe, there were roughly 30 explanatory variables -- one for each stripe. I would have needed a lot of images to create the model, which wouldn't have left many to use it on.
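The per-stripe counts can be sketched as follows. The stripe height of 5 and the cutoff of 60 are from the text; the function name `stripe_counts` and the list-of-rows image layout are assumptions made for this example.

```python
def stripe_counts(diff, stripe_height=5, threshold=60):
    """Count pixels brighter than `threshold` in each horizontal stripe of a difference image."""
    counts = []
    for top in range(0, len(diff), stripe_height):
        stripe = diff[top:top + stripe_height]
        counts.append(sum(1 for row in stripe for p in row if p > threshold))
    return counts


# A 10-row, 3-column difference image, giving two 5-pixel-tall stripes.
diff = [[61, 60, 0]] + [[0, 0, 0]] * 4 + [[200, 200, 200]] + [[0, 0, 0]] * 4
counts = stripe_counts(diff)
```

Each picture then becomes one row of data with one column per stripe, which is where the roughly 30 explanatory variables come from.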

To fix this, I extracted principal components. Principal components are linear combinations of the original 30 explanatory variables computed in such a way that the first principal component explains the most variation in the data and the 30th the least variation. It turns out that the top 11 principal components explained more than 95% of the variation. So by modeling based on the principal components instead of the raw data, I reduced the number of explanatory variables to roughly a third of what it was, while still retaining 95% of the information.
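The "how many components do I keep" decision can be illustrated with a small Python sketch. It assumes the variance explained by each principal component (its eigenvalue) is already available, for example from JMP's principal components report; the numbers in the example are invented for illustration and are not the values from this analysis.

```python
def components_needed(eigenvalues, target=0.95):
    """Return how many of the largest components are needed to reach `target`
    as a cumulative share of the total variance."""
    total = sum(eigenvalues)
    running = 0.0
    ordered = sorted(eigenvalues, reverse=True)
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)


# Made-up eigenvalues: cumulative shares are 60%, 90%, 97%, 99%, 100%.
k = components_needed([6.0, 3.0, 0.7, 0.2, 0.1])
```

In this toy case three of five components are enough; in the analysis above the same rule picked 11 of roughly 30.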

I hand-counted cars in a randomly chosen selection of 120 pictures (I had more than 600, so this was only a small portion of them) and fed those numbers, along with the pixel counts, to JMP's Fit Model platform. JMP came up with a model that I used to create a table of calculated number of cars by time. I then graphed that data for Thursday afternoon and Friday. With a spline fit to it, the graph looks like this:

Graph of number of cars over time with fit spline

This graph does indeed show the traffic patterns I expected. There are significant rises in traffic around morning and evening rush hours as well as lunchtime.
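For readers curious what the model-fitting step might look like outside JMP, here is a minimal ordinary least squares sketch in Python. It is an assumption that a plain linear model is a fair stand-in; the post does not say which options were used in Fit Model. The names `fit_least_squares` and `predict` are hypothetical, and the tiny data set is invented.

```python
def fit_least_squares(rows, targets):
    """Fit targets ~ b0 + b1*x1 + ... + bn*xn by solving the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    n = len(design[0])
    # Augmented matrix [X^T X | X^T y].
    aug = [
        [sum(d[i] * d[j] for d in design) for j in range(n)]
        + [sum(d[i] * t for d, t in zip(design, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def predict(coefs, row):
    """Predicted car count for one picture's explanatory values."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))


# Invented training data: two explanatory values per hand-counted picture.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
hand_counts = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefs = fit_least_squares(rows, hand_counts)
```

Once the coefficients are fit on the hand-counted pictures, `predict` can be applied to the principal component scores of every other picture to fill in the table of cars by time.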


Liang Liem wrote:

Interesting. If it is night time and the streetlights are on, would it still work?

>Saying what practical use the technology

Maybe it can be used to dim streetlights. When traffic intensity is low, the light level can be low, and when traffic intensity increases, the light level can be increased as well.

On some streets there may be no cars at all except for short periods when something happens, like a car ferry arriving or a hockey game ending. If we could detect that the number of cars exceeds a threshold, that information could be sent to a server, which would then send commands to the different dimming controllers along the street.




aki wrote:

Hey, Greg!

Pretty interesting! A nice report.

In looking at the plot at the end, I was thinking that if you count too often during rush hour, the same cars would still be in the picture a few minutes later in some distance. So you'd have to be careful about how you "add up" the data (which you're not doing in this report).... Recognizing that the same car in the previous picture is still up there farther away is a pretty hard problem, I would guess... They can change lanes, too, so you can't just recognize a cluster of them, re-scaled...

Well, but maybe those issues just work themselves out when you collect a large amount of data automatically, so maybe there's no real issue in collecting real data....

Saying what practical use the technology can be deployed for would help make the report even more interesting. (Like automatic traffic flow reporting, or capacity analysis, or whatever...)

Anyway, I read your report with interest. Thank you!



Douglas M Okamoto (Data to Information to Knowledge) wrote:

I wish to congratulate the enterprising SAS technical summer student on his development of a data association tracking system to identify vehicular traffic in and around Raleigh, NC using the JMP platform.

Three questions, if I may.

Q1. What about systematically sampling pictures taken every 15 minutes? Thirty hours of webcam video surveillance with pictures updated every three minutes gives 600 pictures altogether. A systematic sample of 120 pictures should be more representative than a random sample of the same size.

Q2. Was heightened perspective one of the 30 some odd explanatory variables from which 11 principal components were combined? In the cropped and segmented picture, vehicles in the foreground segments are more pixelated than those in the background ones.

Q3. Is the 16-car outlier around 1:00PM due to folks hurrying back to work after lunch at the Davis Family Bar-B-Q in Morrisville?


US Department of Transportation, Traffic Detector Handbook: Third Edition, Volume I, Publication No. FHWA-HRT-06-108 (October 2006)

http://www.tfhrc.gov/its/pubs/06108/06108.pdf