In my previous blog entry, I discussed the discrepancy between WordPress and Google Analytics view counts. Today, I'd like to look more closely at what the data may be telling us.
If I were modeling the number of WordPress views a blog entry gets, I would expect Google views to be a sufficient predictor on its own. The scatterplot from the previous post, however, still suggests something is amiss (for those interested, the r-squared value is 0.44).
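For readers who want to see where a number like that comes from: an r-squared of 0.44 means a straight-line fit on Google views explains about 44% of the variance in WordPress views. Here is a minimal sketch of the calculation in plain Python. The function name `r_squared` and the sample numbers are made up for illustration; they are not the blog's data and will not reproduce the 0.44.

```python
def r_squared(x, y):
    """R-squared of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Unexplained variation (residuals) versus total variation in y.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical view counts, for illustration only.
google_views = [120, 340, 90, 560, 210, 400]
wordpress_views = [900, 1500, 1100, 2100, 800, 2600]
print(round(r_squared(google_views, wordpress_views), 2))
```

A value near 1 would mean Google views alone pin down the WordPress count; a value well below that leaves room for other variables to matter.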
I do have additional information at my disposal, so I thought it would be interesting to see whether anything else drives the WordPress views. As a first try, I want to keep things simple, but here are the variables available to me:
Did anything seem useful?
My different attempts at modeling are best left for another day (or blog post), but here are the results from Fit Model with Stepwise, using main effects and two-factor interactions for the factors above:
Actual by Predicted Plot
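To make "main effects and two-factor interactions" concrete, here is a rough Python sketch of the general idea: build every candidate term, then greedily add whichever one reduces the residual sum of squares the most. This is not what JMP's Stepwise platform does internally. The stopping rule (`min_gain`), the function names, and the simulated data are all assumptions for illustration, and the factor names are simply the ones mentioned in this post.

```python
from itertools import combinations

import numpy as np


def candidate_terms(factors):
    """Main effects plus every two-factor interaction, as tuples of names."""
    return [(f,) for f in factors] + list(combinations(factors, 2))


def design_matrix(data, terms):
    """Intercept column plus one column per term (interaction = product)."""
    n = len(next(iter(data.values())))
    cols = [np.ones(n)]
    for term in terms:
        cols.append(np.prod([data[name] for name in term], axis=0))
    return np.column_stack(cols)


def sse(data, terms, y):
    """Residual sum of squares of the least-squares fit on the given terms."""
    X = design_matrix(data, terms)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sum((y - X @ coef) ** 2))


def forward_select(data, y, factors, min_gain=0.01):
    """Add the best term each round; stop when the relative SSE gain is small.

    The min_gain threshold is an assumed, simplified stopping rule.
    """
    remaining = candidate_terms(factors)
    chosen = []
    current = sse(data, chosen, y)
    while remaining:
        best = min(remaining, key=lambda t: sse(data, chosen + [t], y))
        new = sse(data, chosen + [best], y)
        if current - new <= min_gain * current:
            break
        chosen.append(best)
        remaining.remove(best)
        current = new
    return chosen


# Simulated data, for illustration only (not the blog's data).
rng = np.random.default_rng(1)
factors = ["google", "comments", "days", "linkedin", "tweets"]
data = {name: rng.normal(size=200) for name in factors}
y = 3 + 2 * data["google"] + 5 * data["comments"] * data["days"]
y = y + 0.01 * rng.normal(size=200)

print(len(candidate_terms(factors)))  # 5 main effects + 10 interactions
print(forward_select(data, y, factors))
```

With five factors there are 15 candidate terms, which is why a stepwise search is handy: it narrows the list without fitting every possible subset by hand. A simple threshold like this one can also let in a spurious term or two, so treat the selected list as a starting point rather than a final model.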
While it’s not surprising that Google views is in the model, what did surprise me is how much comments drive up the WordPress count. In addition, I might have thought Google views would already account for the days online, but the fact that it doesn’t (as well as the nonzero intercept) suggests that WordPress counts end up accumulating much more over time.
The LinkedIn counts also show as significant, but not quite at the same level as days and comments in some of the other modeling I tried. Based on the data, it’s not possible to tell how many people actually saw the link and were encouraged to click on it. Likewise, tweets don’t even show up in the model, and I don’t know how many followers the people who tweeted have. So LinkedIn and Twitter counts don't really help us understand view count popularity.
I also looked for a model that didn’t even use the Google views, and while it had some extra terms, it still looks pretty good:
I should mention that fitting a model for Google views is not very effective (even with the number of days, etc., which is a bit surprising). While there were more terms in fitting WordPress views without using Google, the biggest drivers were days online and comments.
This model doesn’t perform that well when applied to previous years. There are some posts that seem to get lots of traffic, whether due to keywords or some other mechanism. As for the large discrepancy between WordPress and Google, I think the true number of views (whatever that actually means) is somewhere in between. Some of the Google views are so low that it’s hard to imagine so few people have seen some of the entries. However, it’s also hard to gauge how many people have “viewed” entries via the index page for the JMP Blog (http://blogs.sas.com/content/jmp/) rather than clicking on an individual post to read it.
Note to self (and my fellow bloggers): To try to make the top 10 list for 2015, post early and get lots of comments… although I doubt leaving myself many comments will have the desired effect. An anonymous colleague jokingly offered another idea: purposefully add typos, hoping a kind soul leaves a comment to correct them.