Can Pedometric Data (Daily Walks) Predict Weight Loss?

jenkins_macedo

Community Trekker

Joined:

Jul 13, 2015

Just using JMP Pro 12.01 in my own life

I want to find out whether these predictor variables (X), # steps, # kCal, Days In Action, Walk Frequency, Distance (mile), Time In Action (minutes), Speed (mph), Week Days, and Hours of Sleep, can predict weight loss (kg) (the Y response).

I used SAS JMP Pro 12 and ran a simple Standard Least Squares regression. There are multiple models that can be built in the JMP Fit Model platform, including advanced Partition using decision trees, Bootstrap Forest, Boosted Tree and Neural Networks. Here I just want to show how you can quickly use the correlation matrices of your data to develop a model in SAS JMP Pro 12.

From the attached output of the JMP Pro Standard Least Squares regression model, you can see in the Summary of Fit that the predictor (X) variables predict weight loss (kg) (Y) with an R-Square of 0.83 and an RMSE of 0.88, which is pretty good. This means the model explains about 83% of the variation in weight loss when all of these factors are included.
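As a rough sketch of what those two Summary of Fit numbers measure, here is the arithmetic in Python rather than JMP. The four actual/predicted values are invented for illustration and are not from my data, and `n_params` (the number of estimated parameters, intercept included) is an assumed input.

```python
import math

def sum_sq_error(actual, predicted):
    # Sum of squared residuals (actual minus predicted).
    return sum((y - p) ** 2 for y, p in zip(actual, predicted))

def r_square(actual, predicted):
    # Share of the variation around the mean that the predictions account for.
    mean_y = sum(actual) / len(actual)
    ss_total = sum((y - mean_y) ** 2 for y in actual)
    return 1.0 - sum_sq_error(actual, predicted) / ss_total

def rmse(actual, predicted, n_params):
    # Root mean square error using the error degrees of freedom (n - n_params).
    return math.sqrt(sum_sq_error(actual, predicted) / (len(actual) - n_params))

# Made-up values, purely illustrative.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

print("R-Square:", round(r_square(actual, predicted), 2))
print("RMSE:", round(rmse(actual, predicted, n_params=2), 3))
```

R-Square is a share of variation explained, and RMSE is in the units of the response (kg here). Neither one is a probability of being right.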

In addition, the ANOVA shows that the variation explained by the model (R-Square 0.83) is statistically significant, with a P-value of <0.0001 and an F-value of 18.84.
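For reference, in a least squares fit with an intercept the F ratio and R-Square are tied together by the standard identity below. Here n is the number of observations and k the model degrees of freedom; neither is stated in this post, so the 18.84 cannot be re-derived from what is shown here.

```latex
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
```

For a fixed k and n, a larger R-Square always gives a larger F, so the two statistics are not independent pieces of evidence.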

Lastly, because a model needs to be validated and tested, the training set (the data on which the model was developed) is different from the sets of data the model was validated and tested against. The R-Square values are similar across all three (Training = 0.83, Validation = 0.80 and Test = 0.80), which further suggests that the model explains about 80% of the variation in the response on data it was not fitted to.
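The idea of fitting on one set of rows and scoring on others can be sketched as follows. This is Python, not what JMP does internally; the 60/20/20 split, the single predictor, and the simulated data are all assumptions made only for illustration.

```python
import random

def split_rows(n_rows, frac_train=0.6, frac_valid=0.2, seed=1):
    # Shuffle row indices once, then cut them into three disjoint groups.
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_train = int(n_rows * frac_train)
    n_valid = int(n_rows * frac_valid)
    return idx[:n_train], idx[n_train:n_train + n_valid], idx[n_train + n_valid:]

def fit_line(x, y):
    # Ordinary least squares for one predictor: returns (intercept, slope).
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_square(x, y, intercept, slope):
    # R-Square of an already fitted line, evaluated on the rows passed in.
    mean_y = sum(y) / len(y)
    ss_total = sum((b - mean_y) ** 2 for b in y)
    ss_error = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1.0 - ss_error / ss_total

# Simulated data: hypothetical minutes walked and a noisy linear response.
rng = random.Random(0)
minutes = [rng.uniform(20, 90) for _ in range(100)]
response = [0.02 * m + rng.gauss(0, 0.3) for m in minutes]

train, valid, test = split_rows(len(minutes))
intercept, slope = fit_line([minutes[i] for i in train], [response[i] for i in train])

for name, rows in (("Training", train), ("Validation", valid), ("Test", test)):
    xs = [minutes[i] for i in rows]
    ys = [response[i] for i in rows]
    print(name, round(r_square(xs, ys, intercept, slope), 2))
```

Only the training rows are used to estimate the line. The validation and test R-Square values show how well those estimates carry over to rows the fit never saw.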

This analysis is part of an ongoing personal research project that uses daily pedometric data to predict weight loss and BMI.

Data collection for this personal study is ongoing and will continue for the next 5 months.

Jenkins Macedo
6 REPLIES
David_Burnham

Super User

Joined:

Jul 13, 2011

Thanks for sharing your model.  I suspect that you have fallen into a common trap: JMP makes it easy for us to build models and generate various statistics.  The trap is that we focus on the statistics and disengage our scientific thinking.  You point out the high R-square statistic, but does it make sense?  How is the model achieving this?  Your title question was whether "pedometric data predicts weight loss".  From the science we know that exercise translates to calorie burn and net daily calories should correlate to weight loss, but your model seems to ignore the pedometry data and relies only on 'days in action' to explain the variation in your response data. I would interpret the model as saying "the longer that you exercise the more weight you lose". My guess is that you are measuring cumulative weight loss (i.e. your weight each day), but if you want to understand the effect of your per-day pedometry data then you need to be looking at daily weight change.  I hope you will treat these as positive comments.  I think you've collected a lot of interesting data, and there should be a lot more useful information (including some interesting correlations between your x-variables!).
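To illustrate the cumulative versus daily point, here is a toy simulation in Python rather than JMP. Every number in it is invented, none comes from your data, and the daily loss is deliberately generated without any dependence on steps.

```python
import random

def pearson(a, b):
    # Pearson correlation coefficient between two equal-length lists.
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def daily_change(cumulative):
    # Difference a cumulative series; the first day is measured from zero.
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

rng = random.Random(42)
n_days = 150
days = list(range(1, n_days + 1))

# Hypothetical step counts, and a daily loss (kg) that ignores them entirely.
steps = [rng.uniform(4000, 12000) for _ in days]
loss_per_day = [0.1 + rng.gauss(0, 0.05) for _ in days]

# Running total, i.e. what you get when the response is weight lost so far.
cumulative_loss = []
total = 0.0
for change in loss_per_day:
    total += change
    cumulative_loss.append(total)

print("cumulative loss vs. day number:",
      round(pearson(days, cumulative_loss), 2))
print("daily change vs. steps:",
      round(pearson(steps, daily_change(cumulative_loss)), 2))
```

A running total tracks the day counter almost perfectly even though steps play no role in this made-up data, which is why a high R-square driven by 'days in action' says little about the pedometry variables. Differencing the response first, as `daily_change` does, removes that built-in trend.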

Dave

jenkins_macedo

Community Trekker

Joined:

Jul 13, 2015

Doesn't it make sense that the longer you exercise, the more weight you will lose? Weight loss is computed as an average of daily AM Weight and PM Weight. Alternatively, your suggestion of a daily weight change also makes a valid point.

Jenkins Macedo
jenkins_macedo

Community Trekker

Joined:

Jul 13, 2015

After taking a closer look at the variables, I have decided to drop Days In Action, since another time-sensitive variable, "Time In Action", was already in the model and measures more specifically, in minutes, the amount of time spent walking/exercising. I think this now makes more sense than the previous model. But Dave, thanks, as your suggestions led me to look closely at the relevance of that variable.

Jenkins Macedo
jenkins_macedo

Community Trekker

Joined:

Jul 13, 2015

Good thinking. The results are inconclusive at this time, since data collection is ongoing. I posted this preliminary analysis to engage folks like you for some positive feedback. Indeed, the R-Square value describes the portion of variation that the model explains, and one should not rely exclusively on the R-Square value to conclude that weight loss is solely a function of "Days In Action." Once I complete the data collection process, I will conduct a scientific analysis of the model and explain the results using various statistical tests, the parameter estimates, the R-Square value, and the P and F values to reach my conclusions. In no way did I take your comments negatively. In fact, that is exactly why I posted the initial analysis: to see what other experts have to say. This is something I started just to encourage myself to kick-start some weight loss after bombardments from my wife. But as much as I want to do this, I have to undertake it in such a way that it either validates what others have found or leads to new knowledge. I do take the daily average of my AM weight and PM weight, which leads to the "Adj Weight Loss (kg)". I will keep you folks posted.

Jenkins Macedo
Peter_Bartell

Joined:

Jun 5, 2014

Jenkins: Whew...I was beginning to think I'd have to give you my rant about the dreaded disease "Mononumerosis", which is my diagnosis when individuals use a single statistic, in isolation, to make broad and sweeping conclusions from analytical results. My unscientific survey suggests to me that R**2 is one of the most insidious carriers of this dreaded analytics disease. I can't count the number of times I've seen people look at R**2 and ignore all the other regression-based model fitting diagnostic aids...F-ratios, p-values, actual response vs. factors plots, residual plots, and on and on. If R**2 is number one, then Cp and Cpk are the next two...

jenkins_macedo

Community Trekker

Joined:

Jul 13, 2015

Hi Peter,

I agree that to draw a comprehensive conclusion from statistical results, one has to take into consideration all the relevant statistics presented; hence my emphasis that the R-Square value should not be used as the only measure of your result. I agree with the diagnostics you mention. The result posted here is inconclusive, and results from other models suggest otherwise.

Jenkins Macedo