TCM
Level IV

What technique do I use to justify eliminating measurements at a specific timepoint in a series?

I have measurements of an attribute at initial, 2hr, 4hr, and 6hr. I want to drop the 4hr measurements because they do not add much value to the business application. The measurements exhibit a negative slope from initial to final... What technique or rationale might I use to justify eliminating the 4hr measurement? (I was thinking along the lines of Information Gain, but I admit I don't know much about it beyond the context of Decision Trees.) I appreciate any guidance.

[Attached image: timed series.jpg]

3 REPLIES
P_Bartell
Level VIII

Re: What technique do I use to justify eliminating measurements at a specific timepoint in a series?

IMO the technique or rationale is not a statistical methods question but one driven by knowledge of the process, data, and problem at hand. If the 4hr data point adds no value, and eliminating it is not unethical or contrary to an accepted application standard or process rationale, then just use the JMP exclude/hide capability to eliminate those observations from any analysis or reports.

TCM
Level IV

Re: What technique do I use to justify eliminating measurements at a specific timepoint in a series?

Thank you for your response.  To be clear, it is not the 4hr result that I am eliminating, but the analysis altogether so that going forward, we measure only at initial, at 2hrs and at 6hrs.

 

I can convince myself to not do the 4hr measurement, but as the data scientist, I think I would need to advance a technical rationale for the action with the domain subject experts.

 

Teresa

ih
Super User (Alumni)

Re: What technique do I use to justify eliminating measurements at a specific timepoint in a series?

I agree with @P_Bartell that this is more a business question than a statistical question. I am guessing this data is used to make a decision. If so, you could do this:

 

  • Using all of the time points, work out what decision you should have made in each instance in history (this might, or perhaps should, match what actually happened).
  • Remove the 4hr point and pretend it was never taken. Adjust your procedure to use only the remaining points. Now retroactively work out what decision you would have made for each instance in history.


Were any different decisions made? If so, what was the impact of those different decisions? This could be an actual cost from customer complaints, rework, or reduced capacity, or it could be a change in risk. What would be saved by removing the additional test? Someone in the business then needs to decide whether that test is worth the cost.
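As a rough illustration of the retroactive comparison described above, here is a minimal Python sketch. Everything specific in it is an assumption made for the example: the run names and measurement values are made up, and the decision rule (fail a run when its least-squares slope is steeper than a limit) is a hypothetical stand-in for whatever rule the business actually applies. Swap in the real historical data and the real decision procedure before drawing any conclusions.

```python
from typing import Dict, List, Sequence


def ls_slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return sxy / sxx


def decide(times: Sequence[float], values: Sequence[float],
           slope_limit: float) -> str:
    """Hypothetical decision rule: fail when the decline is steeper than the limit."""
    return "fail" if ls_slope(times, values) < slope_limit else "pass"


def compare_decisions(runs: Dict[str, List[float]], times: Sequence[float],
                      drop_time: float, slope_limit: float) -> List[str]:
    """Return the IDs of runs whose decision changes when drop_time is omitted."""
    keep = [i for i, t in enumerate(times) if t != drop_time]
    reduced_times = [times[i] for i in keep]
    changed = []
    for run_id, values in runs.items():
        full = decide(times, values, slope_limit)
        reduced = decide(reduced_times, [values[i] for i in keep], slope_limit)
        if full != reduced:
            changed.append(run_id)
    return changed


# Made-up example data: measurements at initial, 2hr, 4hr, and 6hr.
TIMES = [0.0, 2.0, 4.0, 6.0]
RUNS = {
    "run_A": [100.0, 96.0, 92.0, 88.0],
    "run_B": [100.0, 97.0, 88.0, 90.0],
    "run_C": [100.0, 99.0, 98.0, 97.0],
}

# The slope limit of -1.9 units per hour is an arbitrary illustrative threshold.
changed = compare_decisions(RUNS, TIMES, drop_time=4.0, slope_limit=-1.9)
print(f"{len(changed)} of {len(RUNS)} decisions changed: {changed}")
```

With this made-up data, only run_B changes: its 4hr reading pulls the fitted slope past the limit, so it fails with all four points and passes without the 4hr point. The runs that change are the ones whose business impact (cost, rework, risk) then needs to be weighed against the savings from skipping the 4hr test.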