Staff (Retired)

Re: Does k means sample size estimate require a normal distribution?

Hi, all. Sorry to jump into the conversation.


This is what I am hearing...


Twoolman has data from insurance companies covering a number of surgeons, but the data may not represent all the surgeries a surgeon performs. Each insurance company provides only the subset of surgeries that it covers.


So it sounds as if the number of observations is fixed, and it sounds as if Twoolman is asking whether the k means power calculation can be used as a way to exclude certain physicians based on having too few observations.


In general, I probably would not do this. Power calculations are meant for planning an experiment to detect a particular delta. As this delta gets smaller (or the variability increases), we need a larger sample per group; in the reverse scenario, we need fewer. Prospectively, we would try to collect enough surgeries to meet the criteria of the calculation. However, given that the "experiment" is already completed, we are taking the observations as they are. I wouldn't exclude surgeons with a small number of surgeries just because they may be quite different from other surgeons.
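
To make the relationship between delta, variability, and sample size concrete, here is a rough sketch in Python. It is not the JMP k means calculation: it assumes just two groups, a known common standard deviation, and the usual normal-approximation formula n = 2 * ((z_alpha + z_beta) * sigma / delta)^2. The function name `n_per_group` and the defaults (alpha = 0.05, power = 0.80) are my own choices for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group
    comparison of means (normal approximation, known common sigma).

    Assumption: this is a simplified two-group stand-in, not the
    k-group calculation that JMP performs.
    """
    z = NormalDist()  # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = z.inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Halving delta, or doubling sigma, roughly quadruples the required n.
for delta, sigma in [(1.0, 1.0), (0.5, 1.0), (1.0, 2.0)]:
    print(f"delta={delta}, sigma={sigma}: n per group = {n_per_group(delta, sigma)}")
```

The point is only that the required n is something you solve for before collecting data. It says nothing about which already-observed surgeons to drop.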


However, one has to be careful here. Comparisons (assuming patients somehow arrive randomly to the set of surgeons analyzed) may identify statistical differences between the surgeons, but the question then becomes: are these differences clinically meaningful? This is often a challenge for many endpoints. We may identify statistical differences between groups but have no idea whether the differences we have found are of any practical importance.
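
As a small illustration of statistical versus practical significance, the sketch below holds a tiny difference in means fixed and only changes the number of observations per group. It assumes a two-sample z-test with a known common standard deviation; the function name `two_sample_z_pvalue` and the numbers are made up for illustration and are not from Twoolman's data.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_pvalue(diff, sigma, n_per_group):
    """Two-sided p-value for a difference in means between two groups
    of equal size (z-test, known common sigma; an assumed simplification)."""
    se = sigma * sqrt(2 / n_per_group)  # standard error of the difference
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same small difference (0.1 standard deviations), very different sample sizes.
p_small_n = two_sample_z_pvalue(diff=0.1, sigma=1.0, n_per_group=50)
p_large_n = two_sample_z_pvalue(diff=0.1, sigma=1.0, n_per_group=5000)
print(f"n=50 per group:   p = {p_small_n:.3f}")
print(f"n=5000 per group: p = {p_large_n:.2e}")
```

With enough observations the same 0.1 SD difference becomes "significant", but whether 0.1 SD matters clinically is a separate question that the p-value cannot answer.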


Mark does point out that these data are observational, so there are likely a number of additional factors that a simple comparison between the surgeons would not account for and that would need to be controlled for. This could involve covariates in a model, or the use of propensity scores. I am not sure if JMP has any features for propensity scores, but SAS has two new procedures described here.