Now I understand. The ANOVA assumes that the random errors are normally distributed. So let's not use ANOVA!
There is a simple way around all of your issues. First of all, the length of a call is generally not normally distributed. It is a measure of the life of the call, or simply lifetime data. More generally, it is time-to-event data (the event is the end of the call). You want to use Analyze > Reliability and Survival > Life Distribution instead of ANOVA:

1. Select the Compare Groups tab at the top of the launch dialog.
2. Select the column with the call-length values and click Y, Time to Event.
3. Select the column with the values that identify the agents and click Grouping.
4. Click Go.

(Big assumption for now, just to get things going: all the calls were completed. That is, none of the call-length observations represent incomplete calls. That situation is known as censoring, and we can deal with it properly later if necessary.)
I suggest using the Weibull distribution model and the Weibull scale. Click the checkbox before Weibull and the radio button after Weibull. You are going to get a lot of information back.
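If it helps to see the mechanics outside of JMP, here is a minimal sketch (Python with scipy, on invented call data) of the kind of per-agent Weibull fit that a lifetime analysis performs. The agent names, sample sizes, and parameter values are all made up for illustration; this is not JMP's internal computation.

```python
# Sketch (not JMP): fit a two-parameter Weibull to call durations per agent.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical call durations in minutes for two agents
calls = {
    "Agent A": rng.weibull(1.5, 200) * 8.0,   # shape 1.5, scale 8
    "Agent B": rng.weibull(1.5, 200) * 10.0,  # same shape, larger scale
}

fits = {}
for agent, durations in calls.items():
    # Location fixed at 0, as is standard for lifetime data
    shape, loc, scale = stats.weibull_min.fit(durations, floc=0)
    fits[agent] = (shape, scale)
    print(f"{agent}: shape = {shape:.2f}, scale = {scale:.2f}")
```

With enough calls per agent, the fitted shape and scale should land close to the values used to simulate the data.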
First, the plot at the top of Compare Distributions is useful for visually assessing goodness of fit and differences between agents. Second, the Summary report informs you about each agent. Open and examine the Wilcoxon Group Homogeneity Test. Its null hypothesis is that all the agents share the same distribution, so it is significant if any agents differ. Third, there is a tab with an analysis of each agent. Each report is detailed and specific; you might not need or want all of the information.
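As a rough stand-in for that omnibus test, the Kruskal-Wallis test (the k-group generalization of the Wilcoxon rank-sum test) is widely available and tests the same kind of hypothesis: all groups come from the same distribution. The data below are invented, and this is only an analog, not JMP's exact computation.

```python
# Sketch: a rank-based omnibus test across agents (Kruskal-Wallis).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.weibull(1.5, 150) * 8.0   # hypothetical agent 1
b = rng.weibull(1.5, 150) * 8.0   # agent 2, same distribution as agent 1
c = rng.weibull(1.5, 150) * 12.0  # agent 3, clearly longer calls

stat, p = stats.kruskal(a, b, c)
print(f"H = {stat:.1f}, p = {p:.3g}")
```

A small p-value says only that at least one agent differs somewhere; it does not say which agent or in what way.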
I will stop here and see if we are going in a good direction and if you have further questions.
To add a bit to @markbailey's recommended solution, I'll throw in another issue. What about focusing on the mean (or have you thought about the median instead?) AND the variance/spread? Recall Jack Welch's famous quote, paraphrasing: "Customers rarely experience the mean...they feel and experience the variance." So if your ultimate goal is to improve or make the AVERAGE contact length more consistent, I encourage you to also think about minimizing/reducing the VARIANCE of contact length, since that is what drives improved customer satisfaction.
Very true and a good comment. I usually make it a point to look at both the medians and the spread of the data to determine if the process is in control. Thanks for the feedback.
You can use the profilers in Life Distribution to get more than the mean or median. You can estimate any quantile (time) or probability you like.
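To illustrate what those profilers report, once you have fitted Weibull parameters you can read off any quantile (a time) or any probability directly from the distribution. The parameter values below are hypothetical, standing in for one agent's fit; this is a sketch of the calculation, not the JMP profiler itself.

```python
# Sketch: quantile and probability from a fitted Weibull.
from scipy import stats

shape, scale = 1.5, 8.0  # hypothetical fitted values for one agent

# 90th percentile: 90% of this agent's calls finish within this many minutes
t90 = stats.weibull_min.ppf(0.90, shape, scale=scale)

# Probability a call is still in progress after 15 minutes
p15 = stats.weibull_min.sf(15, shape, scale=scale)

print(f"90th percentile: {t90:.2f} min, P(call > 15 min) = {p15:.3f}")
```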
Then if you want to really go crazy, if you have access to written transcriptions of calls (say in a .txt file) AND JMP Pro...then a whole world of Text Analytics and Predictive/Exploratory modeling work is at your fingertips. With JMP Pro you can analyze the free-form text of agents' conversations using simple word/phrase counts up to and including latent class analysis, topic analysis, and latent semantic analysis for exploration. From there, between the document term matrix and other dimensionality-reduction methods, it's a short leap over to the Generalized Regression platform and its quantile regression capabilities for modeling text (or its surrogates) against contact-time quantiles, for the median or, say, the 95th quantile. Now you have a link between words and talk time! And if you have customer satisfaction scores with respect to an engagement, you can model these as well. Here's a link to a Mastering JMP event that illustrates much of this workflow:
https://www.jmp.com/en_us/events/ondemand/mastering-jmp/using_text_explorer_to_extend_analysis.html
Thanks for the advice and instructions. I have never used this platform, but after reading more about it I see how it could be very useful in this situation. However, I am having some difficulty reading the output. I see under the Wilcoxon Group Homogeneity Test that the p-value is <.0001, so the groups are definitely not homogeneous. With ANOVA I usually look at pairwise comparisons, but I am struggling with how to compare the Weibull fits. As you mentioned, I can visually assess the goodness of fit under Compare Distributions, but is there a way I can statistically determine the difference (like a p-value) as we do with ANOVA?
You are correct, there is no analog to the choices for multiple comparisons as found in the Oneway platform.
The Wilcoxon test is an omnibus indicator of any difference. It is not specific to one parameter like the mean or variance. The plot at the top can help there, though. Parallel lines have the same variance or scale. Displaced lines have different mean or location. So if one agent is consistently completing their calls more quickly, their curve would shift to the left.
You also have the parameter point estimates and confidence intervals for each group (agent) for comparison, although that information is not the same as a multiple comparison test.
You can also use the profilers to extract information about each group. These answers are provided both as a point estimate and interval estimate.
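One way to see where interval estimates like those come from is a bootstrap on a fitted parameter. JMP's intervals are likelihood-based rather than bootstrap, so this is only an illustration of the idea, on invented data for a single hypothetical agent.

```python
# Sketch: bootstrap interval for one agent's Weibull scale parameter.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
durations = rng.weibull(1.5, 200) * 8.0  # hypothetical agent, true scale 8

def fit_scale(x):
    shape, loc, scale = stats.weibull_min.fit(x, floc=0)
    return scale

# Refit the model on resampled data to see how much the estimate varies
boot = [fit_scale(rng.choice(durations, size=durations.size, replace=True))
        for _ in range(300)]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"scale 95% interval: ({lo:.2f}, {hi:.2f})")
```

Non-overlapping intervals for two agents are informal evidence of a difference, though, as noted above, that is not the same as a formal multiple-comparison test.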
I am not apologizing but simply recognizing that the methodology here comes from the reliability engineering field. The same methods were independently discovered in medical mortality and morbidity studies. The terminology, therefore, pertains to those fields, but the methods are nonetheless relevant; it just requires a bit of translation. Sometimes it also requires reversing the goals. In reliability, an increasing hazard function is bad. In your case, though, it is good: it means that an event is more likely to happen as time passes, and in your case an event is not a failure but a completed call.
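To make the hazard idea concrete: for a Weibull, the hazard is the density divided by the survival function, and it is increasing whenever the shape parameter exceeds 1. The parameter values below are hypothetical stand-ins for one agent's fit.

```python
# Sketch: the Weibull hazard function h(t) = pdf(t) / sf(t).
# For a Weibull this equals (shape/scale) * (t/scale)**(shape - 1),
# which is increasing in t whenever shape > 1.
import numpy as np
from scipy import stats

shape, scale = 1.5, 8.0  # hypothetical fitted values; shape > 1
t = np.array([2.0, 5.0, 10.0])

hazard = (stats.weibull_min.pdf(t, shape, scale=scale)
          / stats.weibull_min.sf(t, shape, scale=scale))
print(hazard)
```

An increasing hazard here means the longer a call has lasted, the more likely it is to end in the next instant, which is exactly the "good" direction for call handling.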
There are analogous methods for regression models with time to event data. So if you had covariates, you could include them in the model for lifetime and test them. There is a lot of flexibility here.
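To sketch what such a regression looks like, here is a minimal parametric time-to-event model fit by maximum likelihood: call durations are exponential with a mean that depends on one covariate. The covariate, coefficients, and data are all invented, and this is a bare-bones illustration, not the machinery JMP uses.

```python
# Sketch: exponential time-to-event regression with one covariate, fit by MLE.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 500
x = rng.integers(0, 2, n).astype(float)  # hypothetical covariate (e.g., call type)
true_b0, true_b1 = np.log(8.0), 0.5      # mean 8 min at x=0, ~65% longer at x=1
t = rng.exponential(np.exp(true_b0 + true_b1 * x))

def negloglik(beta):
    mu = np.exp(beta[0] + beta[1] * x)   # model: mean lifetime = exp(b0 + b1*x)
    return np.sum(np.log(mu) + t / mu)   # negative exponential log-likelihood

res = minimize(negloglik, x0=[0.0, 0.0])
b0, b1 = res.x
print(f"b0 = {b0:.2f} (true {true_b0:.2f}), b1 = {b1:.2f} (true {true_b1:.2f})")
```

A significant coefficient on the covariate would mean that factor shifts expected call length, which is the time-to-event analog of testing a model term.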
Thanks - this helps a lot. It definitely helps me with what I am trying to do. One last question: if the variances were equal and the population were normally distributed (or my sample size were sufficiently large), I could have used an ANOVA as I have in the past, correct? I just want to make sure I am not using the wrong tool for the job.
Yes.
If the variances were unequal but the errors were normally distributed, then you could use the Welch ANOVA, which JMP automatically provides if you select Unequal Variances from the Oneway platform menu (red triangle).
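For reference, the statistic behind the Welch ANOVA can be written out by hand. The sketch below implements the Welch (1951) formula on invented normal data with unequal spreads; it is an illustration of the test JMP reports, not JMP's code. A handy sanity check: with exactly two groups, Welch's ANOVA reduces to the square of Welch's two-sample t-test.

```python
# Sketch: Welch's heteroscedastic one-way ANOVA, implemented directly.
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's (1951) one-way ANOVA for unequal variances: (F, df1, df2, p)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                              # precision weights
    W = w.sum()
    mw = (w * m).sum() / W                 # weighted grand mean
    A = (w * (m - mw) ** 2).sum() / (k - 1)
    tmp = ((1 - w / W) ** 2 / (n - 1)).sum()
    F = A / (1 + 2 * (k - 2) / (k ** 2 - 1) * tmp)
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * tmp)
    return F, df1, df2, stats.f.sf(F, df1, df2)

rng = np.random.default_rng(4)
a = rng.normal(10, 1, 40)   # hypothetical groups with unequal spreads
b = rng.normal(10, 3, 60)
c = rng.normal(12, 2, 50)
F, df1, df2, p = welch_anova(a, b, c)
print(f"F = {F:.2f}, df = ({df1}, {df2:.1f}), p = {p:.3g}")
```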
Life Distribution is the right tool in this case, as far as I can tell.