francisk
Level I

How to perform simple intraclass correlation to determine interrater reliability?

I'm trying to figure out the best way to complete a relatively simple intraclass correlation (ICC) analysis. In my study, three raters scored 100 performance evaluations on a 1-10 scale for various components of performance.


Each rater scored each performance evaluation once. I ran the Measurement Systems Analysis platform with Rater as "X", Evaluation (1-100) as "Part", and Score as "Y". I got the warning "Not enough data to compute the process standard deviation. Disabling options that require standard deviation," and the EMP results showed an Intraclass Correlation (no bias) value of 1, which is implausible.


I then ran the Measurement Systems Analysis platform with Evaluation (1-100) as "Part" and Score as "Y" (without including Rater as "X") and got plausible correlation values that track the variation across the different performance components. I'm not conducting a reliability study per se; for my substantive analysis I'm simply using the mean performance score across the three raters, so I'm not especially concerned with specific trends among the raters. My questions are:

1) Can I use the ICC output for the second analysis (without Raters) when I report results?

2) If not, what's the best way to calculate ICC with JMP? 

Thanks for any guidance you can provide! 

1 REPLY

Re: How to perform simple intraclass correlation to determine interrater reliability?

I think the first setup is correct, but without replication it is impossible to estimate repeatability, and that is what triggers the warning message you got. Does your EMP result look like this example:

[Image: icc.JPG, an example EMP report showing the Intraclass Correlation results]

If so, notice that the ICC is reported three times: without bias, with bias, and with bias and interaction. All three are correct, but they answer different questions. Is it possible that you have both rater bias and an interaction with the components? If so, use the second or third ICC, not the first one.
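
If you want to sanity-check those numbers outside of JMP, below is a minimal Python sketch using the classic Shrout and Fleiss (1979) two-way random-effects formulas. It assumes your data are in long format with columns named Evaluation, Rater, and Score (those names and the simulated scores at the bottom are placeholders, fabricated only so the example runs). Note that EMP follows Wheeler's variance-ratio definitions, so these values will not necessarily match the EMP report exactly. Since you are averaging the three raters for your substantive analysis, the average-measures ICC(2,k) is the one that corresponds to the reliability of that mean.

import numpy as np
import pandas as pd

def icc_two_way_random(df, part="Evaluation", rater="Rater", y="Score"):
    # ICC(2,1) and ICC(2,k) from a complete parts-by-raters table
    # (one score per rater per evaluation, no replication).
    wide = df.pivot(index=part, columns=rater, values=y).to_numpy()
    n, k = wide.shape                    # n evaluations, k raters
    grand = wide.mean()
    row_means = wide.mean(axis=1)        # per-evaluation means
    col_means = wide.mean(axis=0)        # per-rater means

    # Two-way ANOVA mean squares; with one observation per cell the
    # residual term is the evaluation-by-rater interaction.
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)   # evaluations
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)   # raters
    sse = np.sum((wide - row_means[:, None] - col_means[None, :] + grand) ** 2)
    mse = sse / ((n - 1) * (k - 1))

    # Shrout & Fleiss (1979), two-way random effects, absolute agreement.
    icc_single = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc_average = (msr - mse) / (msr + (msc - mse) / n)
    return icc_single, icc_average

# Fabricated example: 100 evaluations, 3 raters, roughly a 1-10 scale.
rng = np.random.default_rng(1)
true_quality = rng.normal(6, 1.5, 100)
rows = [{"Evaluation": i, "Rater": r,
         "Score": true_quality[i] + rng.normal(0, 0.8)}
        for i in range(100) for r in ["A", "B", "C"]]
single, average = icc_two_way_random(pd.DataFrame(rows))
print(f"ICC(2,1), single rater:     {single:.3f}")
print(f"ICC(2,k), mean of 3 raters: {average:.3f}")

Because each (Evaluation, Rater) pair has exactly one score, the residual mean square doubles as the interaction term; that is the same lack of replication that prevents the EMP platform from separating out pure repeatability.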