I am a fourth-year undergraduate student working on my honors thesis, and I have run into an issue with calculating a pooled kappa statistic in JMP.

While I understand that Cohen's kappa is normally calculated between two raters who provide ratings in a 2x2 agreement table format, what I want to do is calculate a measure of agreement between two groups of raters. The first group, highlighted here, provided ratings on a Likert scale of 1-3 (0-3 for the last criterion) for 11 criteria ("C Attn R, tCLOppR...") as indicated along the column headers. This scale corresponds to the frequency of a behavior, with 0=Unavailable, 1=not consistently used, 3=consistently used throughout the session.

Group 1

The second, different group of raters also provided their ratings for the same 11 non-dichotomous criteria on equivalent rating scales.
Group 2
There are 117 raters in each of the two groups. It is simple enough to use the Fit Y by X platform to calculate 11 pairwise kappas and average them, but I doubt that this is the proper procedure for determining the overall agreement between two groups of raters (on top of having multiple rating criteria/categories). I would appreciate any advice or suggestions on using JMP for this type of analysis.

There are other platforms in JMP that you can use for analysis of agreement. Please check out the example at http://www.jmp.com/support/help/13/Example_of_an_Attribute_Gauge_Chart.shtml#304555 that involves multiple raters and multiple items (i.e., parts). Judging from the data you showed, you would need to restructure your table so that the questions are rows and the ratings are columns.
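To illustrate what that restructuring means, here is a minimal sketch in Python rather than JMP (inside JMP, Tables > Transpose would be the likely route). The function name and the rating values are hypothetical; only the two criterion names come from your column headers.

```python
def to_question_rows(criteria, rater_rows):
    """Turn one-row-per-rater data into one-row-per-question data.

    criteria   -- list of criterion (question) names, in column order
    rater_rows -- list of rows, one per rater, each holding one rating per criterion
    """
    for row in rater_rows:
        if len(row) != len(criteria):
            raise ValueError("each rater row needs one rating per criterion")
    # zip(*rows) flips rows and columns, so each tuple holds all ratings for one question
    return {name: list(col) for name, col in zip(criteria, zip(*rater_rows))}


# Hypothetical ratings from three raters on two of the criteria
criteria = ["C Attn R", "tCLOppR"]
rater_rows = [
    [1, 3],
    [2, 3],
    [3, 1],
]
by_question = to_question_rows(criteria, rater_rows)
```

Each key of `by_question` is then one row of the restructured table, and each position in its list is one rating column.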

As for assessing overall agreement between the two groups of raters, I would suggest that you sum the ratings across raters within each group and then calculate the kappa on the group ratings.
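Once each group has been reduced to one rating per item, the kappa step itself is short. The sketch below is plain Python, not JMP output, and computes unweighted Cohen's kappa for two paired lists of categorical ratings; the function name and the example ratings are made up for illustration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two paired lists of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("need two non-empty lists of equal length")
    n = len(ratings_a)
    # Observed agreement: share of items where both sides gave the same rating
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the two marginal proportions, summed over categories
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # both sides used a single identical category; kappa is undefined
    return (observed - expected) / (1.0 - expected)


# Hypothetical group-level ratings for six items
group1 = [1, 1, 2, 2, 3, 3]
group2 = [1, 1, 2, 2, 3, 3]
kappa_identical = cohen_kappa(group1, group2)

# Hypothetical case where agreement is exactly at chance level
kappa_chance = cohen_kappa([1, 1, 2, 2], [1, 2, 1, 2])
```

Because your scale is ordered, a weighted kappa might suit the data better; this sketch treats the categories as unordered.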

