
Comparing means for implicit data in JMP

mikethejumper

Community Trekker

Joined:

Jan 8, 2014

Hi all,

I have a data table with columns showing the frequency of events (in my case, skiing accidents in different regions of the United States). The first column contains the region names. I want to compare the means of Males and Females (two other columns in the data table) who had skiing accidents, irrespective of region. I know Fit Y by X can do this, but I don't have a Y, as all the numbers I have represent the number of cases. Is there a way I can compare means between Males and Females?

Thanks in advance.

Mike

1 ACCEPTED SOLUTION

Jeff_Perkinson

Community Manager

Joined:

Jun 23, 2011

Solution

Hi Mike,

You need to use Tables -> Stack to get a data table similar to mine with a column for Region, Gender and Accidents. Then you can use Fit Y by X to get your analysis.

Start with your table and in the Stack dialog add the Male and Female columns to Stack Columns and then name the Stacked Data Column "Accidents" and the Source Label Column "Gender".

4724_Stack-4.png

You'll get a table like this.

4725_Stacked Data.png
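For readers who want to see what the Stack step does to the layout, here is a minimal sketch in plain Python (not JSL). It uses the counts from Mike's table in this thread; the `stack` helper and its parameter names are hypothetical and only mimic the wide-to-long reshape, not JMP's actual implementation.

```python
# Wide layout: one row per region, one count column per gender
# (values copied from the table Mike posted in this thread).
wide = [
    {"Region": "Alaska", "Male": 25, "Female": 5},
    {"Region": "Wisconsin", "Male": 10, "Female": 4},
    {"Region": "Illinois", "Male": 5, "Female": 3},
    {"Region": "NYC", "Male": 2, "Female": 2},
    {"Region": "Detroit", "Male": 9, "Female": 1},
    {"Region": "Jersey city", "Male": 15, "Female": 0},
]


def stack(rows, columns, label="Gender", value="Accidents"):
    """Hypothetical helper: emit one long row per (wide row, stacked column)."""
    long_rows = []
    for row in rows:
        for col in columns:
            long_rows.append({"Region": row["Region"], label: col, value: row[col]})
    return long_rows


stacked = stack(wide, ["Male", "Female"])
for r in stacked[:4]:
    print(r)
```

Each region now contributes two rows, one per gender, so Gender can serve as the X variable and Accidents as the Y variable.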

Then you can use Fit Y by X with Gender as your X and Accidents as your Y.

4726_untitled_5__Fit_Y_by_X_of_Accidents_by_Gender.png

You'll find the analysis options under the Red Triangle at the top.

Let us know how you make out.

-Jeff

6 REPLIES
Jeff_Perkinson

Community Manager

Joined:

Jun 23, 2011

I think you're saying that your data looks something like this:

Mountain       Gender   Accidents
Snowmass       Male     5
Alta           Male     10
Breckenridge   Male     5
Copper Basin   Male     4
Snowmass       Female   7
Alta           Female   2
Breckenridge   Female   9
Copper Basin   Female   7

I'm not quite clear on the analysis you want to do but you can use the Accidents column in the Freq role to indicate that this column is a count of how many times this row occurs.
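As a rough illustration of what a frequency column means, the sketch below (plain Python, not JSL) treats each row of the example table above as if it occurred `freq` times and tallies the totals by gender. This only shows the idea of frequency weighting; it is not a description of how JMP implements the Freq role.

```python
from collections import Counter

# Rows from the example table above: (mountain, gender, accident count).
rows = [
    ("Snowmass", "Male", 5),
    ("Alta", "Male", 10),
    ("Breckenridge", "Male", 5),
    ("Copper Basin", "Male", 4),
    ("Snowmass", "Female", 7),
    ("Alta", "Female", 2),
    ("Breckenridge", "Female", 9),
    ("Copper Basin", "Female", 7),
]

# Frequency weighting: a row with count k is treated as k identical observations.
totals = Counter()
for mountain, gender, freq in rows:
    totals[gender] += freq

n = sum(totals.values())
for gender, count in totals.items():
    print(gender, count, round(count / n, 3))
```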

Let me know if I've misinterpreted how your data is laid out, and if you can clarify what question you're trying to answer we can try to point you to the appropriate analysis.

-Jeff

mikethejumper

Community Trekker

Joined:

Jan 8, 2014

Thanks, Jeff. I apologize for the lack of clarity in the question. My data looks something like this. All the numbers in the table represent the number of accidents reported.

Region        Male   Female
Alaska        25     5
Wisconsin     10     4
Illinois      5      3
NYC           2      2
Detroit       9      1
Jersey city   15     0

Now I would like to see whether the mean number of male accidents is significantly different from the mean number of female accidents, irrespective of region. The problem is that I don't have a Y variable: the accident counts are embedded implicitly in the table. In other words, I don't have an explicit "accidents" column. I hope I am a little clearer this time :). I am an engineer by profession, so apologies for my illiteracy in stats.

Mike.

ms

Super User

Joined:

Jun 23, 2011

With that layout you can try the Matched Pairs analysis platform (add the Male and Female columns as Y). However, you'll have more options if you stack the columns (Stack in the Tables menu). Then you can use the Fit Y by X platform to compare means, with Gender as X and the count data as Y. The variance appears higher for males, so you may want to look at the nonparametric methods, which are found in the red triangle menu in the Fit Y by X results window.

Here's an example script that does the above (paste into a script window and hit run!):

// Example table
dt = New Table( "Accidents",
  Add Rows( 6 ),
  New Column( "Region",
    Character,
    Nominal,
    Set Values( {"Alaska", "Wisconsin", "Illinois", "NYC", "Detroit", "Jersey city"} )
  ),
  New Column( "Male",
    Numeric,
    Continuous,
    Set Values( [25, 10, 5, 2, 9, 15] )
  ),
  New Column( "Female",
    Numeric,
    Continuous,
    Set Values( [5, 4, 3, 2, 1, 0] )
  )
);

// Stack table
dt_stacked = dt << Stack(
  columns( :Male, :Female ),
  Source Label Column( "Gender" ),
  Stacked Data Column( "N Accidents" )
);

// Compare means
dt_stacked << Oneway( Y( :N Accidents ), X( :Gender ), t Test( 1 ), Wilcoxon Test( 1 ) );
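To make the arithmetic behind the two comparisons concrete, here is a minimal sketch in plain Python (standard library only, not JSL). It computes an unequal-variance (Welch) two-sample t statistic, on the assumption that this is the form the t Test option corresponds to, and a paired t statistic on the per-region differences, which is the idea behind Matched Pairs. The function names are hypothetical, and p-values are left out because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance

# Counts per region, in the same order as the example table.
male = [25, 10, 5, 2, 9, 15]
female = [5, 4, 3, 2, 1, 0]


def welch_t(a, b):
    """Unequal-variance t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def paired_t(a, b):
    """Paired t statistic on the row-wise differences, with n - 1 degrees of freedom."""
    d = [x - y for x, y in zip(a, b)]
    t = mean(d) / sqrt(variance(d) / len(d))
    return t, len(d) - 1


t_welch, df_welch = welch_t(male, female)
t_paired, df_paired = paired_t(male, female)
print(f"Welch:  t = {t_welch:.3f}, df = {df_welch:.2f}")
print(f"Paired: t = {t_paired:.3f}, df = {df_paired}")
```

The paired version compares Male and Female within each region, while the two-sample version ignores the pairing, so the two statistics generally differ.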

reeza

Community Trekker

Joined:

Jun 23, 2011

If you're not interested in region, wouldn't you just be comparing two numbers, the total number of male accidents versus the total number of female accidents?

This is a flawed analysis though. You really need the number of accidents per skier-day; otherwise busier hills will always have more accidents, and since more males generally ski, there will be more male accidents.
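A minimal sketch of that normalization, in plain Python: the accident totals come from the table in this thread, but the skier-day figures are made-up placeholders, since no exposure data was posted. The `rate_per_1000` helper is likewise hypothetical.

```python
# Accident counts per region from the table in this thread.
male = [25, 10, 5, 2, 9, 15]
female = [5, 4, 3, 2, 1, 0]


def rate_per_1000(accidents, skier_days):
    """Accidents per 1,000 skier-days."""
    return 1000 * accidents / skier_days


total_male = sum(male)
total_female = sum(female)

# HYPOTHETICAL exposure figures, for illustration only (not from the thread).
skier_days_male = 40_000
skier_days_female = 20_000

rate_male = rate_per_1000(total_male, skier_days_male)
rate_female = rate_per_1000(total_female, skier_days_female)
print(total_male, total_female)
print(rate_male, rate_female)
```

With real exposure data, the comparison would be made on the rates rather than on the raw totals.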

mikethejumper

Community Trekker

Joined:

Jan 8, 2014

Jeff,

Thanks for the input. I was not aware of such a powerful and robust command!

Mike

Reeza,

Thanks for the input. Agreed, the data must be normalized to a characteristic quantity to get a sensible prediction. Thanks.

Mike
