Comparing means for implicit data in JMP

Jan 8, 2014 1:02 PM

Hi all,

I have a data table with columns showing the frequency of events (in my case, skiing accidents in different regions of the United States). The first column contains the region names. I want to compare the means of Males and Females (two further columns in the data table) who had skiing accidents, irrespective of region. I know Fit Y by X can do this, but I don't have a Y, as all the numbers I have represent the number of cases. Is there a way I can compare means between Males and Females?

Thanks in advance.

Mike

1 ACCEPTED SOLUTION

Jan 9, 2014 1:10 PM

Hi Mike,

You need to use **Tables** -> **Stack** to get a data table similar to mine with a column for Region, Gender and Accidents. Then you can use **Fit Y by X** to get your analysis.

Start with your table and in the Stack dialog add the Male and Female columns to Stack Columns and then name the Stacked Data Column "Accidents" and the Source Label Column "Gender".

You'll get a stacked table with one row for each Region and Gender combination.

Then you can use **Fit Y by X** with **Gender** as your X and **Accidents** as your Y.

You'll find the analysis options under the Red Triangle at the top.

Let us know how you make out.

-Jeff


6 REPLIES

Jan 9, 2014 12:02 PM

I think you're saying that your data looks something like this:

Mountain | Gender | Accidents |
---|---|---|
Snowmass | Male | 5 |
Alta | Male | 10 |
Breckenridge | Male | 5 |
Copper Basin | Male | 4 |
Snowmass | Female | 7 |
Alta | Female | 2 |
Breckenridge | Female | 9 |
Copper Basin | Female | 7 |

I'm not quite clear on the analysis you want to do but you can use the **Accidents** column in the Freq role to indicate that this column is a count of how many times this row occurs.
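As a rough illustration of the Freq role, here is a minimal JSL sketch built on the example layout above. The choice of the Distribution platform is an assumption made for illustration; the post does not say which analysis to run.

```jsl
// Example layout from the post: one row per mountain and gender,
// with Accidents holding the count for that row
dt = New Table( "Accidents by Mountain",
	Add Rows( 8 ),
	New Column( "Mountain", Character, Nominal,
		Set Values(
			{"Snowmass", "Alta", "Breckenridge", "Copper Basin",
			"Snowmass", "Alta", "Breckenridge", "Copper Basin"}
		)
	),
	New Column( "Gender", Character, Nominal,
		Set Values(
			{"Male", "Male", "Male", "Male",
			"Female", "Female", "Female", "Female"}
		)
	),
	New Column( "Accidents", Numeric, Continuous,
		Set Values( [5, 10, 5, 4, 7, 2, 9, 7] )
	)
);

// Freq( :Accidents ) tells the platform that each row stands for
// that many observations, not a single one.
// (Distribution is an assumed choice of platform for this sketch.)
dt << Distribution(
	Freq( :Accidents ),
	Nominal Distribution( Column( :Gender ) )
);
```

With the Freq role set, the report counts accidents per gender, not rows per gender.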

Let me know if I've misinterpreted how your data is laid out, and if you can clarify what question you're trying to answer we can try to point you to the appropriate analysis.

-Jeff


Jan 9, 2014 12:28 PM

Thanks, Jeff. I apologize for the lack of clarity in the question. My data looks something like this. All the numbers inside the table represent the number of accidents reported.

Region | Male | Female |
---|---|---|
Alaska | 25 | 5 |
Wisconsin | 10 | 4 |
Illinois | 5 | 3 |
NYC | 2 | 2 |
Detroit | 9 | 1 |
Jersey city | 15 | 0 |

Now, I would like to see if the mean of male accidents is significantly different from the mean of female accidents, irrespective of region. The problem is that I don't have a Y variable, because the accident counts are implicitly embedded in the table; in other words, I don't have an explicit "accidents" column. I hope I am a little clearer this time :). I am an engineer by profession, so apologies for my illiteracy in stats.

Mike.

Jan 9, 2014 12:54 PM

With that layout you can try the *Matched Pairs* analysis platform (add the Male and Female columns as Y). However, you'll have more options if you stack the columns (*Stack* in the *Tables* menu). Then you can use the Fit Y by X platform to compare means, with Gender as X and the count data as Y. The variance appears higher for males, so you may want to look at a nonparametric method; these are found in the red triangle menu in the Fit Y by X results window.

Here's an example script that does the above (paste into a script window and hit run!):

    // Example table
    dt = New Table( "Accidents",
        Add Rows( 6 ),
        New Column( "Region",
            Character,
            Nominal,
            Set Values( {"Alaska", "Wisconsin", "Illinois", "NYC", "Detroit", "Jersey city"} )
        ),
        New Column( "Male",
            Numeric,
            Continuous,
            Set Values( [25, 10, 5, 2, 9, 15] )
        ),
        New Column( "Female",
            Numeric,
            Continuous,
            Set Values( [5, 4, 3, 2, 1, 0] )
        )
    );

    // Stack table
    dt_stacked = dt << Stack(
        columns( :Male, :Female ),
        Source Label Column( "Gender" ),
        Stacked Data Column( "N Accidents" )
    );

    // Compare means
    dt_stacked << Oneway( Y( :N Accidents ), X( :Gender ), t Test( 1 ), Wilcoxon Test( 1 ) );
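The reply also mentions the Matched Pairs platform and notes that the male counts look more variable. A minimal sketch of both ideas follows, reusing the thread's example values. Adding the `Unequal Variances` option is an assumption about what might be useful here and is not part of the original script.

```jsl
// Same example values as in the thread
dt = New Table( "Accidents",
	Add Rows( 6 ),
	New Column( "Region", Character, Nominal,
		Set Values( {"Alaska", "Wisconsin", "Illinois", "NYC", "Detroit", "Jersey city"} )
	),
	New Column( "Male", Numeric, Continuous, Set Values( [25, 10, 5, 2, 9, 15] ) ),
	New Column( "Female", Numeric, Continuous, Set Values( [5, 4, 3, 2, 1, 0] ) )
);

// Paired comparison on the original (wide) layout:
// each region contributes one Male/Female pair
dt << Matched Pairs( Y( :Male, :Female ) );

// Stacked layout, as in the script above
dt_stacked = dt << Stack(
	columns( :Male, :Female ),
	Source Label Column( "Gender" ),
	Stacked Data Column( "N Accidents" )
);

// Assumed addition: request the unequal-variance report, which tests
// whether the group variances differ and includes a Welch test
dt_stacked << Oneway( Y( :N Accidents ), X( :Gender ), Unequal Variances( 1 ) );
```

Matched Pairs treats each region as a pair, while Oneway on the stacked table treats the two genders as independent groups, so the two reports answer slightly different questions.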

Jan 9, 2014 1:04 PM

Wouldn't you just be comparing two numbers then, the total number of males vs females, if you're not interested in region?

This is a flawed analysis though, because what you really need is the number of accidents per skier-day. Otherwise busier hills will always have more accidents, and since more males generally ski, there will be more male accidents.
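To make the "two numbers" point concrete, here is a small JSL sketch that totals each gender's column from the example values posted earlier in the thread. It deliberately stops at the totals: the thread contains no skier-day (exposure) data, so no rate is computed.

```jsl
// Example values from the thread (wide layout)
dt = New Table( "Accidents",
	Add Rows( 6 ),
	New Column( "Region", Character, Nominal,
		Set Values( {"Alaska", "Wisconsin", "Illinois", "NYC", "Detroit", "Jersey city"} )
	),
	New Column( "Male", Numeric, Continuous, Set Values( [25, 10, 5, 2, 9, 15] ) ),
	New Column( "Female", Numeric, Continuous, Set Values( [5, 4, 3, 2, 1, 0] ) )
);

// Ignoring region reduces the comparison to two totals
male_total = Col Sum( dt:Male );     // 25 + 10 + 5 + 2 + 9 + 15 = 66
female_total = Col Sum( dt:Female ); // 5 + 4 + 3 + 2 + 1 + 0 = 15
Show( male_total, female_total );
```

Turning these totals into comparable rates would need an exposure column, such as skier-days per region and gender, which would have to come from another source.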

Jan 9, 2014 2:36 PM

Jeff,

Thanks for the input. I was not aware of such a powerful and robust command!

Mike

Reeza,

Thanks for the input. Agreed, the data must be normalized to a characteristic quantity to get a sensible prediction. Thanks.

Mike
