cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
GroupSquareWolf
Level III

Weighted mean

GroupSquareWolf_0-1662980821481.png

 

 

In this example, I would like to calculate Column 2 mean based on mean of each Group. Group 3 has two values, and the mean of Group 3 is (3+9)/2= 6. So I would need to get the by-Group column 2 mean : (3+3+(3+9)/2) /3 = 4.

 

When I use Tabulate or Column Mean (:Column2 ) to get the statistics, each individual column2 values were used separately for calculating the mean (3+3+3+9)/4 = 4.5. 

GroupSquareWolf_1-1662980979144.png

 

How to get the correct "by-Group" mean?

 

Thanks

 

 

 

6 REPLIES 6
pauldeen
Level VI

Re: Weighted mean

I do not really understand which means you are trying to calculate? You want to take the mean of Column 2 by Group? I do no tunderstand how subgroup needs to play into this. Trying a couple of things to see if it does what you need:

1. Create a new column and add this formula:

Col Mean( :Column 2, :Group, :Subgroup )

2. if you want that same idea but do not need Subgroup to be part of the by:

Col Mean( :Column 2, :Group)

(same result because of the unfortunate way you choose your example table)

 

To match your example calculation  (3+3+6) /3 = 4 you would do just the Subgroup column but that does not match your words where you talk about Group:

Col Mean( :Column 2, :Subgroup )

I hope that what you are looking for is in here somewhere!

GroupSquareWolf
Level III

Re: Weighted mean

I edited my question to make it clearer. 

Thanks!

 

Victor_G
Super User

Re: Weighted mean

Hello @GroupSquareWolf,

 

If I understand correctly your topic, here is the solution :

  1. Right-click on your column "Group", select "New Formula Column" and then "Group By".
  2. Then right-click on your "Column 2", select "New Formula Column", "Aggregate", "Mean".

JMP should give you mean calculated by the variable you have specified as "GroupBy".
Here by following these two steps, the formula in the column will be the same as the second script option proposed by @pauldeen.

 

Hope this helps you,

 

Victor GUILLER
Scientific Expertise Engineer
L'Oréal - Data & Analytics
txnelson
Super User

Re: Weighted mean

To get what you want, I would calculate the values outside of Tabulate, and then just use Tabulate for the formatting.

txnelson_0-1662987867713.png

Names Default To Here( 1 );
dt = New Table( "Sample",
	Add Rows( 4 ),
	New Column( "Group", Numeric, "Ordinal", Format( "Best", 12 ), Set Values( [1, 2, 3, 3] ) ),
	New Column( "Subgroup", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [1, 1, 1, 2] ) ),
	New Column( "Column 2", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [3, 3, 3, 9] ) )
);

dtSum1 =dt << Summary(
	Group( :Group ),
	Mean( :Column 2 ),
	Freq( "None" ),
	Weight( "None" ),
	statistics column name format( "column" ),
	Link to original data table( 0 )
);

dtSum2 = dtSum1 << Summary(
	Mean( :Column 2 ),
	Freq( "None" ),
	Weight( "None" ),
	statistics column name format( "column" ),
	Link to original data table( 0 )
);
dtSum2:Column 2 << set name( "Column 2_1" );

dtSum1 << Concatenate( dtSum2, Append to First Table(0));

 dtSum1 << Tabulate(
	Show Control Panel( 0 ),
	Add Table(
		Column Table(
			Grouping Columns( :Group ),
			Statistics( Mean ),
			Analysis Columns( :Column 2 )
		),
		Column Table( Analysis Columns( :Column 2_1 ), Statistics( Mean ) )
	)
);
Jim
pauldeen
Level VI

Re: Weighted mean

So you want the mean of "Column 2" by "Group" --> use tables --> summary:

pauldeen_0-1663011180523.png

Then in the new table that opens take a column mean:

pauldeen_1-1663011220064.png

Result:

 

pauldeen_2-1663011235198.png

 

 

GroupSquareWolf
Level III

Re: Weighted mean

Thank you all for your help!!!

The example is a simplified version of what we tried to do with a large data set that has thousands of data points and hundreds of groups. I was hoping that there is a solution without creating an additional data table so that we can keep every thing in the same Data Table, which looks like not an option.