
Pre-evaluated Statistics (Col Mean, Col Median) in Loop

Hi,

 

I am trying to calculate the mean and median of the per-group standard deviation for multiple columns in JSL. The data table I'm using is shown below and attached. I have verified the correct values in the columns to the right of the data, but the JSL output does not match them exactly. The code below correctly calculates the mean of the standard deviation by group for both value columns, but the median is correct only for the first one: the median reported for the second column is exactly the same as the one calculated for the first. I cannot figure out why the mean updates correctly in the loop but the median does not. Any help is appreciated, and please let me know the best way to do this in JSL.

 

[Image: screenshot of the data table, TheCakeIsALie_0-1655913183403.png]

 

JSL:

For( i = 2, i < 4, i++,
	N = Column Name( i );
	Print( "i = " || Char( i ) );
	Print( "Column Name = " || Char( Column Name( i ) ) );
	Print( "Column Mean Std Dev by Group = " || Char( Col Mean( Col Std Dev( N, :Group ) ) ) );
	Print( "Column Median Std Dev by Group = " || Char( Col Median( Col Std Dev( N, :Group ) ) ) );
);

JSL Output:

"i = 2"
"Column Name = Value 1"
"Column Mean Std Dev by Group = 3.9849040980727"
"Column Median Std Dev by Group = 3.60555127546399"
"i = 3"
"Column Name = Value 2"
"Column Mean Std Dev by Group = 5.14008843607314"
"Column Median Std Dev by Group = 3.60555127546399"

 

JMP Pro v16

 

Thanks 

1 ACCEPTED SOLUTION
jthi
Super User

Re: Pre-evaluated Statistics (Col Mean, Col Median) in Loop

You could still loop, but use Summarize instead of the Col functions, which might not work with a byVar inside a loop. This is one option:

 

Names Default To Here(1);

dt = Current Data Table();

col_list = {"Value 1", "Value 2"};

For Each({col_name}, col_list,
	Summarize(dt, groups = By(:Group), v_std = StdDev(Eval(col_name)));
	Show(col_name, v_std);
	Show(Mean(v_std));
	Show(Median(v_std));	
);
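
Since the stated goal is to feed these values to a separate plotting script, here is a minimal sketch of the same loop that stores the results instead of only printing them. The `limits` associative array and the `lim` variable are hypothetical names chosen for illustration, and the sketch assumes the grouping column is called `Group`, as in the question.

```jsl
Names Default To Here(1);

dt = Current Data Table();

col_list = {"Value 1", "Value 2"};

// Hypothetical container: column name -> {mean, median} of the per-group std devs
limits = Associative Array();

For Each({col_name}, col_list,
	// Same Summarize call as above: one standard deviation per group
	Summarize(dt, groups = By(:Group), v_std = StdDev(Eval(col_name)));
	limits[col_name] = Eval List({Mean(v_std), Median(v_std)});
);

// Look up the stored values for one column, e.g. to use as plotting limits
lim = limits["Value 1"];
Show(lim[1], lim[2]);
```

Keying the results by column name lets a later plotting loop retrieve the matching pair without recomputing anything.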

Here is another option that uses a Summary table (with two ways to do the calculations):

Names Default To Here(1);

dt = Current Data Table();

col_list = {"Value 1", "Value 2"};
dt_summary = dt << Summary(
	Group(:Group),
	Std Dev(EvalList({col_list})),
	Freq("None"),
	Weight("None"),
	statistics column name format("column"),
	Link to original data table(0),
	invisible
);
For Each({col_name}, col_list,
	mea = Mean(dt_summary[0, col_name]);
	med = Median(dt_summary[0, col_name]);
	// or
	mea1 = Col Mean(As Column(col_name));
	med1 = Col Median(As Column(col_name));
	Show(col_name, mea, med, mea1, med1);
);
Close(dt_summary, no save);
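
If the column names are not known in advance, the list could be built from the data table rather than typed out. The following is a sketch only, assuming every continuous numeric column in the table should be processed and that the grouping column is named `Group`; the `pos` variable is introduced here for illustration.

```jsl
Names Default To Here(1);

dt = Current Data Table();

// Names of all numeric, continuous columns, returned as strings
col_list = dt << Get Column Names(Numeric, Continuous, "String");

// Assumption: skip the grouping column in case it is itself continuous
pos = Contains(col_list, "Group");
If(pos > 0, Remove From(col_list, pos));

For Each({col_name}, col_list,
	Summarize(dt, groups = By(:Group), v_std = StdDev(Eval(col_name)));
	Show(col_name, Mean(v_std), Median(v_std));
);
```

Filtering by data and modeling type is one way to pick the columns; an index range over the table's columns would work as well if the layout is fixed.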

 

-Jarmo


REPLIES
jthi
Super User

Re: Pre-evaluated Statistics (Col Mean, Col Median) in Loop

Is there a reason to perform calculations like this with looping and with formulas? JMP does provide you with Summary (table), Summarize and Tabulate which might be better options depending on what you are trying to do.

 

For looping, I'm not sure if you can (or should) loop Col functions like that while using a byVar, as they get evaluated a bit weirdly.

jthi_0-1655916126101.png

-Jarmo

Re: Pre-evaluated Statistics (Col Mean, Col Median) in Loop

The reason for performing the calculations this way is that I would like to feed the results to a script that plots the data in each column separately and uses the calculated values as upper and lower plotting limits.

jthi
Super User

Re: Pre-evaluated Statistics (Col Mean, Col Median) in Loop

Summarize should work fairly nicely here:

Names Default To Here(1);

dt = Current Data Table();

Summarize(dt, groups = By(:Group), v1_std = StdDev(:Value 1), v2_std = StdDev(:Value 2));
Show(groups, v1_std, v2_std);
Show(Mean(v1_std));
Show(Mean(v2_std));
Show(Median(v1_std));
Show(Median(v2_std));

/*groups = {"A", "B", "C", "D"};
v1_std = [3.60555127546399, 2.64575131106459, 3.60555127546399, 6.08276253029822];
v2_std = [7.23417813807024, 8.9628864398325, 0.577350269189626, 3.78593889720018];
Mean(v1_std) = 3.9849040980727;
Mean(v2_std) = 5.14008843607314;
Median(v1_std) = 3.60555127546399;
Median(v2_std) = 5.51005851763521;*/
-Jarmo

Re: Pre-evaluated Statistics (Col Mean, Col Median) in Loop

I agree, this looks like the correct way to handle this. However, it relies on typing out all the column names. Is there a way to generalize it to many columns without knowing the column names in advance? This was the main reason I used the For loop in my code.
