Mickyboy
Level V

Calculating a p-value in a script

Hi All,

I know I am doing something wrong here, but I can't see it at the moment. I want the two-tailed p-value for a difference in two means and am using the following script:

dt5 << New Column(" t-Value", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula( t Quantile (0.95,:"N(Data) VPP"n + :"N(Data) Commercial"n -2)));

dt5 << New Column( "p-Value", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula( t Distribution (:"T-Value"n,:"N(Data) VPP"n + :"N(Data) Commercial"n - 1 )));

Can anyone please offer some advice?

 

Thanks,

Mick.

Re: Calculating a p-value in a script

There are a couple of things here:

  • The script is calculating the p-value based on the critical t-value (calculated on the first line) instead of the observed t-value. The latter is the difference in means divided by the standard error of the difference.
  • The t Distribution() call is calculating the proportion of the distribution less than the t-value, which isn't quite the p-value you're looking for. Calculating the two-tailed p-value requires some simple arithmetic from here (see example).
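To make the two points above concrete, here is the arithmetic written out, as a sketch that assumes a pooled-variance (equal-variance) two-sample t-test. The symbols are my own labels, not column names from the table: $\bar{x}_A, \bar{x}_B$ are the group means, $s_A, s_B$ the group standard deviations, $n_A, n_B$ the sample sizes, and $F_\nu$ is the cumulative t distribution with $\nu$ degrees of freedom, which is the quantity t Distribution() returns.

```latex
\begin{aligned}
s_p &= \sqrt{\frac{(n_A - 1)\,s_A^2 + (n_B - 1)\,s_B^2}{n_A + n_B - 2}} \\[4pt]
t_{\text{obs}} &= \frac{\bar{x}_A - \bar{x}_B}{s_p \sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}} \\[4pt]
\nu &= n_A + n_B - 2 \\[4pt]
p_{\text{two-tailed}} &= 2\,\bigl(1 - F_\nu(\lvert t_{\text{obs}} \rvert)\bigr)
\end{aligned}
```

By contrast, the critical value from t Quantile( 0.95, df ) depends only on the degrees of freedom, not on the observed means, so feeding it into t Distribution() simply returns 0.95 when the degrees of freedom match.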

 

If we can assume the data table has at minimum the by-group sample sizes, means, and standard deviations in their own columns, then here's some JSL that'll do what you want. Try it out on the attached example table. Note that I've included an intermediate step of calculating a pooled standard deviation column to improve readability of the formula for the t-value column.

 

Names Default to Here( 1 );

dt = Data Table( "Example.jmp" );

dt << New Column( "Pooled stdev",
	Numeric,
	Continuous,
	Format( "Fixed Dec", 15, 3 ),
	Formula(
		Sqrt( ((:Group A n - 1) * :Group A stdev ^ 2 + (:Group B n - 1) * :Group B stdev ^ 2) / (:Group A n + :Group B n - 2) )
	)
);

dt << New Column( "t-Value", 
	Numeric, 
	Continuous, 
	Format( "Fixed Dec", 15, 3), 
	Formula( 
		(:Group A mean - :Group B mean) / (:Pooled stdev * Sqrt( 1 / :Group A n + 1 / :Group B n) )	
	) 
);

dt << New Column( "p-Value", 
	Numeric, 
	Continuous, 
	Format( "Fixed Dec", 15, 3 ), 
	Formula( 
		( 1 - t Distribution( Abs( :"t-Value"n ), :Group A n + :Group B n - 2 ) ) * 2
	)
);

 

Ross Metusalem
JMP Academic Ambassador
Mickyboy
Level V

Re: Calculating a p-value in a script

Hi Ross,

Thanks for your reply. How does this look?

dt5 << New Column( "Difference in Mean", Numeric, Continuous, Format( "Fixed Dec", 15, 3 ), Formula( :"Mean(Data) VPP"n - :"Mean(Data) Commercial"n ) );	

dt5 << New Column( "Pooled Std Dev", Numeric, Continuous, Format( "Fixed Dec", 15, 3 ), Formula( sqrt(((:"N(Data) VPP"n - 1)* :"Std Dev(Data) VPP"n ^ 2  + (:"N(Data) Commercial"n - 1) * :"Std Dev(Data) Commercial"n ^ 2 )  / (:"N(Data) VPP"n + :"N(Data) Commercial"n - 2)))) ;	

dt5 << New Column( "Lower 90% Confidence Interval", Numeric, Continuous, Format( "Fixed Dec", 15, 3 ), Formula( (:"Mean(Data) VPP"n - :"Mean(Data) Commercial"n  ) - t Quantile (0.95,:"N(Data) VPP"n + :"N(Data) Commercial"n -2) * (:Pooled Std Dev * sqrt(( 1/:"N(Data) VPP"n) + (1/:"N(Data) Commercial"n )) )));

dt5 << New Column( "Upper 90% Confidence Interval", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula( (:"Mean(Data) VPP"n - :"Mean(Data) Commercial"n  ) + t Quantile (0.95,:"N(Data) VPP"n + :"N(Data) Commercial"n -2) * (:Pooled Std Dev * sqrt(( 1/:"N(Data) VPP"n) + (1/:"N(Data) Commercial"n )) )));

dt5 << New Column( "Pooled Variance", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula(:Pooled Std Dev ^ 2));

dt5 << New Column(" Std Error for t-Value", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula(sqrt (:Pooled Variance * ( 1/:"N(Data) VPP"n + 1/ :"N(Data) Commercial"n ) )));

dt5 << New Column(" t-stat", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula((Abs (:"Mean(Data) VPP"n - :"Mean(Data) Commercial"n - 0)) / :"Std Error for t-Value"n));

dt5 << New Column( "one tail p-Value", Numeric, Continuous, Format( "Fixed Dec", 15, 4), Formula(1 - t Distribution (:"t-stat"n,:"N(Data) VPP"n + :"N(Data) Commercial"n - 2 )));

dt5 << New Column( "two tail p-value", Numeric,	Continuous, Format( "Fixed Dec", 12, 4 ),Formula( :"one tail p-Value"n * 2 ));

 

Re: Calculating a p-value in a script

I haven't gone through each equation in full detail so definitely double-check everything yourself too. But with that said, I only caught one thing you'll probably want to change: The formula for the t-statistic takes the absolute value of the difference in means, which means that your t-statistic will always come out positive when it should be negative any time the VPP mean is lower than the Commercial mean. I used the Abs() function in my example only to ensure that the p-value would be calculated correctly regardless of the sign of the t-statistic. You'll probably want to do the same.
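As a rough illustration of that point, here is a minimal scalar sketch that keeps the t-statistic signed and applies Abs() only inside the p-value calculation. The numbers and variable names are hypothetical, made up purely for illustration, and it assumes the same pooled-variance setup as the column formulas earlier in the thread.

```jsl
Names Default To Here( 1 );

// Hypothetical summary statistics, for illustration only
nVPP = 12; meanVPP = 10.4; sdVPP = 2.1;
nCom = 15; meanCom = 12.0; sdCom = 2.5;

// Pooled standard deviation and standard error of the difference
df = nVPP + nCom - 2;
pooledSD = Sqrt( ((nVPP - 1) * sdVPP ^ 2 + (nCom - 1) * sdCom ^ 2) / df );
stdErr = pooledSD * Sqrt( 1 / nVPP + 1 / nCom );

// Signed t-statistic: negative here because the VPP mean is lower
tStat = (meanVPP - meanCom) / stdErr;

// Abs() is applied only when computing the two-tailed p-value
pTwoTail = 2 * (1 - t Distribution( Abs( tStat ), df ));

Show( tStat, pTwoTail );
```

With these inputs the t-statistic comes out negative while the p-value is unaffected by the sign, which is the behavior you would want in the " t-stat" and p-value columns as well.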

Ross Metusalem
JMP Academic Ambassador
Mickyboy
Level V

Re: Calculating a p-value in a script

Thanks again Ross, much appreciated
