Solved: Calculating a p-value in a script

Jun 9, 2023 9:04 AM

Hi All,

I know I am doing something wrong here, but I can't see it at the moment. I want the two-tailed p-value for a difference in two means and am using the following script:

dt5 << New Column(" t-Value", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula( t Quantile (0.95,:"N(Data) VPP"n + :"N(Data) Commercial"n -2)));

dt5 << New Column( "p-Value", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula( t Distribution (:"T-Value"n,:"N(Data) VPP"n + :"N(Data) Commercial"n - 1 )));

Can anyone please offer some advice?

Thanks,

Mick.

Ross_Metusalem · Dec 12, 2022 02:23 PM

There are a couple of things here:

The script is calculating the p-value based on the critical t-value (calculated on the first line) instead of the observed t-value. The latter is the difference in means divided by the standard error of the difference.
The t Distribution() call is calculating the proportion of the distribution less than the t-value, which isn't quite the p-value you're looking for. Calculating the two-tailed p-value requires some simple arithmetic from here (see example).
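
For reference, the two points above can be written out as formulas. This is only a sketch in notation chosen here, not taken from the original reply: n_A and n_B are the group sample sizes, \bar{x}_A and \bar{x}_B the group means, s_A and s_B the group standard deviations, and F_\nu the cumulative distribution function of Student's t with \nu degrees of freedom (the quantity that t Distribution() returns). It assumes the equal-variance (pooled) two-sample t-test.

```latex
\begin{aligned}
\nu &= n_A + n_B - 2 \\
s_p &= \sqrt{\frac{(n_A - 1)\,s_A^2 + (n_B - 1)\,s_B^2}{\nu}} \\
t_{\text{obs}} &= \frac{\bar{x}_A - \bar{x}_B}{s_p \sqrt{\dfrac{1}{n_A} + \dfrac{1}{n_B}}} \\
p_{\text{two-tailed}} &= 2\,\bigl(1 - F_\nu(\lvert t_{\text{obs}} \rvert)\bigr)
\end{aligned}
```

The critical value from t Quantile( 0.95, \nu ) is a fixed cutoff that depends only on the degrees of freedom, so feeding it into F_\nu simply returns 0.95 regardless of the data.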

If we can assume the data table has at minimum the by-group sample sizes, means, and standard deviations in their own columns, then here's some JSL that'll do what you want. Try it out on the attached example table. Note that I've included an intermediate step of calculating a pooled standard deviation column to improve readability of the formula for the t-value column.

Names Default to Here( 1 );

dt = Data Table( "Example.jmp" );

dt << New Column( "Pooled stdev",
	Numeric,
	Continuous,
	Format( "Fixed Dec", 15, 3 ),
	Formula(
		Sqrt( ((:Group A n - 1) * :Group A stdev ^ 2 + (:Group B n - 1) * :Group B stdev ^ 2) / (:Group A n + :Group B n - 2) )
	)
);

dt << New Column( "t-Value",
	Numeric, 
	Continuous, 
	Format( "Fixed Dec", 15, 3), 
	Formula( 
		(:Group A mean - :Group B mean) / (:Pooled stdev * Sqrt( 1 / :Group A n + 1 / :Group B n) )	
	) 
);

dt << New Column( "p-Value", 
	Numeric, 
	Continuous, 
	Format( "Fixed Dec", 15, 3 ), 
	Formula( 
		( 1 - t Distribution( Abs( :"t-Value"n ), :Group A n + :Group B n - 2 ) ) * 2
	)
);
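
If it helps to check the arithmetic outside of a data table, the same calculation can be sketched with plain JSL variables. The summary statistics below are made-up values for illustration only, and the variable names (nA, meanA, sdA, and so on) are hypothetical; the functions are the same ones used in the column formulas above.

```jsl
Names Default To Here( 1 );

// Hypothetical summary statistics for two groups (made-up values)
nA = 12; meanA = 10.4; sdA = 1.8;
nB = 15; meanB = 9.1;  sdB = 2.1;

// Degrees of freedom and pooled standard deviation
df = nA + nB - 2;
pooledSD = Sqrt( ((nA - 1) * sdA ^ 2 + (nB - 1) * sdB ^ 2) / df );

// Observed t-statistic: difference in means over its standard error
tObs = (meanA - meanB) / (pooledSD * Sqrt( 1 / nA + 1 / nB ));

// Two-tailed p-value: double the upper-tail area beyond |tObs|
pTwoTail = 2 * (1 - t Distribution( Abs( tObs ), df ));

// Critical value for a two-sided test at alpha = 0.10, shown for comparison only
tCrit = t Quantile( 0.95, df );

Show( tObs, tCrit, pTwoTail );
```

Comparing tObs with tCrit in the log makes the distinction concrete: the p-value comes from tObs, while tCrit is only the cutoff used for the 90% confidence interval.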

Ross Metusalem
JMP Academic Ambassador

Mickyboy · Dec 15, 2022 05:45 PM

Hi Ross,

Thanks for your reply. How does this look?

dt5 << New Column( "Difference in Mean", Numeric, Continuous, Format( "Fixed Dec", 15, 3 ), Formula( :"Mean(Data) VPP"n - :"Mean(Data) Commercial"n ) );	

dt5 << New Column( "Pooled Std Dev", Numeric, Continuous, Format( "Fixed Dec", 15, 3 ), Formula( sqrt(((:"N(Data) VPP"n - 1)* :"Std Dev(Data) VPP"n ^ 2  + (:"N(Data) Commercial"n - 1) * :"Std Dev(Data) Commercial"n ^ 2 )  / (:"N(Data) VPP"n + :"N(Data) Commercial"n - 2)))) ;	

dt5 << New Column( "Lower 90% Confidence Interval", Numeric, Continuous, Format( "Fixed Dec", 15, 3 ), Formula( (:"Mean(Data) VPP"n - :"Mean(Data) Commercial"n  ) - t Quantile (0.95,:"N(Data) VPP"n + :"N(Data) Commercial"n -2) * (:Pooled Std Dev * sqrt(( 1/:"N(Data) VPP"n) + (1/:"N(Data) Commercial"n )) )));

dt5 << New Column( "Upper 90% Confidence Interval", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula( (:"Mean(Data) VPP"n - :"Mean(Data) Commercial"n  ) + t Quantile (0.95,:"N(Data) VPP"n + :"N(Data) Commercial"n -2) * (:Pooled Std Dev * sqrt(( 1/:"N(Data) VPP"n) + (1/:"N(Data) Commercial"n )) )));

dt5 << New Column( "Pooled Variance", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula(:Pooled Std Dev ^ 2));

dt5 << New Column(" Std Error for t-Value", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula(sqrt (:Pooled Variance * ( 1/:"N(Data) VPP"n + 1/ :"N(Data) Commercial"n ) )));

dt5 << New Column(" t-stat", Numeric, Continuous, Format( "Fixed Dec", 15, 3), Formula((Abs (:"Mean(Data) VPP"n - :"Mean(Data) Commercial"n - 0)) / :"Std Error for t-Value"n));

dt5 << New Column( "one tail p-Value", Numeric, Continuous, Format( "Fixed Dec", 15, 4), Formula(1 - t Distribution (:"t-stat"n,:"N(Data) VPP"n + :"N(Data) Commercial"n - 2 )));

dt5 << New Column( "two tail p-value", Numeric,	Continuous, Format( "Fixed Dec", 12, 4 ),Formula( :"one tail p-Value"n * 2 ));

Ross_Metusalem · Dec 16, 2022 09:04 AM

I haven't gone through each equation in full detail so definitely double-check everything yourself too. But with that said, I only caught one thing you'll probably want to change: The formula for the t-statistic takes the absolute value of the difference in means, which means that your t-statistic will always come out positive when it should be negative any time the VPP mean is lower than the Commercial mean. I used the Abs() function in my example only to ensure that the p-value would be calculated correctly regardless of the sign of the t-statistic. You'll probably want to do the same.
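
As a rough sketch of that change (not code from the original reply), the t-statistic column can keep its sign while Abs() is applied only inside the p-value formula. The column names follow the script in the earlier post; the small stand-in table and its values are made up here purely so the snippet runs on its own, and in practice dt5 would be the existing summary table.

```jsl
Names Default To Here( 1 );

// Stand-in for dt5: one row of made-up summary values (not real data)
dt5 = New Table( "dt5 sketch",
	New Column( "Mean(Data) VPP", Numeric, Continuous, Set Values( [9.1] ) ),
	New Column( "Mean(Data) Commercial", Numeric, Continuous, Set Values( [10.4] ) ),
	New Column( "N(Data) VPP", Numeric, Continuous, Set Values( [15] ) ),
	New Column( "N(Data) Commercial", Numeric, Continuous, Set Values( [12] ) ),
	New Column( "Std Error for t-Value", Numeric, Continuous, Set Values( [0.765] ) )
);

// Signed t-statistic: no Abs() here, so the sign shows the direction of the difference
dt5 << New Column( "t-stat", Numeric, Continuous, Format( "Fixed Dec", 15, 3 ),
	Formula( (:"Mean(Data) VPP"n - :"Mean(Data) Commercial"n) / :"Std Error for t-Value"n )
);

// Two-tailed p-value: Abs() is used only when computing the tail area
dt5 << New Column( "two tail p-value", Numeric, Continuous, Format( "Fixed Dec", 12, 4 ),
	Formula(
		2 * (1 - t Distribution( Abs( :"t-stat"n ), :"N(Data) VPP"n + :"N(Data) Commercial"n - 2 ))
	)
);
```

With the VPP mean below the Commercial mean, as in these made-up values, the t-stat column comes out negative while the p-value is unaffected by the sign.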

Ross Metusalem
JMP Academic Ambassador

Mickyboy · Dec 19, 2022 12:02 AM

Thanks again Ross, much appreciated

