
Monte Carlo resampling simulation for mean and std dev JMP 14

I am having trouble debugging the resampling portion of this script. Help would be appreciated.

 

dt=Current Data Table();
n=N Row(dt);
nSim=1000;
dtMC=New Table("MC", New Column ("Sim", Numeric, "Continuous"), New Column ("Mean", Numeric, "Continuous"), New Column ("Std Dev", Numeric, "Continuous");
For(i=1, i<=nSim, i++,
dtResample=dt<<Resample(Size(n), With Replacement(1), Seed(i));
dtMC<<Add Rows(1);
dtMC:Sim[i]=i;
dtMC:Mean[i]=Mean(dtResample:X)
dtMC:Std Dev[i]= Std Dev(dtResample:X);
);
dtMC<<Graph Builder(
Size(400, 300),
Variables (X(:Sim), Y(:Mean)),
Elements(Points(X,Y))
);


8 REPLIES

Re: Monte Carlo resampling simulation for mean and std dev JMP 14

It is difficult for me to diagnose the error in your script. Perhaps another member can help with that problem. I will use it as an opportunity to show you an alternate way. If you were interactively resampling using the data table commands, your way makes perfect sense. You can instead use matrix operations in a script.

 

Names Default To Here( 1 );

// example
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
// dt=Current Data Table();

// original data as a vector
data = dt:weight << Get As Matrix;

n = N Row( data );
nSim = 1000;

// matrix of simulation results
sim = J( nSim, 3, . );

// resample statistics
For( i = 1, i <= nSim, i++,
	X = data[J( n, 1, Random Integer( 1, n ) )];
	sim[i, 1] = i;
	sim[i, 2] = Mean( X );
	sim[i, 3] = Std Dev( X );
);

// make data table from matrix
dtMC = As Table( sim, <<Column Names( {"Simulation", "Mean", "Std Dev"} ) );

// plot resampled statistics

dtMC << Graph Builder(
	Size( 534, 456 ),
	Show Control Panel( 0 ),
	Show Legend( 0 ),
	Variables( X( :Simulation ), Y( :Mean ) ),
	Elements( Points( X, Y, Legend( 3 ) ) )
);

Re: Monte Carlo resampling simulation for mean and std dev JMP 14

This definitely helped. One question: how does the Random Integer() function work? Are there different techniques for resampling, perhaps ones that give more control over how the resampling is done? Thanks again.

Re: Monte Carlo resampling simulation for mean and std dev JMP 14

The Random Integer() function returns an integer over a range of values with equal probability (uniform distribution function).

 


 

In the script, Random Integer() is the third argument in the call to the J() function. J() evaluates it anew for each element while it builds the n-by-1 matrix, and that matrix is, in turn, supplied as the subscripts to the original data matrix. This use is equivalent to sampling with replacement, which is what you want for resampled sample statistics.
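
The subscripting step can be seen in isolation with a very small vector. The following is only a sketch: the data values and the names idx, k, and idxNoRep are made up for illustration and do not come from the script above. The last part uses Random Index() to show, for contrast, one way to sample without replacement.

```jsl
Names Default To Here( 1 );

// small made-up column vector, for illustration only
data = [10, 20, 30, 40, 50];
n = N Row( data );

// J() evaluates Random Integer() once per element,
// so idx holds n independent draws from 1 through n
idx = J( n, 1, Random Integer( 1, n ) );

// subscripting with idx picks those rows; repeated rows are allowed
X = data[idx];
Show( idx, X );

// for contrast: k distinct rows, that is, sampling without replacement
k = 3;
idxNoRep = Random Index( n, k );
Show( data[idxNoRep] );
```

Because idx is drawn with equal probability for every row, some rows can appear several times in X and others not at all, which is the usual behavior of a bootstrap-style resample.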

Re: Monte Carlo resampling simulation for mean and std dev JMP 14

Seems like it will replace one row in each simulation. Is it possible to generate random integer values in multiple rows?

Re: Monte Carlo resampling simulation for mean and std dev JMP 14

I do not understand your assumption ("Seems like it will replace one row in each simulation.") or your question ("Is it possible to generate random integer values in multiple rows?"). The script is correct for resampling statistics. I am using a vectorized solution that might be unfamiliar to you. A modified version of the script includes the distribution of the resampled estimates of the mean and the original estimate using the Distribution platform.

 

Names Default To Here( 1 );

// example
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
// dt=Current Data Table();

// original data as a vector
data = dt:weight << Get As Matrix;
sampleMean = Col Mean( dt:weight );

n = N Row( data );
nSim = 1000;

// matrix of simulation results
sim = J( nSim, 3, . );

// resample statistics
For( i = 1, i <= nSim, i++,
	X = data[J( n, 1, Random Integer( 1, n ) )];
	sim[i, 1] = i;
	sim[i, 2] = Mean( X );
	sim[i, 3] = Std Dev( X );
);

// make data tabel from matrix
dtMC = As Table( sim, <<Column Names( {"Simulation", "Mean", "Std Dev"} ) );

// plot resampled statistics

dtMC << Graph Builder(
	Size( 534, 456 ),
	Show Control Panel( 0 ),
	Show Legend( 0 ),
	Variables( X( :Simulation ), Y( :Mean ) ),
	Elements( Points( X, Y, Legend( 3 ) ) )
);

obj = dtMC << Distribution( Y( :Mean ) );
rpt = obj << Report;
rpt[AxisBox(1)] << Add Ref Line( sampleMean, "Solid", "Red", "Sample Mean", 1, 0 );

 

Are you attempting something else? Please explain your last reply in more detail.
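
On the point about one row per simulation, it may help to unroll the vectorized statement into an explicit loop. This is a minimal sketch, assuming a made-up data vector; the names j and pick are introduced here for illustration only. It is meant to have the same effect as X = data[J( n, 1, Random Integer( 1, n ) )], not to replace it.

```jsl
Names Default To Here( 1 );

// made-up column vector, for illustration only
data = [61, 72, 58, 90, 84];
n = N Row( data );

// explicit version of X = data[J( n, 1, Random Integer( 1, n ) )]
X = J( n, 1, . );
For( j = 1, j <= n, j++,
	pick = Random Integer( 1, n );  // a fresh draw for every row j
	X[j] = data[pick];              // copy the chosen original value
);

Show( X );
```

Every one of the n rows of X receives its own random draw, so the whole sample is regenerated in each simulation rather than a single row.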

Re: Monte Carlo resampling simulation for mean and std dev JMP 14

Apologies for the confusion. In the first script, how do I extract X from each iteration into a table so that I have all the values simulated across the 1000 iterations? Thanks again for your help.

 

 


 

Re: Monte Carlo resampling simulation for mean and std dev JMP 14

This script produces two data tables: one table with the resampled data and another with the resampled statistics.

 

Names Default To Here( 1 );

// example
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
// dt=Current Data Table();

// original data as a vector
data = dt:weight << Get As Matrix;
sampleMean = Col Mean( dt:weight );
sampleSD = Col Std Dev( dt:weight );

n = N Row( data );
nSim = 1000;

// matrix of simulation data
simData = [];

// resample data
For( i = 1, i <= nSim, i++,
	X = data[J( n, 1, Random Integer( 1, n ) )];
	simData ||= X;
);

// make table of resampled data
colNames = List();
For( i = 1, i <= nSim, i++, Insert Into( colNames, Eval( "Sample " || Char( i ) ) ) );
dtSimData = As Table( simData, << Column Names( colNames ) );

// matrix of simulation results
simStat = J( nSim, 3, . );

// resample statistics
For( i = 1, i <= nSim, i++,
	simStat[i, 1] = i;
	simStat[i, 2] = Mean( simData[0,i] );
	simStat[i, 3] = Std Dev( simData[0,i] );
);

// make data table from matrix
dtMC = As Table( simStat, << Column Names( {"Simulation", "Mean", "Std Dev"} ) );

// plot resampled statistics
dtMC << Graph Builder(
	Size( 534, 456 ),
	Show Control Panel( 0 ),
	Show Legend( 0 ),
	Variables( X( :Simulation ), Y( :Mean ), Y( :Std Dev ) ),
	Elements( Position( 1, 1 ), Points( X, Y, Legend( 3 ) ) ),
	Elements( Position( 1, 2 ), Points( X, Y, Legend( 4 ) ) )
);
obj = dtMC << Distribution( Y( :Mean, :Std Dev ) );
rpt = obj << Report;
rpt[AxisBox(1)] << Add Ref Line( sampleMean, "Solid", "Red", "Sample Mean", 1, 0 );
rpt[AxisBox(2)] << Add Ref Line( sampleSD, "Solid", "Red", "Sample Std Dev", 1, 0 );
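
Two matrix idioms in this script may be unfamiliar: building a matrix one column at a time with ||=, and reading a whole column with a row subscript of 0. The sketch below shows both on a tiny matrix; the names m and newCol and the sizes are made up for illustration.

```jsl
Names Default To Here( 1 );

// start from an empty matrix and append one column per pass
m = [];
For( i = 1, i <= 3, i++,
	newCol = J( 4, 1, i );  // a 4-by-1 column filled with the value i
	m ||= newCol;           // horizontal concatenation in place
);

Show( N Row( m ), N Col( m ) );  // 4 rows, 3 columns

// a row subscript of 0 selects every row, so this is all of column 2
Show( m[0, 2] );
```

In the script above, simData grows the same way, one resample per column, and simData[0, i] hands the i-th resample to Mean() and Std Dev().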

Re: Monte Carlo resampling simulation for mean and std dev JMP 14

Thanks. A question: if I have a data set [1 2 3 5 8 9 7] and I follow the logic above for 2 iterations, will it create 2 sets by randomly selecting a number from the data set for each location? For example:

 

 

[3 2 1 5 8 7 7]

[9 1 3 5 2 8 6]

If this is truly random, doesn't that mean we will get different data sets every time it is run?
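
The question can be tried directly on this data set. The following is a sketch under stated assumptions: it reuses the resampling statement from the scripts above, and it adds a call to Random Reset() with an arbitrary seed (12345, chosen only for this example) so that repeated runs produce the same resamples.

```jsl
Names Default To Here( 1 );

// the data set from the question, as a column vector
data = [1, 2, 3, 5, 8, 9, 7];
n = N Row( data );
nSim = 2;

// fix the seed so that reruns give the same resamples;
// 12345 is an arbitrary value chosen for this sketch
Random Reset( 12345 );

For( i = 1, i <= nSim, i++,
	X = data[J( n, 1, Random Integer( 1, n ) )];
	Show( i, Transpose( X ) );  // print each resample as a row
);
```

Each resample contains only values that occur in the original data, so a value such as 6 cannot appear, while values may repeat or be left out. Without the Random Reset() line, the resamples would generally differ from run to run.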