
Simulation, the Good, the Bad and the Ugly or Independence? Dependence, Synergistic, Antagonistic (2020-US-30MP-584)

Ned Jones, Statistician, 1-alpha Solutions

 

Simulation has become a popular tool used to understand processes. In most cases the processes are assumed to be independent; however, many times this is not the case. A process can be viewed as physically independent, but this does not necessarily equate to stochastic independence. This is especially true when the processes are in series, such that the output of one process is the input for the next process, and so forth. Using the JMP simulator, a simple series of processes is set up, represented by JMP random functions. The process parameters are assumed to have a multivariate normal distribution. By modifying the correlation matrix, the effect of independence versus dependence is examined. These differences are shown by examining the tails of the resulting distributions. When the processes are dependent, the effects of synergistic versus antagonistic process relationships are also investigated.

 

 

Auto-generated transcript...

 

Speaker

Transcript

nedjones The Good, the Bad, the Ugly or Independence and Dependence Synergistic and Antagonistic.
I am Ned Jones. I have a small consulting business called 1-alpha Solutions. You can see my contact information there. Let's get into the simulation discussion.
I'm going to be running the simulation, obviously, in JMP. The simulator
allows you to discover how random variation in the model inputs, together with any noise that you add into the simulator, becomes random variation in the model output.
The simulator is in the profiler. You define the random inputs there, and you're able to run a simulation and produce output tables of the simulated variables.
The next thing I want to do is talk about a couple of different types of simulation.
Take just a simple simulation. If you have one input and one output, there is no issue of dependence in the simulation. The ones that we're concerned about primarily are simulations where we have multiple inputs
and one or more system outputs. The concern is that there could be dependence among the input variables.
Scrolling down here a little bit, I want to talk about what it means to have stochastic independence.
Two events A and B are independent if and only if their joint probability equals the product of their probabilities. That's the end result we want. I'm going to define it and then look at it a little bit differently.
This should make it a little clearer. If the probability of the intersection of events A and B is equal to that product, that implies that
the probability of A is equal to the probability of A given B and, similarly, the probability of B is equal to the probability of B given A.
Thus the occurrence of A is not affected by the occurrence of B, and vice versa.
Here is an example with six equally likely outcomes, where event B is a 2, 4, or 6.
You can easily see that the probability of A is 2 in 6, or one third, and the probability of B is 3 in 6, or one half.
But if you look at the intersection of A and B, that's the 2. It is the only outcome you can get in both, so the probability of A and B is 1 in 6, which is the same as one third times one half.
Now, the probability of A is equal to the probability of A given B, which is equal to one third. If we look at that, we're saying, okay, B has occurred, so we know that we have
a 2, a 4, or a 6, but there's still a one-third chance that A has occurred. So it stays at one third. Similarly, we see the same thing happening with the probability of B.
Therefore, A and B are independent.
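The independence check in this example can be worked through with exact fractions. This is only a sketch: event B (2, 4, or 6) comes from the talk, but the exact outcomes in A did not survive the transcript, so `A = {1, 2}` is an assumption chosen to match the stated probabilities (P(A) = 2 in 6, with the 2 as the only shared outcome).

```python
from fractions import Fraction

# Six equally likely outcomes.
omega = {1, 2, 3, 4, 5, 6}

# B is the event "2, 4, or 6" from the talk.
# A = {1, 2} is an ASSUMPTION: any two outcomes that include the 2 plus one
# odd number would give the same probabilities.
A = {1, 2}
B = {2, 4, 6}

def prob(event):
    """Probability of an event as an exact fraction of the sample space."""
    return Fraction(len(event & omega), len(omega))

p_a = prob(A)
p_b = prob(B)
p_ab = prob(A & B)          # joint probability P(A and B)
p_a_given_b = p_ab / p_b    # conditional probability P(A | B)
p_b_given_a = p_ab / p_a    # conditional probability P(B | A)

# Independence: the joint probability equals the product of the marginals.
is_independent = (p_ab == p_a * p_b)

print("P(A) =", p_a, " P(A|B) =", p_a_given_b)
print("P(B) =", p_b, " P(B|A) =", p_b_given_a)
print("P(A and B) =", p_ab, " independent:", is_independent)
```

Conditioning on B leaves the probability of A unchanged, and conditioning on A leaves the probability of B unchanged, which is the same statement as the product rule.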
Now I'm going to roll on and look at the example I have and talk about that. What I'm doing is simulating the pest load
and the probability of a mating pair.
We have a fruit harvest population, and to that fruit harvest population some cultural practices are applied in our
orchard, grove, or
vineyard. We'll have a pest load after those cultural practices are applied.
Then the crop is harvested, we'll do some manual culling, and we'll estimate a pest load there. After that, you can see that we have
a cold storage, and we'll have a pest load after the cold storage. We're going to try to freeze them to death.
The final thing we do, once we get this pest load, is break it up in a marketing distribution and split that population into several smaller pieces.
We'll be able to calculate the probability of a mating pair from that.
Well, the problem, you can see immediately, is that these things become very dependent, because the output of the harvest population is the input for Treatment A, the output of Treatment A is the input for Treatment B, and so forth on to C and on down to the mating pair.
Now here's a table we'll start with. We set this up for the simulator to work with. We have a population range of 1 million to 2.5 million fruit. We have a treatment
range here for the efficacy of the mitigation that we're seeing. Here's the number of survivors we would expect from this treatment, and we have a
Population A as a result of that. We're going to have a Population B as a result of that. We can take a look at the formulas that are used here.
What is being done here a little differently is that I'm putting a random variable in that is going to go into the profiler. So the profiler is going to see this immediately as a random variable going in. We're simulating the variable coming in, even before it comes in.
With that, we can go across and you can see the rest of the table. Going over, we have another set: we have the survivors after Treatment C, the same type of thing.
Then we have this distribution, and we have a probability of a mating pair. I'll show you that formula. It's a little different. The probability of a mating pair is just using an exponential
to estimate it. So you know what's going on; I haven't hidden anything from you behind the curtain. Let's take a look. To open up the profiler, we're going to go to Graph and down to Profiler.
All right. From there, we're going to select the variables that have a formula that we're interested in. So we're going to have Pop A, Pop B, the survivors, and the probability of a mating pair.
I'm going to put those in, and we're going to say, uh-oh, we've got to extend the variation here, and we're going to say OK.
We've got a nice result.
Very attractive graphs here. The first thing you're going to see is squiggly lines in this profiler. If you use the profiler, you're probably not used to seeing lines like that.
It's just a little different approach, so you can see how these things work.
I'm doing a little adjustment here so you can see the screens better.
Now from this point, we're going to open up the simulator in the profiler. We go up here and just click on Simulator, and it gives us
these choices down here. The first thing I want to do is increase the number of simulation runs to 25,000. Okay. If we have independence, one of the tests we quite often use for that is to
look at the correlation. So I'm going to set up some correlations here. For this first
population, I'm going to call it multivariate, and immediately you can see we get a correlation specification down below.
We'll set up another multivariate here, another multivariate for Treatment B,
and another multivariate for Treatment A. What this is doing is taking those treatment parameters that we had up above
and putting them in a multivariate relationship with each other.
The last thing we've got is this marketing distribution. I don't want it to be continuous, so I'm going to make it random and make it an integer.
We've got that set, and we can see the results. I'll call this the Goldilocks situation: all the zeros down here imply that all of these relationships are completely independent. We can run our simulator here
and see the results.
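Outside JMP, the idea of driving the inputs from a multivariate normal with a chosen correlation matrix can be sketched in a few lines of standard-library Python. This is not the JMP simulator's implementation; it is a minimal illustration, assuming three standardized inputs and a hypothetical correlation of 0.3, that uses a Cholesky factor to turn independent normal draws into correlated ones. The run count of 25,000 matches the talk; the seed and matrix values are made up.

```python
import math
import random

def cholesky(matrix):
    """Lower-triangular factor L of a symmetric positive definite matrix (L L^T = matrix)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                diag = matrix[i][i] - partial
                if diag <= 0.0:
                    raise ValueError("correlation matrix is not positive definite")
                lower[i][j] = math.sqrt(diag)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower

def simulate_inputs(corr, n_runs, rng):
    """Draw n_runs vectors of standard normal inputs with correlation matrix corr."""
    lower = cholesky(corr)
    dim = len(corr)
    runs = []
    for _ in range(n_runs):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # independent draws
        runs.append([sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(dim)])
    return runs

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def first_pair_correlation(runs):
    return pearson([r[0] for r in runs], [r[1] for r in runs])

rng = random.Random(2020)   # arbitrary seed, for repeatability only
n_runs = 25_000

# "Goldilocks" case: all off-diagonal zeros, so the inputs are independent.
independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Hypothetical dependent case with a modest positive correlation.
dependent = [[1.0, 0.3, 0.3], [0.3, 1.0, 0.3], [0.3, 0.3, 1.0]]

r_independent = first_pair_correlation(simulate_inputs(independent, n_runs, rng))
r_dependent = first_pair_correlation(simulate_inputs(dependent, n_runs, rng))
print(f"sample correlation, independent inputs: {r_independent:.3f}")
print(f"sample correlation, dependent inputs:   {r_dependent:.3f}")
```

With all zeros off the diagonal the Cholesky factor is the identity matrix, so the draws pass through unchanged; any nonzero entry mixes the independent draws together.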
I'll do a little more adjustment here on these axes.
This come to life. Please look for here.
Okay.
Now you see those results. What we have here to look at is the rate
at which it's exceeding a limit that's been put in there. I put those spec limits back in the variables, but the one that I'm most concerned about is the probability of a mating pair.
And wouldn't you know, I've run this in real time and it hasn't come out exactly the way it should. Let's try a couple more times here.
We've got the probability of a mating pair, and that is supposed to be coming up as .05, but it certainly isn't. Something isn't... oh, here, let's try this and fix this.
This would be 4 and
14
Let's try the simulation one more time. It still didn't come up. Well, the example hasn't worked quite right, but
in the previous example I was having .4 here. So that was saying, the rate was creating less than time but I'm having that probability is
a .15 probability of a mating pair. That's what happens sometimes when you try to do things on the fly.
So let me go up. I have a window where we can look at that result, set up a little bit differently. You can see now that
that probability is under 5%. That's the target we're aiming for
in this simulation. If we go up and run those simulations again, you can see those bouncing around, staying under .05, so
it's happening less than 5% of the time that the probability of a mating pair is greater than 1.5.
Now again, I'll say this is kind of a Goldilocks scenario, because we're assuming all these relationships are independent. I have examples I can show you where the relationships are
antagonistic and synergistic. I'll pull up the first one here, and in this one the
relationships are antagonistic. If you are creating an example like this to work with, you can't, or at least I wasn't able to, make everything negative. If you notice, I have these two as being positive.
This wants the matrix to be positive definite, and it doesn't come out as positive definite if you set those all to negative. But we can run that simulation again.
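The positive definite requirement can be checked directly. The sketch below is an illustration with made-up values, not the matrices from the talk: it assumes a 4 by 4 correlation matrix and tests positive definiteness by confirming that every pivot of Gaussian elimination (without row swaps) is positive, which is equivalent for a symmetric matrix. The function names are hypothetical.

```python
def is_positive_definite(matrix, tol=1e-12):
    """A symmetric matrix is positive definite iff every pivot of Gaussian
    elimination without row swaps is positive."""
    a = [row[:] for row in matrix]   # work on a copy
    n = len(a)
    for k in range(n):
        if a[k][k] <= tol:
            return False
        for i in range(k + 1, n):
            factor = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return True

def equicorrelation(n, rho):
    """n x n correlation matrix with the same correlation rho for every pair."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

# All zeros off the diagonal (the independent case) is always valid.
zeros_ok = is_positive_definite(equicorrelation(4, 0.0))

# Mildly negative everywhere still works for four variables...
mild_negative_ok = is_positive_definite(equicorrelation(4, -0.2))

# ...but strongly negative everywhere does not: four variables cannot all
# be strongly opposed to one another.
strong_negative_ok = is_positive_definite(equicorrelation(4, -0.6))

# HYPOTHETICAL mixed-sign matrix: four pairs at -0.6 and two pairs at +0.6.
mixed = [
    [1.0, -0.6, -0.6, 0.6],
    [-0.6, 1.0, 0.6, -0.6],
    [-0.6, 0.6, 1.0, -0.6],
    [0.6, -0.6, -0.6, 1.0],
]
mixed_ok = is_positive_definite(mixed)

print("all zeros:          ", zeros_ok)
print("all -0.2:           ", mild_negative_ok)
print("all -0.6:           ", strong_negative_ok)
print("mixed -0.6 and +0.6:", mixed_ok)
```

For n variables sharing one common correlation, the matrix stays positive definite only while that correlation is above -1/(n-1), which is why a strongly antagonistic specification needs a few positive entries to remain valid.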
You can see here that with negative correlations, it really makes things very, very attractive. We're getting a real low rate:
we have a .15 probability of a mating pair, so you can see just the effect. What I really want to show is the effect of this
correlation specification, the correlation matrix down here, the covariance matrix that you specified. Now let's look at one other; we'll look at the one where it's positive.
We've got an example here where it's positive, and you can see what I have set down here. Now, I haven't been real heroic about making those correlations very high.
I've tried to keep them fairly low, to be fairly realistic; after all, this is biological data. We can run those simulations again, and you can see very quickly that we're exceeding that 5% rate, which becomes
a great deal of concern. Most of the time, simulations like this are run with no consideration of the correlation between variables, and that is kind of like covering your eyes and saying, I didn't see this. But really,
if there is a correlation relationship, and most likely there is, because one of these outputs is the input to the next process,
it pretty well has to be dependent. What the dependencies are, and estimating these correlations, will be a great task to have to come up with most of the time.
Work in this area is done based on research papers, and they don't have correlations between different types of treatments.
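Why a modest correlation moves the tail rate can be seen with a small worked calculation. This sketch does not use the talk's formulas, which are not shown in the transcript. It assumes a simplified model in which three stage effects are jointly normal with unit standard deviation and add together, so their sum is normal with variance equal to the sum of all entries of the covariance matrix. The limit is set so that the independent case exceeds it exactly 5% of the time; the correlations of plus and minus 0.2 are hypothetical.

```python
from statistics import NormalDist

def equicorrelation(n, rho):
    """n x n correlation matrix with the same correlation rho for every pair."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]

def variance_of_sum(corr, sds):
    """Variance of a sum of jointly normal terms: sum over i, j of corr_ij * sd_i * sd_j."""
    n = len(sds)
    return sum(corr[i][j] * sds[i] * sds[j] for i in range(n) for j in range(n))

def exceedance_rate(corr, sds, limit):
    """Probability that the zero-mean normal sum exceeds the limit."""
    sd_total = variance_of_sum(corr, sds) ** 0.5
    return 1.0 - NormalDist(0.0, sd_total).cdf(limit)

sds = [1.0, 1.0, 1.0]   # ASSUMED unit standard deviation for each stage

# Set the limit at the 95th percentile of the independent case.
independent = equicorrelation(3, 0.0)
sd_independent = variance_of_sum(independent, sds) ** 0.5
limit = NormalDist(0.0, sd_independent).inv_cdf(0.95)

scenarios = {
    "antagonistic": -0.2,   # hypothetical negative correlation
    "independent": 0.0,
    "synergistic": 0.2,     # hypothetical positive correlation
}
rates = {
    label: exceedance_rate(equicorrelation(3, rho), sds, limit)
    for label, rho in scenarios.items()
}
for label, rate in rates.items():
    print(f"{label:>12}: rate of exceeding the limit = {rate:.4f}")
```

Under these assumptions, positive correlation inflates the variance of the total and pushes more of the distribution past the limit, while negative correlation lets the stages offset one another and thins the tail.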
But having some estimate of those is a good thing to have. Now the next step is to show you what else we can do here. We can create a table.
We'll create this table, and I'm just going to highlight everything in the table out to the mating pairs here. Then I'm just going to
do an Analyze, Distribution,
run all of those, and say OK. Now we get all these grand distributions, enough to fill up the
paper. But what we can do is go in and select the part of these distributions that is exceeding our
limit out here. We can just highlight those, and it becomes very informative as you look back and see the mitigations, what's happening, and what is affecting these things
greatly. First of all, look at our initial population, and this is based on what we've seen in real life: as the population gets to be higher,
when we have large populations of the fruit, the tendency is that we have failures of the system,
Treatment A and so forth. The one that I thought was most interesting was the marketing distribution, if we look back here.
If we require that, as shipments come into the country, the marketing distribution has to break these shipments up into smaller lots to be distributed, the probability of a mating pair pretty well becomes zero.
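The effect of splitting shipments can be illustrated with a simple probability sketch. The talk's actual formula is not shown in the transcript beyond the remark that it uses an exponential, so everything here is an assumption: survivors are treated as Poisson, spread at random over the lots, with each survivor equally likely to be female or male. Under that model the female and male counts in a lot are independent Poisson variables, and a mating pair needs at least one of each. The function names and the survivor count are hypothetical.

```python
import math

def prob_pair_in_lot(mean_survivors_in_lot):
    """P(at least one female and at least one male in a lot), ASSUMING each sex
    is Poisson with half of the lot's mean survivor count."""
    half = mean_survivors_in_lot / 2.0
    p_at_least_one = 1.0 - math.exp(-half)   # P(Poisson(half) >= 1)
    return p_at_least_one ** 2

def prob_any_lot_has_pair(total_survivors, n_lots):
    """P(at least one of n_lots equal lots contains a mating pair),
    treating the lots as independent."""
    p_lot = prob_pair_in_lot(total_survivors / n_lots)
    return 1.0 - (1.0 - p_lot) ** n_lots

total_survivors = 2.0   # HYPOTHETICAL expected number of survivors in a shipment
lot_counts = (1, 10, 100, 1000)
results = [prob_any_lot_has_pair(total_survivors, k) for k in lot_counts]

for k, p in zip(lot_counts, results):
    print(f"{k:>5} lots: P(mating pair somewhere) = {p:.5f}")
```

In this sketch the probability falls steadily as the same survivors are spread over more lots, because two survivors of opposite sex become less and less likely to land in the same lot.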
With these examples, I want to go ahead and open it up for questions. But let me just say one last thing. I think of George Box.
He was at one of our meetings a few years ago, and it was really interesting, the two quotes that he said. He said, "Don't fall in love with a model."
And he also said, "All models are wrong, but some are useful." I hope this information and these examples give you something to think about: when you're doing a simulation, you need to consider the relationship between the variables. Thank you.

 
