
Applications of Bayesian Methods Using JMP® (2021-EU-45MP-786)

William Q. Meeker, Professor of Statistics and Distinguished Professor of Liberal Arts and Sciences, Iowa State University
Peng Liu, JMP Principal Research Statistician Developer, SAS

 

The development of the theory and application of Markov chain Monte Carlo (MCMC) methods, vast improvements in computational capabilities, and emerging software alternatives have made possible the wide use of Bayesian methods in many areas of application.

Motivated by applications in reliability, JMP now has powerful and easy-to-use capabilities for using Bayesian methods to fit different single distributions (e.g., normal, lognormal, Weibull, etc.) and linear regression. Bayesian methods, however, require the specification of a prior distribution. In many applications (e.g., reliability) useful prior information is typically available for only one parameter (e.g., imprecise knowledge about the activation energy in a temperature-accelerated life test or about the Weibull shape parameter in analysis of fatigue failure data). Then it becomes necessary to specify noninformative prior distributions for the other parameter(s). In this talk, we present several applications showing how to use JMP Bayesian capabilities to integrate engineering or scientific information about one parameter and to use a principled way to specify noninformative or weakly informative prior distributions for the other parameters.

 

 

Auto-generated transcript...

 


Our talk today shows how to
use JMP to do Bayesian
estimation. Here's an overview
of my talk. I'm going to start
with a brief introduction to
Bayesian statistical methods.
Then I'm going to go through
four different examples that
happen to come from reliability,
but the methods we're presenting
are really much more general and
can be applied in other areas of
application. Then I'm going to
turn it over to Peng and he's
going to show you how easy it is
to actually do these things in
JMP. Technically, reliability is a probability: the probability that a system, vehicle, machine, or whatever it is that is of interest will perform its intended function under encountered operating conditions for a specified period of time. I highlight encountered
here to emphasize that
reliability depends importantly
on the environment in which a
product is being used. Condra
defined reliability as quality
over time. And many engineers
think of reliability as being
failure avoidance, that is, to
design and manufacture a product
that will not fail. Reliability
is a highly quantitative
engineering discipline, but
often requires sophisticated
statistical and probabilistic
ideas. Over the past 30 years,
there's been a virtual
revolution where Bayes methods
are now commonly used in
many different areas of
application. This revolution
started by the rediscovery of
Markov chain Monte Carlo methods
and was accelerated by
spectacular improvements in
computing power that we have
today, as well as the
development of relatively easy
to use software to implement
Bayes methods. In the 1990s we
had BUGS. Today Stan and other
similar packages have largely
replaced BUGS, but the other
thing that's happening is we're
beginning to see more Bayesian
methods implemented in
commercial software. So for
example, SAS has PROC MCMC.
And now JMP has some very
powerful tools that were
developed for reliability, but
as I said, they can be applied
in other areas as well, and
there's strong motivation for
the use of Bayesian methods.
For one thing, it provides a
means for combining prior
information with limited data to
be able to make useful
inferences. Also, there are many
situations, particularly with
random effects and complications like censored data, where maximum
likelihood is difficult to
implement, but where Bayes
methods are relatively easy to
implement. There's one little
downside in the use of Bayes
methods. You have to think a bit
harder about certain things,
particularly about
parameterization and how to
specify the prior distributions.
My first example is about an
aircraft engine bearing cage.
These are field failure data
where there was 1,703 aircraft
engines that contained this
bearing cage. The oldest ones
had 2,220 hours of operation. The
design life specification for
this bearing cage was that no
more than 10% of the units would
fail by 8,000 hours of operation.
However, 6 units had failed and
this raised the question do we
have a serious problem here? Do
we need to redesign this bearing
cage to meet that reliability
condition? This is an event plot
of the data. The event plot
illustrates the structure
of the data, and in particular,
we can see the six failures
here. In addition to that, we
have right censored
observations, indicated here by
the arrows. So these are units
that are still in service and
they have not failed yet, and
the right arrow indicates that
all we know is if we wait long
enough out to the right, the units
will eventually fail. Here's a
maximum likelihood analysis of
those data, so the probability
plot here suggests that the
Weibull distribution provides a
good description of these data.
However, when we use the
distribution profiler to
estimate fraction failing at
8,000 hours, we can see that the
confidence interval is enormous,
ranging between about 3% all the
way up to 100%. That's not very
useful. So likelihood methods
work like this. We specify the
model and the data.
That defines the likelihood and
then we use the likelihood to
make inferences. Bayesian
methods are similar, except we
also have prior information
specified. Bayes theorem
combines the likelihood with the
prior information, providing a
posterior distribution, and then
we use the posterior
distribution to make inferences.
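(As a rough sketch of that workflow, here is a toy grid approximation of Bayes' theorem in Python; it is purely illustrative, not the computation JMP performs, and the data values and distribution choices are hypothetical.)

    import numpy as np
    from scipy import stats

    # Toy illustration: posterior(theta | data) is proportional to
    # likelihood(data | theta) times prior(theta).
    data = np.array([2.1, 3.4, 5.0])          # hypothetical observed lifetimes
    theta = np.linspace(0.1, 20.0, 2000)      # grid for an exponential mean
    likelihood = np.prod(stats.expon.pdf(data[:, None], scale=theta), axis=0)
    prior = stats.lognorm.pdf(theta, s=1.0, scale=5.0)   # a weakly informative prior
    posterior = likelihood * prior
    posterior /= posterior.sum() * (theta[1] - theta[0])   # normalize to integrate to one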
Here's the Bayes analysis of the
bearing cage. The priors are
specified here for the B10
or time at which 10% would fail.
We have a very wide interval
here. The range is effectively 1,000
hours up to 50,000 hours.
Everybody would agree that B10
is somewhere in that range. For
the Weibull shape parameter,
however, we're going to use an
informative prior distribution
based upon the engineers'
knowledge of the failure
mechanism and their vast
previous experience with that
mechanism. They can say with
little doubt that the Weibull
shape parameter should be
between 1.5 and 3, and
here's where we specify that
information. So instead of
specifying information for the
traditional Weibull parameters,
we've reparameterized, where now
the B10 is one of the
parameters, and here's the
specified range. And then we
have the informative prior for
the Weibull shape parameter
specified here. And then JMP
will generate samples from the
joint posterior, leading to
these parameter estimates and
confidence intervals shown here.
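(The reparameterization is just the Weibull quantile relationship t_p = eta * (-log(1 - p))^(1/beta); a minimal sketch with hypothetical numbers, not the bearing cage data:)

    import numpy as np

    def weibull_scale_from_b10(b10, beta):
        # Solve B10 = eta * (-log(0.9))**(1 / beta) for the Weibull scale eta.
        return b10 / (-np.log(0.9)) ** (1.0 / beta)

    # Hypothetical values: B10 = 10,000 hours with shape beta = 2
    # gives a Weibull scale of roughly 30,800 hours.
    eta = weibull_scale_from_b10(10_000.0, 2.0)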
Here's a graphical depiction of
the Bayes analysis. The black
points here are a sample from
the prior distribution, so again
very wide for the .1 quantile
and somewhat constrained for the
Weibull shape parameter beta. On
the right here, we have the
joint posterior, which in effect
is where the likelihood contours
and the prior sample
overlap. And then those draws
from the joint posterior are
used to compute estimates and
Bayesian credible intervals. So
here's the same profiler that we
saw previously where the
confidence interval was not
useful. After bringing in the
information about the Weibull
shape parameter, now we can see
that the confidence interval
ranges between 12% and 83%,
clearly illustrating that we
have missed the target of 10%.
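(That interval can be read directly off the posterior draws; a minimal sketch, assuming arrays b10_draws and beta_draws of posterior samples are available, for example exported from JMP:)

    import numpy as np

    def fraction_failing(t, b10_draws, beta_draws):
        # Weibull fraction failing at time t, evaluated at each posterior draw.
        eta = b10_draws / (-np.log(0.9)) ** (1.0 / beta_draws)
        return 1.0 - np.exp(-(t / eta) ** beta_draws)

    # F8000 = fraction_failing(8000.0, b10_draws, beta_draws)
    # np.percentile(F8000, [2.5, 50, 97.5])   # point estimate and 95% credible interval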
So what have we learned here?
With a small number of failures,
there's not much information
about reliability. But engineers
often have information that can
be used, and by using that prior
information, we can get improved
precision and more useful
inferences. And Bayesian methods
provide a formal method for
combining that prior information
with our limited data. Here's
another example. Rocket motor is
one of five critical components
in a missile. In this particular
application, there were
approximately 20,000 missiles
in the inventory. Over time, 1,940
of these missiles had been fired
and they all worked, except in
three cases, where there was
catastrophic failure. And these
were older missiles, and so
there was some concern that
there might be a wearout failure
mechanism that would put into
jeopardy the roughly 20,000
missiles remaining in inventory.
The failures were thought to be
due to thermal cycling, but
there was no information about
the number of thermal cycles. We
only have the age of the
missile when it was fired.
That's a useful surrogate, but
the effect of using a surrogate
like that is that you have more
variability in your data. Now
in this case, there were no
directly observed failure times.
When a rocket is called upon to
operate and it operates
successfully, all we know is
that they had not failed at the
age of those units when they
were asked to operate. And for
the units that failed
catastrophically, again, we
don't know the time that those
units failed. At some point
before they were called upon to
operate, they had failed, so all
we know is that the failure was
before the age at which it was
fired. So as I said, there was
concern that there was a wearout
failure mechanism kicking in
here that would put into
jeopardy the amount of remaining
life for the units in the
stockpile. So here's the table
of the data. Here we have the
units that operated
successfully, and so these are
right censored observations, but
these observations here
are the ones that failed and as
I said, at relatively higher
ages. This is the event plot of
the data and again, we can see
the right censored observations
here with the arrow pointing to
the right, and we can see the
left censored observations with
the arrow pointing to the left
indicating the region of
uncertainty. But even with those
data we can still fit a Weibull
distribution. And here's the
probability plot showing the
maximum likelihood estimate and
confidence bands. Here's more
information from the maximum
likelihood analysis. And here we
have the estimate of fraction
failing at 20 years, which was
the design life of these rocket
motors, and again the interval
is huge, ranging between 3% and
100%. Again, not very useful.
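(Even though the rocket motor data contain no exact failure times, the likelihood that was just maximized is still well defined: a right-censored unit of age t contributes 1 - F(t), a left-censored one contributes F(t), and an exact failure time would contribute the density f(t). A minimal Weibull sketch, illustrative only and not the JMP implementation:)

    import numpy as np
    from scipy import stats

    def weibull_loglik(eta, beta, right_cens, left_cens, exact=()):
        # Weibull with shape beta and scale eta.
        dist = stats.weibull_min(beta, scale=eta)
        ll = np.sum(dist.logsf(right_cens))          # survived past age t: log(1 - F(t))
        ll += np.sum(np.log(dist.cdf(left_cens)))    # failed before age t: log F(t)
        ll += np.sum(dist.logpdf(exact))             # exact failure times, if any
        return ll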
But the engineers,
knowing what the failure
mechanism was, again had
information about the Weibull
shape parameter. The maximum
likelihood estimate was
extremely large and the
engineers were pretty sure that
that was wrong, especially with
the extra variability in the
data that would tend to drive
the Weibull shape parameter to a
lower value. As I showed you on
the previous slide, confidence
interval for fraction failing at
20 years was huge. So once again,
we're going to specify a prior
distribution and then use that
in a Bayes analysis. Again, the
prior for B10, the time at which
10% will fail, is chosen to be
extremely wide. We don't really
want to assume anything there,
and everybody would agree that
that quantity is somewhere
between five years and 400
years. But for the Weibull shape
parameter, we're going to assume
that it's between one and five.
Again, we know it's greater than
one because it's a wear out
failure mechanism, and the
engineers were sure that it
wasn't anything like the number
8 that we had seen in the
maximum likelihood estimate.
And indeed, five is also a very
large Weibull shape parameter.
Once again, JMP is called upon
to generate draws from the
posterior distribution. And here are
plots similar to the ones that we
saw in the bearing cage example.
The black points here, again, are
a sample from the prior
distribution. Again very wide in
terms of the B10, but somewhat
constrained for the beta, so the
beta is an informative prior
distribution. And again the
contour plots represent the
information in the limited data.
And the posterior, once again, is
where we get overlap between the
prior and the likelihood and we
can see it here. So once again
we have a comparison between the
maximum likelihood interval,
which is extremely wide, and the
interval that we get for the
same quantity using the Bayes
inference, which incorporated the
prior information on the Weibull
shape parameter. And now the
interval ranges between .002 and
.098, or about .1. So at most about 10% failing, and that might be
acceptable. Some of the things
that we learned here, even
though there were no actual
failure times, we can still get
reliability information from the
data, but with very few failures
there isn't much information
there. But we can use the
engineer's knowledge about the
Weibull shape parameter to
supplement the data to get
useful inferences and JMP makes
this really easy to do. My last
two examples are about
accelerated testing. Accelerated
testing is a widely used
technique to get information
about reliability of components
quickly when designing a
product. The basic idea is to
test units at high levels of
variables like temperature or
voltage to make things fail
quickly and then to use a model
to extrapolate back down to the
use conditions. Extrapolation is
always dangerous and we have to
keep that in mind. That's the
reason we would like to have our
model be physically motivated.
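(For temperature acceleration, the physically motivated model is usually the Arrhenius relationship; a minimal sketch of the implied acceleration factor, with a hypothetical effective activation energy in eV:)

    import numpy as np

    K_B = 8.617e-5   # Boltzmann constant, eV per kelvin

    def arrhenius_acceleration_factor(temp_use_c, temp_accel_c, ea_ev):
        # How much faster the failure-causing reaction proceeds at the
        # accelerated temperature than at the use temperature.
        t_use, t_accel = temp_use_c + 273.15, temp_accel_c + 273.15
        return np.exp((ea_ev / K_B) * (1.0 / t_use - 1.0 / t_accel))

    # For example, 80 degrees C versus a 10 degree C use condition with a
    # hypothetical Ea of 0.7 eV gives a factor of roughly 300.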
So here's an example of an
accelerated life test
on a laser. Units were tested at
40, 60 and 80 degrees C, but the
use condition was 10 degrees C.
That's the nominal temperature
at the bottom of the Atlantic
Ocean, where these lasers were
going to be used in a new
telecommunications system. The
test lasted 5,000 hours, a little
bit more than six months. The
engineers wanted to estimate
fraction failing at about 30,000
hours. That's about 3.5 years
and again, at 10 degrees C.
Here are the results of the analysis. In order to
appropriately test and build the
model, JMP uses these three
different analyses. The first
one fits separate distributions
to each level of temperature.
The next model does the same
thing, except that it constrains
the shape parameter Sigma to be
the same at every level of
temperature. This is analogous
to the constant Sigma assumption
that we typically make in
regression analysis. And then
finally, we fit the regression
model, which in effect, is a
simple linear regression
connecting lifetime to
temperature. And to supplement
this visualization of these
three models, JMP does
likelihood ratio tests to test
whether there's evidence that
the Sigma might depend on
temperature and then to test
whether there's evidence of lack
of fit in the regression model.
And from the large P values
here, we can see that there's no
evidence against this model.
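(Those comparisons are likelihood ratio tests between nested models; a minimal sketch of the computation, with placeholder log-likelihood values rather than the values from this analysis:)

    from scipy import stats

    def likelihood_ratio_test(loglik_full, loglik_reduced, df):
        # Twice the drop in maximized log-likelihood, referred to a chi-square
        # distribution with df degrees of freedom (the number of constraints).
        statistic = 2.0 * (loglik_full - loglik_reduced)
        return statistic, stats.chi2.sf(statistic, df)

    # For example, the separate-sigmas model versus the common-sigma model:
    # likelihood_ratio_test(loglik_separate, loglik_common, df=2)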
Another way to plot the results
of fitting this model
is to plot lifetime versus
temperature on special scales: a logarithmic scale for hours of operation and what's known as an Arrhenius scale for temperature, corresponding to the Arrhenius model, which describes how temperature affects reaction rates and thereby lifetime. And these are the results of the maximum likelihood estimation for our model. The JMP
distribution profiler gives us
an estimate of the fraction
failing at 30,000 hours.
And we can see it ranges between
.002 and about .12, or 12%
failing. The engineers in
applications like this, however,
often have information about
what's known as the effective
activation energy of the failure
mechanism, and that corresponds
to the slope of the regression
line in the Arrhenius model. So
we did a Bayes analysis and in
that analysis, we made an
assumption about the effective
activation energy. And that's
going to provide more precision
for us. So what we have here is
a matrix scatterplot of the
joint posterior distribution
after having specified prior
distributions for the
parameters, weakly informative
for the .1 quantile at 40
degrees C. Again, everybody
would agree that that number is
somewhere between 100 and
32,000. Also weakly informative
for the lognormal shape
parameter. Again, everybody
would agree that that number is
somewhere between .05
and 20. But for the slope of the
regression line, we have an
informative prior ranging
between .6 and .8, based upon
previous experience with the
failure mechanism. And that leads
to this comparison, where now
on the right-hand side here, the
interval for fraction failing at
30,000 hours is much narrower
than it was with the maximum
likelihood estimate. In
particular, the upper bound now
is only about 4% compared with
12% for the maximum likelihood
estimates. So lessons learned.
Accelerated tests provide
reliability information quickly,
and engineers often have
information about the effective
activation energy. And that can
be used to either improve
precision or to reduce cost by
not needing to test so many
units. And once again, Bayesian
methods provide an appropriate
method to combine the engineers'
knowledge with the limited data.
My final example concerns an
accelerated life test of
an integrated circuit device. Units
were tested at high temperature
and the resulting data were
interval censored. That's
because failures were discovered
only during inspections that
were conducted periodically. In
this test, however, there were
only failures at the two high
levels of temperature. The goal
of the test was to estimate the
.01 quantile at 100 degrees C.
This is a table of the data
where we can see the failures at
250 and 300 degrees C.
and no failures, all right censored, at the three lower
levels of temperature. Now when
we did the maximum likelihood
estimation, in this case, we saw
strong evidence that the Weibull
shape parameter depended on
temperature. So the P value is
about .03. That turns out
to be evidence against the
Arrhenius model, and that's
because the Arrhenius model should
only scale time. But if you
change the shape parameter by
increasing temperature, you're
doing more than scaling time.
And so that's a problem, and it
suggested that at 300 degrees C,
there was a different failure
mechanism. And indeed, when the
engineers followed up and
determined the cause of failure
of the units at 250 and 300,
they saw that there was a
different mechanism at 300. What
that meant is that we had to
throw those data away. So what
do we do then? Now we've only
got failures at 250 degrees C
and JMP doesn't do very well
with that. Surprisingly, it
actually runs and gives
answers, but the confidence
intervals are enormously wide
here, as one would expect. But
the engineers knew what the
failure mechanism was and they
had had previous experience and
so they can bring that
information about the slope into
the analysis using Bayes
methods. So again, here's the
joint posterior and the width of
the distribution in the
posterior for beta 1 is
effectively what we assumed
when we put in a prior
distribution for that parameter.
So again, here's the
specification of the prior
distributions, where we used
weakly informative for the
quantile and for Sigma, but
informative prior distribution
for the slope beta 1. And I can
get an estimate of the time at
which 1% fail. So the lower end
point of the confidence interval
for the time at which 1% will fail
is more than 140,000 hours.
So that's about 20 years, much
longer than the technological
life of these products in which
this integrated circuit will be
used. So what did we learn here?
Well, in some applications we
have interval censoring because
failures are discovered only
when there's an inspection. We
need appropriate statistical
methods for handling such data,
and JMP has those methods. If
you use excessive levels of an
accelerating variable like
temperature, you can generate
new failure modes that make the
information misleading. So we
had to throw those units away.
But even with failures at only
one level of temperature, if we
have prior information
about the effective activation
energy, we can combine that
information with the limited
data to make useful inferences.
Finally, some concluding
remarks. Improvements in
computing hardware and software
have greatly advanced our ability
to analyze reliability and other
data. Now we can also use Bayes
methods, providing another set of
tools for combining information
with limited data and JMP has
powerful tools for doing this.
So, although these Bayesian
capabilities were developed for
the reliability part of JMP,
they can certainly be used in
other areas of application. And
here are some references,
including the 2nd edition of
Statistical Methods for Reliability Data,
which should be out probably in
June of 2021. OK, so now I'm
going to turn it over to Peng
and he's going to show you how
easy it is to do these analyses.
Thank you, professor.
Before I start my
demonstration, I would like
to show this slide about
Bayesian analysis workflow
in Life Distribution and
Fit Life by X.
First, you need to fit a
parametric model using maximum
likelihood. I assume you already
know how to do this in these two
platforms. Then you need to
review or find model
specification graphical user
interface for Bayesian
estimation within the report
from the previous step. For
example, this is a screenshot of the Weibull model in Life
Distribution. You need to go to the
red triangle menu
and choose Bayesian estimates
to reveal the graphical
user interface for the
Bayesian analysis.
In Fit Life by X, please see
the screenshot of a lognormal
result. And the graphical user
interface for the Bayesian
inference is at the last step.
After finding the graphical user
interface for Bayesian analysis,
you will need to supply the
information about the priors.
You need to decide the prior distributions for individual
parameters. You need to supply
the information for the
hyperparameters and additional
information such as the
probability for the quantile.
In addition to that, you need to provide the number of posterior samples. You also need to provide a random seed in case you want to replicate your results in the future. Then you can click Fit
Model. This will generate a
report of the model.
You can fit multiple models
in case you want to study the
sensitivity of different
Bayesian models given
different prior distributions.
The result of a Bayesian model includes the following: first, the method of sampling; then a copy of the priors; then the posterior estimates of the parameters; and then some scatterplots, for either the prior or the posterior samples. In the end, we have two profilers, one for the distribution and one for the quantile.
Using these results, you
can make further inferences
such as failure prediction.
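(For example, a failure prediction can be computed directly from the exported posterior samples; a minimal sketch, assuming a Weibull model and arrays eta_draws and beta_draws of posterior draws, which are not part of the demo itself:)

    import numpy as np

    rng = np.random.default_rng(1)

    def predict_failures(eta_draws, beta_draws, n_at_risk, t_now, t_future):
        # For each posterior draw: the conditional probability that a unit of age
        # t_now fails by t_future, then a binomial draw for the number of failures
        # among n_at_risk surviving units.
        F = lambda t: 1.0 - np.exp(-(t / eta_draws) ** beta_draws)
        p_cond = (F(t_future) - F(t_now)) / (1.0 - F(t_now))
        counts = rng.binomial(n_at_risk, p_cond)
        return np.percentile(counts, [5, 50, 95])   # a 90% prediction interval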
Now let's look at the demonstration. We will demonstrate with the last example that the professor mentioned in his presentation. It's the IC device data.
We have two columns for the time
to event HoursL and HoursU to
represent the censoring situation.
We have a count
for individual observation and
temperature for individual
observation. We exclude the last
four observations because they
are associated with a different
failure mode. We want to exclude
these observations from the
analysis. Now we start to specify the model in Fit Life by X.
We put HoursL and HoursU into Y.
We put Count into frequency.
We put Degrees C into X.
We use the Arrhenius Celsius
for our relationship.
We use lognormal for our
distribution.
Then click OK.
The result is the maximum
likelihood inference
for the lognormal. We go to the
Bayesian estimates, and
start to specify our priors like
Professor did in his
presentation. We choose a lognormal prior for the quantile at 250 degrees C. It's the B10 life, so the probability is 0.1. The two ends of the lognormal distribution are 100 and 10,000.
Now we specify the slope. The distribution is lognormal, and the two ends of the distribution are .65 and .85; because it's informative, the range is required to be narrow. Now we specify the prior distribution for Sigma, which is a lognormal with a wide range, .05 to 5.
We decide to draw 5,000 posterior samples and assign an arbitrary random seed. Then we click Fit Model. And here is the report generated for this specification. The method is simple rejection. And here's a copy of our prior specification.
The posterior estimates
summarize our posterior samples.
You can export the posterior samples either by clicking this Export Monte Carlo Samples option or by choosing it from the menu, which is here, Export Monte Carlo Samples.
Posterior samples are illustrated
in these scatterplots.
We have two scatterplots here. The first uses the same parameterization as the prior specification, which uses the quantile, the slope, and Sigma. The second scatterplot uses the traditional parameterization, which includes the intercept of the regression, the slope of the regression, and Sigma.
In the end, to make inferences, we can look at the profilers. Here let's look at the second profiler, the quantile profiler, so we can find the same result as what the Professor had shown in one of the previous slides. We enter 0.01 for probability, so this is 1%. We enter 100 degrees C for DegreesC.
And we adjust the axes. So now we see a profiler similar to the one that was in the previous slide.
And we can read off the Y axis
to get the result we want, which
is the time at which 1% of the devices will fail at 100 degrees
C. So this concludes my
demonstration. And let
me move on.
This slide explains
JMP's implementation of the sampling algorithm.
We have seen that simple rejection showed up in the previous example, and it is the first stage of our implementation. The simple rejection algorithm is a tried and true method for drawing samples, but it can be impractical if the rejection rate is high.
So if the rejection rate is high, our implementation switches to the second stage, which is a Random Walk Metropolis-Hastings algorithm. The second algorithm is efficient, but in some situations it can fail undetectably if the likelihood is irregular, for example, if the likelihood is rather flat. We designed this implementation because we have situations where there are very few failures or even no failures. In that situation the likelihood is relatively flat, but we use the simple rejection algorithm, the rejection rate is not that bad, and this method will suffice.
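(A minimal sketch of the simple rejection idea, illustrative only and not the JMP code: draw candidates from the prior and accept each one with probability proportional to its likelihood.)

    import numpy as np

    rng = np.random.default_rng(0)

    def simple_rejection(prior_draws, loglik):
        # prior_draws: candidate parameter vectors drawn from the prior, one per row.
        # loglik: function returning the log-likelihood of the data at a candidate.
        ll = np.array([loglik(theta) for theta in prior_draws])
        accept_prob = np.exp(ll - ll.max())   # the best candidate is accepted with probability 1
        keep = rng.random(len(prior_draws)) < accept_prob
        return prior_draws[keep]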
When we have more and more failures, the likelihood becomes more regular, so it has a peak in the middle. In that situation, the simple rejection method becomes impractical because of the high rejection rate, but the Random Walk algorithm becomes more and more likely to succeed without failure. So this is our implementation and the explanation of why we do that.
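(And a minimal sketch of the second stage, a Random Walk Metropolis-Hastings sampler; again illustrative only, with an assumed log_post function that returns the log of likelihood times prior.)

    import numpy as np

    rng = np.random.default_rng(0)

    def random_walk_mh(log_post, start, step, n_draws):
        # start: initial parameter vector; step: proposal standard deviations.
        draws = np.empty((n_draws, len(start)))
        current = np.asarray(start, dtype=float)
        current_lp = log_post(current)
        for i in range(n_draws):
            proposal = current + step * rng.normal(size=len(current))
            proposal_lp = log_post(proposal)
            # Accept with probability min(1, posterior ratio).
            if np.log(rng.random()) < proposal_lp - current_lp:
                current, current_lp = proposal, proposal_lp
            draws[i] = current
        return draws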
This slide explains how we specify a truncated normal prior in these two platforms, because truncated normal is not a built-in prior distribution in these two platforms.
First, look at what a truncated normal is. Here we give an example of a truncated normal with two ends at 5 and 400. The two ends are illustrated by this L and this R. A truncated normal is nothing but a normal with all of the negative values discarded; the full curve here represents the equivalent normal distribution for this particular truncated normal distribution. If we want to specify this truncated normal, we need to define its equivalent normal distribution, with two ends, that has the same mu and Sigma parameters as those of the truncated normal distribution. So we provide a script to do this.
In this script, the calculation is the following. We find the mu and Sigma from the truncated normal, so we can get the equivalent normal distribution. And with the mu and Sigma of this normal distribution, we can find the two ends of this normal distribution.
So to specify the truncated normal with its two end values, we specify the equivalent normal distribution with these two end values. This is how we specify a truncated normal in these two platforms, using an equivalent normal distribution. And here's the content of the script.
All you need to do is call a function, which here converts the two ends of the truncated normal to the corresponding normal values. What you need to provide are the two ends of the truncated normal distribution; it will give the two ends of the equivalent normal distribution, and you can use those two numbers to specify the prior distribution.
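(Here is a minimal sketch of the same kind of calculation in Python, not the JSL script itself. It assumes the two ends are treated as the 0.025 and 0.975 quantiles and that the truncation is at zero; the conventions in the actual script may differ.)

    import numpy as np
    from scipy import stats, optimize

    P_LO, P_HI = 0.025, 0.975   # assumed probability content of the two ends

    def tnorm_to_normal(lo, hi):
        # Find mu and sigma of a normal truncated at zero whose P_LO and P_HI
        # quantiles are lo and hi, then return those same quantiles for the
        # untruncated normal with the same mu and sigma.
        def eqs(params):
            mu, log_sigma = params
            sigma = np.exp(log_sigma)      # keeps sigma positive during the solve
            a = (0.0 - mu) / sigma         # standardized truncation point
            q = stats.truncnorm.ppf([P_LO, P_HI], a, np.inf, loc=mu, scale=sigma)
            return q - np.array([lo, hi])
        mu, log_sigma = optimize.fsolve(eqs, x0=[(lo + hi) / 2.0, np.log((hi - lo) / 4.0)])
        return stats.norm.ppf([P_LO, P_HI], loc=mu, scale=np.exp(log_sigma))

    # For example, tnorm_to_normal(5.0, 400.0) returns the two end values to
    # enter for the equivalent normal distribution.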
So this concludes my demonstration. In my demonstration I showed how to start a Bayesian analysis in Life Distribution and Fit Life by X, how to enter prior information, and what the content of a Bayesian result is. I also explained our implementation of the sampling and why we do it that way. And in the end I explained how we specify a truncated normal prior using an equivalent normal prior in these two platforms. Thank you.