
Case Study – The Use of Gaussian Process for Analyzing Computer Generated Experiments

Level: Intermediate

Gaussian Process (GP) modeling is one of several analysis techniques used to build approximation models for computer generated experiments.  In these experiments, all the parameters/variables are derived from, or pulled directly from, first principles physics models/equations, so the data generated is deterministic and likely to be highly non-linear.  For that reason, a space filling design is generally used to guide the computer experimentation effort.  This is where GP comes into play.  Because the data is deterministic, a GP model interpolates: it fits every point in the design exactly, which allows a close approximation of the true model.  We will compare GP to Response Surface and Neural Net models.  We will also compare GP models derived from different types of space filling designs.

Gaussian Process

  • Typically used to build models for computer simulation experiments.
  • Data is deterministic so there is no need to run an experiment more than once.  A given set of inputs will always produce the same answer.
  • Also known as kriging.
  • With more than 100 conditions (runs), computing a solution can take a long time.  For larger data sets, JMP Pro has Fast GASP, which breaks the GP into blocks to allow faster computation.
  • You can also have categorical inputs with JMP Pro.

 

Model Options for Gaussian Fit

  • Estimate Nugget Parameter – This is useful if there is noise or randomness in the response and you would like the prediction model to smooth over the noise instead of fitting it perfectly.  Highly recommended.
  • Correlation Type – lets you choose the correlation structure used in the model
    • Gaussian – allows the correlation between two responses to always be non-zero, no matter the distance between the points.
    • Cubic – allows the correlation between two responses to be zero for points far enough apart.
  • Minimum Theta Value – lets you set the minimum theta value used in the fitted model.
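
To make the nugget and theta options concrete, here is a minimal sketch of a GP predictor with a Gaussian correlation function, written in plain Python.  It is not JMP's implementation: the function names (gaussian_corr, solve, fit_gp) are hypothetical, theta is fixed by hand rather than estimated from the data, and the nugget is simply added to the diagonal of the correlation matrix, which is an assumed simplification.

```python
import math

def gaussian_corr(a, b, theta):
    """Gaussian correlation: exp(-sum_k theta_k * (a_k - b_k)^2)."""
    return math.exp(-sum(t * (u - v) ** 2 for t, u, v in zip(theta, a, b)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_gp(X, y, theta, nugget=0.0):
    """Return a predictor mu + r(x)' R^-1 (y - mu*1) for fixed theta.

    Assumption: the nugget is added to the diagonal of R only.
    """
    n = len(X)
    R = [[gaussian_corr(X[i], X[j], theta) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    Rinv_y = solve(R, y)
    Rinv_1 = solve(R, [1.0] * n)
    mu = sum(Rinv_y) / sum(Rinv_1)                      # estimated mean
    weights = [a - mu * b for a, b in zip(Rinv_y, Rinv_1)]  # R^-1 (y - mu*1)

    def predict(x):
        return mu + sum(w * gaussian_corr(x, xi, theta)
                        for w, xi in zip(weights, X))
    return predict

# Deterministic "simulator" stand-in: six evenly spaced runs of sin(2*pi*x).
X = [(i / 5,) for i in range(6)]
y = [math.sin(2 * math.pi * x[0]) for x in X]
theta = [25.0]  # assumed value for illustration only

exact = fit_gp(X, y, theta)               # no nugget: reproduces each run
smooth = fit_gp(X, y, theta, nugget=0.1)  # nugget: smooths over the runs
for xi, yi in zip(X, y):
    print(f"x={xi[0]:.1f}  y={yi:+.3f}  "
          f"no nugget={exact(xi):+.3f}  nugget={smooth(xi):+.3f}")
```

Without the nugget, the predictions at the design points match the responses; with it, they no longer do, which is the smoothing behavior described above.  A larger theta makes the correlation fall off faster with distance.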

 

Variance vs. Bias

  • For most designed experiments, the goal is to minimize the variance of prediction.  Because computer experiments are deterministic, there is no variance, but there is bias.
  • Bias is the difference between the approximation model and the true mathematical function.  Space filling designs are used in an effort to bound the bias.

 

Borehole Example

Borehole.png
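
For readers who want to reproduce the example outside JMP, the sketch below codes the borehole test function as it is commonly stated in the computer experiments literature, assuming that is the model shown in the figure.  The input ranges are the ones usually quoted for this test problem and may not match the JMP sample data exactly (which may, for instance, use transformed factors).  The names borehole and BOREHOLE_RANGES are illustrative.

```python
import math

# Commonly quoted input ranges for the borehole test function: (low, high).
BOREHOLE_RANGES = {
    "rw": (0.05, 0.15),         # radius of borehole (m)
    "r": (100.0, 50000.0),      # radius of influence (m)
    "Tu": (63070.0, 115600.0),  # transmissivity of upper aquifer (m^2/yr)
    "Hu": (990.0, 1110.0),      # potentiometric head of upper aquifer (m)
    "Tl": (63.1, 116.0),        # transmissivity of lower aquifer (m^2/yr)
    "Hl": (700.0, 820.0),       # potentiometric head of lower aquifer (m)
    "L": (1120.0, 1680.0),      # length of borehole (m)
    "Kw": (9855.0, 12045.0),    # hydraulic conductivity of borehole (m/yr)
}

def borehole(rw, r, Tu, Hu, Tl, Hl, L, Kw):
    """Water flow rate through the borehole (m^3/yr)."""
    log_ratio = math.log(r / rw)
    numerator = 2.0 * math.pi * Tu * (Hu - Hl)
    denominator = log_ratio * (
        1.0 + 2.0 * L * Tu / (log_ratio * rw ** 2 * Kw) + Tu / Tl
    )
    return numerator / denominator

# Evaluate at the midpoint of every range; repeated calls return the same
# value, which is why a computer experiment needs no replicate runs.
midpoint = {name: (lo + hi) / 2.0 for name, (lo, hi) in BOREHOLE_RANGES.items()}
print(borehole(**midpoint))
```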

Types of Space Filling Designs in JMP

  • Sphere Packing – maximizes the minimum distance between design points.
  • Latin Hypercube – maximizes the minimum distance between design points but requires even spacing of the levels for each factor.
  • Uniform – minimizes the discrepancy between the design points and a theoretical uniform distribution.
  • Minimum Potential – spreads points inside a sphere around a centroid.
  • Maximum Entropy – chooses design points that maximize entropy, a measure of the amount of information contained in the distribution of a set of data.
  • Gaussian Process IMSE Optimal – creates a design that minimizes the integrated mean square error (IMSE) of the Gaussian Process over the experimental region.
  • Fast Flexible Filling (FFF) – FFF method uses clusters of random points to choose design points according to an optimization criterion.  Can be constrained.
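
As a rough illustration of the Latin Hypercube idea (evenly spaced levels for each factor, with the minimum distance between points made as large as possible), the sketch below generates many random Latin hypercubes on the unit cube and keeps the one with the largest minimum inter-point distance.  This crude random search is an assumption made for illustration; it is not the optimization JMP uses, and the function names are hypothetical.

```python
import itertools
import math
import random

def latin_hypercube(n_runs, n_factors, rng):
    """One random Latin hypercube on [0, 1]^k.

    Each factor uses the evenly spaced levels 0, 1/(n-1), ..., 1 exactly once.
    """
    levels = [i / (n_runs - 1) for i in range(n_runs)]
    columns = []
    for _ in range(n_factors):
        col = levels[:]
        rng.shuffle(col)
        columns.append(col)
    return [tuple(col[i] for col in columns) for i in range(n_runs)]

def min_distance(design):
    """Smallest Euclidean distance between any two design points."""
    return min(math.dist(a, b) for a, b in itertools.combinations(design, 2))

def maximin_lhs(n_runs, n_factors, n_tries=200, seed=1):
    """Keep the random Latin hypercube with the largest minimum distance."""
    rng = random.Random(seed)
    candidates = (latin_hypercube(n_runs, n_factors, rng)
                  for _ in range(n_tries))
    return max(candidates, key=min_distance)

design = maximin_lhs(n_runs=10, n_factors=3)
for point in design:
    print(tuple(round(v, 3) for v in point))
print("minimum distance:", round(min_distance(design), 3))
```

To use such a design with real factors, each column would be rescaled from [0, 1] to that factor's low and high settings.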

 

Summary of Fit

  • Fit the Gaussian Process model both with and without the Nugget Parameter and check the Jackknife fit of each.
  • Neural Net models offer a good alternative to Gaussian models but can be more complicated.  NN models sometimes outperform Gaussian models.
  • Use the smoothing function for Neural Nets (available in JMP Pro).
  • Don’t rely on R² alone when deciding on the best fit model.
  • Picking the right model is about keeping the model as simple as possible while still getting reasonable prediction.
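
The sketch below shows why a jackknife (leave-one-out) check matters for interpolating models: each run is predicted from a model fit without it.  A nearest-neighbor predictor is used here only as a simple stand-in for any model that fits the design points exactly; it is not a GP, and the function names are hypothetical.

```python
import math

def jackknife_predictions(fit, X, y):
    """Leave-one-out predictions: refit without run i, then predict run i."""
    preds = []
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(model(X[i]))
    return preds

def rmse(actual, predicted):
    """Root mean square error between actual and predicted responses."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """1 - SS_residual / SS_total."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def fit_nearest_neighbor(X, y):
    """Stand-in interpolating model: return the response of the closest run."""
    def predict(x):
        j = min(range(len(X)), key=lambda k: math.dist(x, X[k]))
        return y[j]
    return predict

# Ten deterministic runs of a one-factor stand-in "simulator".
X = [(i / 9,) for i in range(10)]
y = [math.sin(2 * math.pi * x[0]) for x in X]

in_sample = [fit_nearest_neighbor(X, y)(x) for x in X]
held_out = jackknife_predictions(fit_nearest_neighbor, X, y)
print("in-sample R^2:", r_squared(y, in_sample))
print("jackknife R^2:", r_squared(y, held_out))
print("jackknife RMSE:", rmse(y, held_out))
```

Because the stand-in model reproduces every run it was fit to, its in-sample R² is perfect and says nothing about prediction quality; the jackknife values are the ones worth comparing across models.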

 

Gaussian Process Resources

Comparison of different GP packages - from 2017

Borehole model example: JMP 14 DOE Guide, Chapter 21, p. 637.

Discovery Summit 2011 Presentation: Meta-Modeling of Computational Models – Challenges and Opportunities