JMP Blog

A blog for anyone curious about data visualization, design of experiments, statistics, predictive modeling, and more
Victor_G
Super User
Exploring space filling designs part 2: Comparative framework and evaluation metrics

Introduction

Welcome back to our space filling DOE series! In Part 1, we explored the fundamentals and common types of space filling designs. Now we dive into the heart of our comparative study: establishing a robust framework to evaluate and compare these designs across different scenarios.


The comparative study framework

To fairly assess space filling designs, we need standardized conditions and meaningful metrics. Our approach evaluates designs across multiple dimensions of performance.

  1. Responses

Space coverage metric/uniformity response

  • Discrepancy: Measures the deviation from a theoretical uniform distribution. The goal is to minimize the difference between the distribution generated by the design and a uniform distribution for each factor. The discrepancy response allows the space coverage and uniformity of the design points to be evaluated: the lower the discrepancy, the more uniformly the design points are distributed (and the better the coverage of the experimental space).

    Once the factors have been entered in the Space Filling platform and the type of design has been chosen, the discrepancy value can be found in the Design Diagnostics panel (Figure 1).

    Figure 1: Space filling design panel. Design diagnostics of a fast flexible filling design for two factors with 20 runs. The discrepancy (red) and MaxPro (blue) values are displayed after design generation in the Design Diagnostics panel.

    To better understand the discrepancy value and how it relates to the distances between design points, let’s compare two space filling designs - Latin hypercube and fast flexible filling - created for six factors with 60 runs each (Figure 2).

    Figure 2: Visualization of the minimum distance between points for a fast flexible filling and a Latin hypercube design.

    If we compare the minimum distance between points for the two designs, we can clearly see that the Latin hypercube design provides a very narrow distribution of minimum distances between design points. This behavior reflects the better (lower) discrepancy of the Latin hypercube design compared with the fast flexible filling design for this specific dimensionality and number of points: the design points are uniformly distributed, resulting in a stable minimum distance between points.
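As a rough illustration of what a discrepancy value measures, here is a minimal sketch in plain Python. It uses the centered L2 discrepancy, which is an assumption on my side: the exact formula behind the value in the Design Diagnostics panel is not described in this post, so the numbers will not necessarily match JMP. The function name `centered_l2_discrepancy` is illustrative, and points are assumed to be scaled to the unit hypercube [0, 1]^d.

```python
from math import prod


def centered_l2_discrepancy(points):
    """Squared centered L2 discrepancy of points scaled to [0, 1]^d.

    Assumption: this is one common discrepancy definition, not
    necessarily the one JMP reports. Lower values mean the points
    are closer to a uniform distribution.
    """
    n = len(points)
    d = len(points[0])

    # Single sum: how each point sits relative to the center of the cube.
    term1 = sum(
        prod(1 + 0.5 * abs(x - 0.5) - 0.5 * abs(x - 0.5) ** 2 for x in p)
        for p in points
    )

    # Double sum: how every pair of points sits relative to each other.
    term2 = sum(
        prod(
            1 + 0.5 * abs(a - 0.5) + 0.5 * abs(b - 0.5) - 0.5 * abs(a - b)
            for a, b in zip(p, q)
        )
        for p in points
        for q in points
    )

    return (13 / 12) ** d - (2 / n) * term1 + term2 / n**2


# Two illustrative 2D designs with four points each (made-up coordinates).
spread_out = [(0.125, 0.625), (0.375, 0.125), (0.625, 0.875), (0.875, 0.375)]
clustered = [(0.10, 0.10), (0.12, 0.11), (0.11, 0.13), (0.13, 0.12)]

print(centered_l2_discrepancy(spread_out) < centered_l2_discrepancy(clustered))
```

The comparison only makes sense between designs with the same number of points and factors, which is also how the discrepancy is used in this study.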

 

Projection property response

  • MaxPro: Evaluates space filling properties on projections to all possible subsets of factors. The goal is to minimize the MaxPro criterion to ensure that distances between points are the highest in all dimensions, ensuring good projections of design points in lower dimensional spaces. Therefore, the only way to successfully minimize this criterion is by ensuring that no two points share similar coordinate values in any dimension.

    Once the factors have been entered in the Space Filling platform and the type of design has been chosen, the MaxPro value can be found in the Design Diagnostics panel (Figure 1).
    The MaxPro statistic is undefined for sphere packing designs because two points can have identical values in one dimension, which makes a coordinate difference in the denominator of the criterion equal to zero.

    To better understand the MaxPro value and how it relates to the projection property of designs, let’s compare two space filling designs - Latin hypercube and fast flexible filling - created for six factors with 60 runs each.
    If we project the design points of these two space filling designs in two dimensions, we can compare differences in projection (Figure 3).

    Figure 3: Visualization of projection property of two space filling designs with six factors and 60 runs. Note how the Waern links connect all design points in the Latin hypercube scenario based on their actual proximities, but not for the fast flexible filling design.

    For the Latin hypercube design on the left, distances between design points are homogeneous and small. This low and homogeneous distance between design points reduces the spread of points across all projections, increasing the MaxPro criterion value. For the fast flexible filling design on the right, distances between design points are heterogeneous and higher, corresponding to a lower risk for these points to collapse when projected into fewer dimensions.

    When considering different sample sizes, discrepancy and MaxPro values are strongly negatively correlated. A design with very low discrepancy can exhibit a high MaxPro value: because of the large sample size, distances between design points are small, causing a higher risk for these points to be close to each other in a lower dimensional space. Therefore, the discrepancy and MaxPro criteria are mostly used to compare different designs with the same sample size.
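To make the MaxPro idea concrete, here is a minimal sketch in plain Python. It assumes the form commonly given for maximum projection designs: the average, over all pairs of points, of the reciprocal of the product of squared coordinate differences, raised to the power 1/p (p being the number of factors). Whether JMP applies the same normalization is an assumption, so values may differ from the Design Diagnostics panel. The function name `maxpro_criterion` is illustrative.

```python
from itertools import combinations
from math import inf, prod


def maxpro_criterion(points):
    """MaxPro-style criterion (smaller is better).

    Assumption: uses the commonly published form; JMP's reported
    value may be scaled differently.
    """
    if len(points) < 2:
        raise ValueError("At least two design points are required.")
    p = len(points[0])

    total = 0.0
    n_pairs = 0
    for a, b in combinations(points, 2):
        # Product of squared differences over every factor.
        denom = prod((ai - bi) ** 2 for ai, bi in zip(a, b))
        if denom == 0.0:
            # Two points share a coordinate: the criterion is undefined
            # (division by zero), reported here as infinity.
            return inf
        total += 1.0 / denom
        n_pairs += 1

    return (total / n_pairs) ** (1.0 / p)


# Illustrative points (made-up coordinates).
far_apart = [(0.0, 0.0), (1.0, 1.0)]
close = [(0.0, 0.0), (0.5, 0.5)]
shared_coordinate = [(0.0, 0.2), (0.0, 0.8)]

print(maxpro_criterion(far_apart), maxpro_criterion(close))
print(maxpro_criterion(shared_coordinate))
```

Because every factor contributes to the product, a pair of points that is close in even one dimension inflates the criterion, which is why minimizing it pushes points apart in all projections.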

 

Design generation complexity

  • HP Time(): The function HP Time() returns a high precision time value in microseconds. Since this function is only useful when used relative to another HP Time() value, it is included in the design generation script. The example here creates a sphere packing design with 60 runs for three factors; the two HP Time() calls placed before and after the DOE() call record the generation time, which Show() then displays:
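    // Record the start time (in microseconds)
    t1 = HP Time();
    
    // Generate a sphere packing design: three continuous factors, 60 runs
    DOE(
       Space Filling Design,
       {Add Response( Maximize, "Y", ., ., . ),
       Add Factor( Continuous, -1, 1, "X1", 0 ),
       Add Factor( Continuous, -1, 1, "X2", 0 ),
       Add Factor( Continuous, -1, 1, "X3", 0 ),
       Set Random Seed( 138892707 ),
       Space Filling Design Type( Sphere Packing, 60 ), Simulate Responses( 0 ),
       Set Run Order( Randomize ), Make Table}
    );
    
    // Record the end time
    t2 = HP Time();
    
    // Display the elapsed generation time (in microseconds) in the log
    Show( t2 - t1 );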
    The generation time (t2-t1) can be seen in the log (Figure 4).

    Figure 4: Log result of the script with the display of generation time.

    A transformed column, Time (s), is used as the response, because seconds are easier to read and to compare between designs than microseconds, which makes the design generation time easier to assess. The column Time (s) is created with the following formula:
    Time (s) = HP Time (µs) / 1000000
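For readers who post-process timings outside JMP, the same "measure before, measure after, convert to seconds" pattern can be sketched in Python. This is only an analogy to the JSL script above, not part of the study workflow; the helper names `microseconds_to_seconds` and `time_call_seconds` are hypothetical.

```python
import time


def microseconds_to_seconds(microseconds):
    """Apply the conversion Time (s) = HP Time (µs) / 1000000."""
    return microseconds / 1_000_000


def time_call_seconds(func, *args, **kwargs):
    """Run func and return (result, elapsed time in seconds)."""
    start_ns = time.perf_counter_ns()
    result = func(*args, **kwargs)
    end_ns = time.perf_counter_ns()
    # perf_counter_ns works in nanoseconds; convert to microseconds first
    # to mirror the unit returned by HP Time(), then to seconds.
    elapsed_us = (end_ns - start_ns) / 1_000
    return result, microseconds_to_seconds(elapsed_us)


# Stand-in workload; a real use would wrap the design generation call.
total, seconds = time_call_seconds(sum, range(100_000))
print(f"Elapsed: {seconds:.6f} s")
```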

 

  2. Factors

To cover the widest range of space filling design scenarios, we need to specify factors that produce designs differing in generation method, dimensionality, and density of points.

  • Design type: As already seen in the first post in this series, different design generation methods exist for space filling designs, with different distributional or geometrical emphasis for different use cases and benefits. In the context of this study, we explore the most common types of methods to generate space filling designs, using uniform, sphere packing, Latin hypercube, and fast flexible filling designs.

  • Dimensionality (number of factors): As the dimensionality increases, the performance of space filling designs may decrease, as it becomes more complex and time-consuming to maintain a uniform distribution for each factor and good projection properties. To study the impact of dimensionality on the different responses listed, we use three levels:
    • Low-dimensional: Three factors
    • Medium-dimensional: Six factors
    • High-dimensional: Nine factors
  • Sample size variations (number of runs/factors): By default, JMP uses a ratio of 10 runs per factor, enabling good design performance and ensuring easy modeling. However, depending on the experimental budget allowed, the performance of space filling designs can vary greatly, as it can become complex to maintain a uniform distribution for each factor or good projection properties with a low number of design points. To account for this factor, we test multiple sample sizes:

    • Small: Three design points per factor, a situation that is similar to what can be obtained with classical central composite designs (CCD), where the points are distributed so that each factor is tested at its minimum, middle, and maximum values.
    • Medium: Twenty design points per factor, twice the JMP default ratio of 10 design points per factor.
    • Large: One hundred design points per factor, corresponding to the need for building highly predictive models, thanks to a high density of points in the design space.
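The number of runs of each space filling design follows directly from these two factors: runs = number of factors × design points per factor. A small sketch (the names `FACTOR_LEVELS`, `RATIO_LEVELS`, and `run_counts` are illustrative) shows the range of design sizes covered by the study:

```python
# Levels taken from the study description above.
FACTOR_LEVELS = (3, 6, 9)      # low, medium, high dimensionality
RATIO_LEVELS = (3, 20, 100)    # small, medium, large sample size per factor


def run_counts(factor_levels=FACTOR_LEVELS, ratio_levels=RATIO_LEVELS):
    """Return {(factors, ratio): number of runs} for every combination."""
    return {
        (factors, ratio): factors * ratio
        for factors in factor_levels
        for ratio in ratio_levels
    }


for (factors, ratio), runs in run_counts().items():
    print(f"{factors} factors x {ratio} points per factor = {runs} runs")
```

The smallest design has 9 runs (three factors, three points per factor) and the largest has 900 runs (nine factors, one hundred points per factor).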

 

  3. Experimental procedure: Design choice

We want to use a very simple design that will help compare the different factor levels easily. Moreover, as the design generation seed is not fixed, we want to estimate algorithm variability and separate this aleatoric variability from any factor effects. In addition, some combinations could be challenging to realize for certain design configurations, so we want a design that enables a robust analysis even in the presence of possible outliers or missing values.

For all these reasons, a full factorial design is chosen. To create a full factorial design in JMP, the path is easy and straightforward: in the DOE menu, choose Classical and then Full Factorial Design (Figure 5).

Figure 5: DOE menu for creating a full factorial design.


Using the responses and factors defined earlier, we can then create our design to evaluate and compare different space filling configurations (Figure 6).

 

Figure 6: Creating a full factorial design to evaluate and compare different space filling scenarios.


Since a full factorial design includes all possible factor level combinations, 36 runs are needed (4 × 3 × 3).
The script to generate the full factorial design is:

DOE(
	Full Factorial Design,
	{Add Response( Minimize, "Discrepancy", ., ., . ),
	Add Response( Minimize, "MaxPro", ., ., . ),
	Add Response( Minimize, "Time", ., ., ., ., ., "s" ),
	Add Factor(
		Categorical,
		{"Fast Flexible", "Uniform", "Sphere Packing", "Latin Hypercube"},
		"Type",
		0
	), Add Factor( Continuous, {3, 6, 9}, "Factors", 0 ),
	Add Factor( Continuous, {3, 20, 100}, "Ratio exp/factors", 0 ),
	Set Random Seed( 31068506 ), Make Design, Simulate Responses( 0 ),
	Set Run Order( Randomize ), Make Table}
);

 

The data table obtained with this script can be seen in Figure 7.

Figure 7: Full factorial design table
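To see where the 36 runs come from, the full factorial structure can be sketched in plain Python by crossing the four design types with the three levels of each continuous factor. This is only an illustration of the design structure: the function name `full_factorial` is hypothetical, and the shuffled run order will not reproduce the order generated by JMP.

```python
import random
from itertools import product

# Factor levels taken from the JSL script above.
DESIGN_TYPES = ("Fast Flexible", "Uniform", "Sphere Packing", "Latin Hypercube")
N_FACTORS = (3, 6, 9)
RATIOS = (3, 20, 100)


def full_factorial(seed=None):
    """Return all Type x Factors x Ratio combinations in random order."""
    runs = [
        {"Type": design_type, "Factors": n_factors, "Ratio exp/factors": ratio}
        for design_type, n_factors, ratio in product(DESIGN_TYPES, N_FACTORS, RATIOS)
    ]
    # Randomize the run order, as Set Run Order( Randomize ) does in JMP.
    random.Random(seed).shuffle(runs)
    return runs


design = full_factorial(seed=1)
print(len(design))
print(design[0])
```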

 

Response evaluation

For each row in the design, a space filling design is created with the corresponding parameter values: design type, number of continuous factors, and number of points per factor. The discrepancy and MaxPro values, along with the generation time, are recorded for all design configurations.

Analysis framework

To analyze the results of this comprehensive study on space filling designs, two approaches are considered:

  • Ranking analysis approach: To simplify the comparison of the performance of the space filling design types across different configurations (dimensionality and sample size) for the three responses, the designs are ranked from 1 (best design) to 4 (worst design).

    All three responses (discrepancy, MaxPro, and time) are to be minimized, so the ranking formulas use the function Col Rank, which assigns rank 1 to the lowest value. The <<Tie argument is set to “average” so that design configurations with the same performance receive the average of their tied ranks. The By variables are “Factors” and “Ratio exp/factors”, so that the design types are ranked within each dimensionality and sample size scenario.

    • Formula for discrepancy ranking:
      Col Rank( :Discrepancy, :Factors, :"Ratio exp/factors"n, <<Tie( "average" ) )
    • Formula for time ranking:
      Col Rank( :Time, :Factors, :"Ratio exp/factors"n, <<Tie( "average" ) )
    • Formula for MaxPro ranking:
      Col Rank( :MaxPro, :Factors, :"Ratio exp/factors"n, <<Tie( "average" ) )
  • Modeling approach: The modeling approach uses the model script attached to the data table after design generation. As the study design is a full factorial design, the default a priori model specified in this script involves all main effects (design type, number of factors, and ratio of experiments per factor) and all two-factor interactions (Figure 8).

    Figure 8: Default Fit Model launch. The option Fit Separately is checked to fit a separate model for each response variable.

    As three levels are used for the continuous factors, the modeling script is modified to account for curvature effects due to quadratic effects of the continuous factors (Figure 9).

    Figure 9: Modified Fit Model launch. The model script is modified by adding quadratic effects for continuous factors. Select the two continuous factors and then click on Macros>Polynomial to Degree to add the two quadratic effects.

    The model used to analyze the performance responses of the space filling designs includes main effects, two-factor interactions, and quadratic effects for the continuous factors.
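The ranking step described in the ranking analysis approach (rank 1 for the lowest value, average rank for ties, computed within each Factors × Ratio group) can be sketched in plain Python. This is an illustration of the logic, not a replacement for the Col Rank formulas. The function name `rank_within_groups` is hypothetical, and the discrepancy values below are made-up placeholders, not results from the study.

```python
from collections import defaultdict


def rank_within_groups(rows, value_key, group_keys):
    """Rank rows by value_key inside each group (1 = lowest value).

    Tied values receive the average of the ranks they span.
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[tuple(row[key] for key in group_keys)].append(index)

    ranks = [None] * len(rows)
    for indices in groups.values():
        ordered = sorted(indices, key=lambda i: rows[i][value_key])
        pos = 0
        while pos < len(ordered):
            # Find the end of the block of tied values starting at pos.
            end = pos
            while (
                end + 1 < len(ordered)
                and rows[ordered[end + 1]][value_key] == rows[ordered[pos]][value_key]
            ):
                end += 1
            # Positions are 0-based, ranks are 1-based.
            average_rank = ((pos + 1) + (end + 1)) / 2
            for k in range(pos, end + 1):
                ranks[ordered[k]] = average_rank
            pos = end + 1
    return ranks


# Placeholder values for one scenario (3 factors, 20 points per factor).
rows = [
    {"Type": "Fast Flexible", "Factors": 3, "Ratio": 20, "Discrepancy": 0.04},
    {"Type": "Uniform", "Factors": 3, "Ratio": 20, "Discrepancy": 0.01},
    {"Type": "Sphere Packing", "Factors": 3, "Ratio": 20, "Discrepancy": 0.04},
    {"Type": "Latin Hypercube", "Factors": 3, "Ratio": 20, "Discrepancy": 0.02},
]

print(rank_within_groups(rows, "Discrepancy", ("Factors", "Ratio")))
# → [3.5, 1.0, 3.5, 2.0]
```

In this placeholder scenario, the two tied design types share ranks 3 and 4 and therefore both receive 3.5.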

 

Coming next

In our final post, we reveal the results of this comprehensive comparison! We show which space filling designs excel in different scenarios and provide practical guidelines for design selection based on experimental objectives and constraints.

This is the second part of a three-part series on space filling designs of experiments:

  1. Introduction to common types of space filling designs.
  2. Comparative framework and evaluation metrics (this post).
  3. Results, analysis, and design selection guidelines.
Last Modified: May 22, 2026 11:08 AM
Comments
ktbrickey22
Level II

Hi Victor, thanks so much for the insightful post! Is it possible for JMP to calculate the space filling design diagnostics, e.g. the discrepancy and MaxPro criteria, for space-filling designs not generated by JMP? It doesn't look like Evaluate Design offers these statistics as an option, just curious if there's another easy route for this evaluation. 

Victor_G
Super User

Hi @ktbrickey22,

Thanks a lot for the positive feedback!

Currently no, it's not possible to calculate any space filling design diagnostic (discrepancy and MaxPro) for designs generated outside of JMP. It was also a question I had when doing my experiments for this blog: "Get MaxPro value for any Space-Filling design". As it does not seem possible to calculate these values easily for the moment, I created a Wish List item for this topic: "Compare Designs platform for Space-Filling designs".

For the moment, if I need to compare discrepancies between two space filling designs, I use the discrepancy function from SciPy, which doesn't give me the same values as in JMP but allows me to compare any designs. There are other nice metrics and visualizations available from SciPy, like the minimum spanning tree to visualize the minimum distance between points.

For MaxPro, it's a more complex topic and I haven't found a satisfying solution (yet).

Hope this answer may help you :)

ktbrickey22
Level II

Thanks @Victor_G for the speedy reply and tips! I liked your Wish List items and will look into the SciPy functions in the interim. Cheers!

charlie_whitman
Staff

@Victor_G, I had a quick question.  For the MaxPro approach you discuss above, you mention, "The goal is to maximize the MaxPro criterion to ensure good projections of design points in lower dimensional spaces.".  However, from the help, it says the idea is to minimize the MaxPro criterion.  Can you please clarify?

Victor_G
Super User

Hi @charlie_whitman,

Thanks a lot for your careful reading and comment, and for spotting this potential mistake. I am deeply sorry, I think I got confused when reading about the MaxPro criterion. This criterion is meant to evaluate the projection properties of the design, so the better the projection properties, the better the design, as this helps maximize the information and keeps the design efficiently spaced even in the presence of inactive or low-importance factors. It influences this spacing by maximizing the product of the distances between potential design points. However, in the formula of the MaxPro criterion in the JMP Help, this distance between design points is in the denominator, so indeed it seems this criterion is supposed to be minimized. This is also repeated in the Design Diagnostics section of the JMP Help: "Smaller values are better".

Even if it is written in the JMP Help, I'm still confused now, and even more so when looking at the option Maximize MaxPro criterion in the Bayesian Optimization platform. What is the benefit of maximizing the MaxPro criterion (having runs that may be clustered or "stacked" when looking at smaller subspaces of the experimental space)? The Help in this section mentions: "This option is a model-free exploration of the factor space that avoids replication of any of the factor settings in both the training data and the current batch. Use this option for any batch size when one or more of the models is not fitting well." If the objective is to explore further because the model is not fitting well, should this option try to minimize the MaxPro values of the new design points (to increase the distance between design points and explore the experimental space) instead of maximizing them, to ensure good coverage of newly added runs?

If someone from the JMP DoE team could help solve this confusion, I would greatly appreciate it!
I will correct parts 2 and 3 of this blog post series accordingly as soon as possible.

charlie_whitman
Staff

@Victor_G - Thanks for the quick response.  It wasn't me who saw the inconsistency but a customer.  Thanks again!

 

Victor_G
Super User

@charlie_whitman sorry again about this mistake. The corrections have been done in parts 2 and 3. Let me know if you or your customers notice some other mistakes (hopefully not).