Solved: DOE, balanced custom design

johanna_younous · Jun 9, 2023 8:58 AM

Hello community,

I'm trying to build a balanced DOE with 2 factors: A at 3 levels and B at 31 levels. I want factor B to be replicated about 1.42 times on average (all 31 levels of factor B are tested at each level of factor A, and 13 of them are tested twice at each level of factor A).

I want the DOE platform to give me a design with the same number of runs for each level of factor A (and roughly balanced for factor B as well). So I ask for a custom number of runs of 132 (44 runs per level of factor A). Unfortunately, I cannot get a balanced design: I get 43/44/45 runs across the levels of factor A. This may not be clear; I hope the code helps.
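
The requested run count follows directly from the replication scheme described above. A quick check of the arithmetic, using only the numbers given in the post:

```latex
\begin{align*}
\text{runs per level of } A &= 31 + 13 = 44,\\
N &= 3 \times 44 = 132,\\
\text{mean replication of a level of } B \text{ within a level of } A &= 44/31 \approx 1.42.
\end{align*}
```

Under this scheme each level of B appears either 3 or 6 times overall (18 levels three times, 13 levels six times), and a balanced design needs exactly 44 runs at each level of A.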

lstA = {"L1","L2" , "L3"};


lstB= {"V1_0", "V1_1", "V1_2", "V1_3", "V1_4", "V1_5", "V1_6", "V1_7", "V1_8", "V1_9",
"V1_10", "V1_11", "V1_12", "V1_13", "V1_14", "V1_15", "V1_16", "V1_17", "V1_18",
"V1_19", "V1_20", "V1_21", "V1_22", "V1_23", "V1_24", "V1_25", "V1_26", "V1_27",
"V1_28", "V1_29", "V1_30"};

eval(parse("DOE1 = 
 DOE(
	Custom Design,
	{	Add Factor( Categorical, "||char(lstA )||", \!"A\!", 0 ),
	Add Factor( Categorical,"|| char(lstB)||", \!"B\!", 0 ),
	//Set Random Seed( seed1 ),
		Set Sample Size( "||char( 132)||" )}
	 
);")); 

DOE1<< Make Design; 
DT_DOE= DOE1<< Make Table();
DOE1<<close window();

Finaldt= DT_DOE << Summary(
	Group( :B ),
	N,
	Subgroup( :A ),
	Freq( "None" ),
	Weight( "None" ),
	statistics column name format( "column" ),
	Include marginal statistics // goal: a total of 44 runs for each level of A
);

This runs perfectly smoothly when factor A has more levels. I'm guessing 3 levels is a bit low, but it still seems possible to get a balanced design. There may be an option that I missed.

Any help is welcome!

Thanks a lot in advance.

(working on JMP 14.1.0 64bit)

Georg · Sep 13, 2022 06:21 PM

Hello @johanna_younous ,

To me it looks like the unbalanced design is caused by the optimization that JMP performs when it creates the design according to the optimality criterion.

If you restrict the number of starts to 1, then you get exactly 44 runs for each level of A.

See: Number of Starts (jmp.com)

You can compare the design you got with optimization to the one without. In your case there seems to be no real difference at first glance (power analysis etc.).

Maybe there is a better answer from other members or JMP staff as well.

Names Default To Here( 1 );

doe_obj = DOE(
	Custom Design,
	{Add Response( Maximize, "Y", ., ., . ), Add Factor( Categorical, {"L1", "L2", "L3"}, "X1", 0 ), Add Factor(
		Categorical,
		{"V1_0", "V1_1", "V1_2", "V1_3", "V1_4", "V1_5", "V1_6", "V1_7", "V1_8", "V1_9", "V1_10", "V1_11", "V1_12", "V1_13", "V1_14", "V1_15",
		"V1_16", "V1_17", "V1_18", "V1_19", "V1_20", "V1_21", "V1_22", "V1_23", "V1_24", "V1_25", "V1_26", "V1_27", "V1_28", "V1_29", "V1_30"},
		"X2",
		0
	), Set Random Seed( 1348397327 )
	, Number of Starts( 1 /* 1000 */ )
	, Add Term( {1, 0} ), Add Term( {1, 1} ), Add Term( {2, 1} ), Add Alias Term( {1, 1}, {2, 1} ), Set Sample Size( 132 ), Simulate Responses( 0 ),
	Save X Matrix( 0 ), Make Design}
);

doe_dt = doe_obj << make table();
doe_obj << close window();

doe_dt << Summary( Group( :X1 ), Freq( "None" ), Weight( "None" ) );
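
Since the original question also asks for rough balance on the second factor, the same kind of count can be taken for both factors and for each factor combination. The following is only a sketch: the small demo table is a hypothetical stand-in so the snippet runs on its own, and in practice `dt` would be the `doe_dt` produced by the script above.

```jsl
Names Default To Here( 1 );

// Hypothetical stand-in for the design table; replace with doe_dt in practice.
dt = New Table( "demo design",
	New Column( "X1", Character, Nominal,
		Set Values( {"L1", "L1", "L2", "L2", "L3", "L3"} ) ),
	New Column( "X2", Character, Nominal,
		Set Values( {"V1_0", "V1_1", "V1_0", "V1_1", "V1_0", "V1_1"} ) )
);

// Run counts per level of each factor and per factor combination.
Summarize( dt, a_lvl = By( :X1 ), a_cnt = Count() );
Summarize( dt, b_lvl = By( :X2 ), b_cnt = Count() );
Summarize( dt, cell = By( :X1, :X2 ), cell_cnt = Count() );

// The first factor is balanced when all of its level counts agree.
a_balanced = (Min( a_cnt ) == Max( a_cnt ));
// Spread of counts for the second factor: smaller means closer to balance.
b_spread = Max( b_cnt ) - Min( b_cnt );
// Largest number of replicates of any single factor combination.
max_cell = Max( cell_cnt );

Show( a_cnt, a_balanced, b_cnt, b_spread, max_cell );
```

For the design in this thread, `a_balanced` should be 1 with every entry of `a_cnt` equal to 44, and `max_cell` indicates whether any combination is run more than twice.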

Georg

johanna_younous · Sep 14, 2022 02:37 AM

Hello Georg,

Thanks a lot for this answer.

Unfortunately, I just tested this solution, and fixing the number of starts to 1 is not sufficient to always get 44 runs for each level of A. The result can change when the random seed changes.

So far I have fixed my problem "manually" with a loop that swaps levels of A for one of my levels of B until I get this balance. But this is done after the design is generated, and it may be neither the most elegant solution nor the best one statistically. I wish I could do that directly in the DOE module.

David_Burnham · Sep 14, 2022 03:21 AM

"Balance" is a feature of classical designs. Whilst we can argue that balance gives us nice statistical properties, the primary motivation was probably the ability to do the analysis in the days before access to computers (e.g. Yates's algorithm). Custom designs do not enforce balance, and in fact slight imbalance can enhance statistical performance (I think this is discussed in 'Optimal Design of Experiments: A Case Study Approach' by Jones & Goos). If you are really interested, I can probably dig out some examples where I have manually balanced the design and shown that statistical performance is degraded as a result.

-Dave

Georg · Sep 14, 2022 08:33 AM

Hello @johanna_younous, "not always" means that JMP does sometimes produce designs that fulfill your requirements. So why not let JMP keep trying until the design is balanced?

See the loop in the script below. It's better than manually changing factors. Running this a few hundred times, I found that the design was balanced in more than half of the tries, so the loop below should practically always find a solution for you.

Names Default To Here( 1 );

dt_sum = New Table( "DOE designs",
	New Column( "Balanced", "Character", formula( If( :L1 == :L2 == :L3, "y", "n" ) ) ),
	New Column( "L1", "Continuous" ),
	New Column( "L2", "Continuous" ),
	New Column( "L3", "Continuous" )
);

done = 0;
For( i = 1, (i <= 100) & !done, i++,
	doe_obj = DOE(
		Custom Design,
		{Add Response( Maximize, "Y", ., ., . ), Add Factor( Categorical, {"L1", "L2", "L3"}, "X1", 0 ), Add Factor(
			Categorical,
			{"V1_0", "V1_1", "V1_2", "V1_3", "V1_4", "V1_5", "V1_6", "V1_7", "V1_8", "V1_9", "V1_10", "V1_11", "V1_12", "V1_13", "V1_14", "V1_15",
			"V1_16", "V1_17", "V1_18", "V1_19", "V1_20", "V1_21", "V1_22", "V1_23", "V1_24", "V1_25", "V1_26", "V1_27", "V1_28", "V1_29", "V1_30"},
			"X2",
			0
		), Number of Starts( 1 ), Add Term( {1, 0} ), Add Term( {1, 1} ), Add Term( {2, 1} ), Add Alias Term( {1, 1}, {2, 1} ),
		Set Sample Size( 132 ), Simulate Responses( 0 ), Save X Matrix( 0 ), Make Design}
	);

	doe_dt = doe_obj << make table();
	doe_obj << close window();
	
	Summarize( doe_dt, by_grp = by( :X1 ), by_cnt = Count() );
	
	dt_sum << add row( 1 );
	dt_sum:L1[N Rows( dt_sum )] = by_cnt[1];
	dt_sum:L2[N Rows( dt_sum )] = by_cnt[2];
	dt_sum:L3[N Rows( dt_sum )] = by_cnt[3];
	
	// finish, when design is balanced, otherwise continue with new try
	If( by_cnt[1] == by_cnt[2] == by_cnt[3],
		done = 1,
		Close( doe_dt, NoSave )
	);
);

dt_sum << Distribution( Nominal Distribution( Column( :Balanced ) ) );

Georg

johanna_younous · Sep 14, 2022 08:43 AM

That's a good point!

I may use a "while" loop rather than the classical "for", but I like your solution, thanks!

Georg · Sep 14, 2022 09:06 AM

I chose the For loop to avoid ending up in an infinite loop, because there may be DOE definitions for which no balanced solution exists. The loop limit gives us a guaranteed endpoint.
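
A While loop can keep the same safeguard if it carries its own attempt counter. The sketch below is only an illustration of that loop control: `try_once` is a hypothetical stand-in that returns dummy counts, and in practice its body would be the DOE, Make Table, and Summarize steps from the script above, returning `by_cnt`.

```jsl
Names Default To Here( 1 );

// Hypothetical stand-in for one design attempt. Assumption: in real use,
// build the design here and return the run counts per level of X1.
try_once = Function( {},
	// Dummy counts for illustration only, balanced about half the time.
	If( Random Integer( 2 ) == 1,
		[44, 44, 44],
		[43, 44, 45]
	)
);

max_tries = 100; // hard cap, same role as i <= 100 in the For loop
tries = 0;
balanced = 0;

// Stop as soon as a balanced design appears or the cap is reached.
While( !balanced & (tries < max_tries),
	tries++;
	cnt = try_once();
	balanced = (Min( cnt ) == Max( cnt ));
);

If( balanced,
	Show( tries, cnt ),
	Write( "No balanced design found within the cap.\!N" )
);
```

The cap is what prevents an endless loop when no balanced design exists for a given DOE definition; without the `tries < max_tries` condition the While form would lose that guarantee.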

Georg

Dan_Obermiller · Sep 14, 2022 09:15 AM

I am a little late to this thread, and there are some interesting solutions. However, I wonder whether you could simply change the Optimality Criterion to Alias Optimal. That optimality criterion will typically balance things as much as possible. Here is a script where I took that approach, and it seemed to work.

DOE(
	Custom Design,
	{Add Response( Maximize, "Y", ., ., . ),
	Add Factor( Categorical, {"L1", "L2", "L3"}, "A", 0 ),
	Add Factor(
		Categorical,
		{"V1_0", "V1_1", "V1_2", "V1_3", "V1_4", "V1_5", "V1_6", "V1_7", "V1_8",
		"V1_9", "V1_10", "V1_11", "V1_12", "V1_13", "V1_14", "V1_15", "V1_16",
		"V1_17", "V1_18", "V1_19", "V1_20", "V1_21", "V1_22", "V1_23", "V1_24",
		"V1_25", "V1_26", "V1_27", "V1_28", "V1_29", "V1_30"},
		"B",
		0
	), Set Random Seed( 1127601240 ), Number of Starts( 3796 ), Add Term( {1, 0} ),
	Add Term( {1, 1} ), Add Term( {2, 1} ), Add Term( {1, 1}, {2, 1} ),
	Set Sample Size( 132 ), Optimality Criterion( Make Alias Optimal Design ),
	Simulate Responses( 0 ), Save X Matrix( 0 ), Make Design}
)

Dan Obermiller

johanna_younous · Sep 14, 2022 09:16 AM

Hello David,

I understand that my request for balance may not be the best choice statistically, but it's a technical constraint that I cannot escape. Moreover, I checked the performance (optimality, power, ...) and it seems to be really close. Thank you also for your reading recommendation, I'll have a look.
