I'm Phil Kay and I'm joined by Weronika, and we're going to talk about a fun experiment that we set up as a competition with the idea that it's sowing the seeds of love for design of experiments. And it's all about growing cress, which I'm sure many of you will have done when you were at school or at home. And there's a problem, I think, in how we educate our young scientists.
This is taken from the British Broadcasting Corporation's Bitesize for Key Stage 2, so for young scientists. The curriculum in the United Kingdom, at least, tells people about this fair test idea. And that is, when you are testing something, you need to make sure it is a fair test. To do this, everything should be the same except the thing you are testing. So we're only allowed to change one thing at a time.
And that's not ridiculous. It's not necessarily wrong, but it's not the whole truth either. It's not necessarily the best way when you come to experiment in commercial R&D and industry. Consider the consequences if we accept that we're only going to test one thing at a time. Let's imagine we're experimenting to understand what affects the height of garden cress, and we want to understand the effect of light conditions, sunlight or dark.
We're pretty sure that's going to have an effect, but we'd like to understand what it is. What's the effect of the growing medium, whether we grow on soil or on cotton wool? Again, we think it's probably going to have an effect. We'd like to experiment to understand or quantify the effect. The fair test way of doing this would be to take control conditions. So we grow some cress in sunlight and on soil. And then we do a fair test. We just change one thing.
So for fair test 1, we change the growing medium to cotton wool, and we see what the effect is. For fair test 2, to understand the effect of light conditions, we change to dark and we keep everything else the same. We keep our other factor the same. This would be fine, fair tests would be fine, except nature doesn't necessarily play by those rules. Nature doesn't play fair all the time. And what we should really be doing in this situation is a designed experiment. And in this case, we would test all possible combinations. We wouldn't just be changing one thing at a time. We'd make sure we tested all the possible combinations, and we changed all the factors according to a strategy.
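To make the contrast concrete, here is a minimal Python sketch that lists the runs for the fair-test approach and for the full factorial design, using the two factors from this example. The function and variable names are illustrative assumptions, not something shown in the talk.

```python
from itertools import product

# Factors and levels from the cress example.
# The first level of each factor is treated as the control condition.
FACTORS = {
    "light": ["sunlight", "dark"],
    "medium": ["soil", "cotton wool"],
}


def fair_test_runs(factors):
    """Control run plus one run per factor with only that factor changed."""
    control = {name: levels[0] for name, levels in factors.items()}
    runs = [control]
    for name, levels in factors.items():
        for level in levels[1:]:
            runs.append({**control, name: level})
    return runs


def full_factorial_runs(factors):
    """Every combination of every level of every factor."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


fair = fair_test_runs(FACTORS)
full = full_factorial_runs(FACTORS)
missing = [run for run in full if run not in fair]

print(len(fair), "fair-test runs;", len(full), "full factorial runs")
print("Combination the fair tests never try:", missing)
```

The fair tests never grow cress on cotton wool in the dark, so they cannot tell us whether the effect of light depends on the growing medium.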
And what this enables us to do is gain a richer understanding. So we can understand things like interactions between factors. And for this cress experiment, we can see the height at day five, after five days of growing. We're looking at the effect of light condition, sunlight or dark. And we can see that the effect of light, whether it's sunlight or dark, is dependent on the growing medium.
For soil, we are seeing a bigger difference between dark and sunlight than we are with cotton wool. This is an interaction. We can only understand these interactions when we use designed experiments, and these are often critical in commercial R&D. So what we need is some fun ways of introducing these ideas to young students, to students of any age.
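One way to put a number on an interaction like this is a difference of differences: the effect of light on soil minus the effect of light on cotton wool. The sketch below shows the arithmetic. The heights are invented purely for illustration and are not the measurements from the experiment.

```python
# Hypothetical mean heights in mm for each combination.
# These values are made up for illustration; they are NOT data from the talk.
heights = {
    ("soil", "sunlight"): 30.0,
    ("soil", "dark"): 70.0,
    ("cotton wool", "sunlight"): 25.0,
    ("cotton wool", "dark"): 45.0,
}


def light_effect(medium):
    """Change in height when going from sunlight to dark on one medium."""
    return heights[(medium, "dark")] - heights[(medium, "sunlight")]


effect_on_soil = light_effect("soil")
effect_on_cotton = light_effect("cotton wool")

# If the two effects were equal, there would be no interaction.
interaction = effect_on_soil - effect_on_cotton

print("Effect of dark on soil:", effect_on_soil)
print("Effect of dark on cotton wool:", effect_on_cotton)
print("Difference of differences:", interaction)
```

Some conventions divide this difference by two when reporting the interaction effect. Either way, a value far from zero says the effect of light depends on the growing medium.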
Now, let me digress to explain where we got the name of this talk from. It's from a song by a group called Tears for Fears. It's not a very new song, so if you're young, you may not have heard it. If you're a bit older, you'll probably know it because it was nominated for the best postmodern video at the MTV Music Awards 30-odd years ago, whatever best postmodern video means.
The first line is, "High time we made a stand and shook up the views of the common man." I don't know if I like that first line very much, but I think it's appropriate here. We'd like to shake up people's views about how we should do experiments, how we should change the factors in an experiment. I was a little bit concerned that this is a British band, and that people may not have heard of Tears for Fears or of this song. So I looked at the data, and actually I found that it was a worldwide hit and particularly big in Canada. It reached number one there in 1989.
The cress experiment: how did this start? Well, my colleague, Michael, in marketing here at JMP, wondered if we could make a fun experiment out of growing garden cress. That hadn't occurred to me. When I first heard this, I thought, that's a brilliant idea. What we wanted to do was create an experiment that's simple enough for anyone, for experimenters of all ages, young experimenters, old experimenters. We wanted it to be simple enough that you could do it at home. One of the challenges with coming up with good examples of design of experiments is that science is generally expensive.
Measuring the outputs of your scientific experiments often requires really expensive instruments. So we wanted something that was simple and cheap to do. And we wanted it to be an interesting way, just a fun way to introduce the key concepts of statistical design of experiments. We didn't want it to be difficult, and we didn't want you to have to do lots of very complex analysis. We wanted it to be a very immediate and fun way of introducing these ideas.
I did some experiments in the Kay Family Research Kitchen here with some assistants. I had my eight-year-old daughter and my 15-year-old daughter help me with this. My 12-year-old daughter was too busy watching DOC, I think. And it was very successful. They had a good time doing it, I think, and it started some interesting discussions.
We did this experiment, and we set it up so that we were growing some of them in soil, some of them in cotton, some of them in dark, some of them in light conditions. And my eight-year-old child said, "Well, Dad, it would have been easier if we just put all the soil ones in the dark and all the cotton ones in the light."
I didn't say anything, so I waited for my 15-year-old to respond. She said, "Well, but then we wouldn't know if it was the soil or if it was the dark that had the effect." This is a beautifully concise way of describing confounding. This was a very proud moment for me as a parent, that one of my children could explain this concept of confounding in a much more succinct way than I have ever managed to do.
And we got great data. The 15-year-old lost interest after we'd set it up, but my eight-year-old daughter carried on with the experiment, observing it over a number of days. We measured the height of the tallest plant in each pot, actually within each compartment of an egg box. She took all the measurements, and we got some really good quality data.
Let me just show you first of all, though, the actual experiment. We tested three factors: substrate, soil or cotton wool; the light conditions, dark or light; and cress type, plain or curled. We got two different types of cress seeds.
And this is a two-to-the-three full factorial for those DOE nerds out there, which is eight runs. And we've replicated the two-to-the-three-minus-one half fraction, which adds four more. So that gives us 12 runs, 12 pots, which works well because in the UK at least, egg boxes generally come in sixes. So we could use two egg boxes to do these 12 runs.
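For readers who want to see how those 12 runs come about, here is a small Python sketch that builds the full factorial in coded -1/+1 units and adds a replicated half fraction. Which half was replicated is not stated in the talk, so choosing the runs whose coded levels multiply to +1 is an assumption made here for illustration, as are all the names in the code.

```python
from itertools import product
from math import prod

# The three factors from the talk; the first level is coded -1, the second +1.
FACTORS = {
    "substrate": ("soil", "cotton wool"),
    "light": ("dark", "light"),
    "cress type": ("plain", "curled"),
}

# Full 2^3 factorial in coded units: 8 runs.
coded_full = list(product((-1, 1), repeat=len(FACTORS)))

# Half fraction 2^(3-1): 4 runs.
# ASSUMPTION: we keep the runs where the product of the coded levels is +1.
coded_half = [run for run in coded_full if prod(run) == 1]

# The replicated design is the full factorial plus the half fraction again.
coded_design = coded_full + coded_half


def decode(run):
    """Translate a coded run back into the named factor levels."""
    return {
        name: levels[0] if code == -1 else levels[1]
        for (name, levels), code in zip(FACTORS.items(), run)
    }


design = [decode(run) for run in coded_design]

for pot, run in enumerate(design, start=1):
    print(pot, run)
```

Each factor still appears at its two levels equally often in the replicated half, so the extra four pots stay balanced. In practice the pots would also be put in a random order.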
And as I said, the data was very good. We can do some simple analysis. This is one of the things I like about it, is that we can just look at the ones that were grown in the light and the ones in the dark and see how the height is different after seven days. And it's very compelling, there's a very big difference.
It wasn't really the difference that I was necessarily expecting, and it was an interesting surprise to all of the experimenters involved. We can do some simple analysis, just some simple visuals. Let's just plot the heights versus light conditions. And again, you can see the big effect there, a big effect of substrate, and very little effect of cress type. So we can introduce these simple analyses, and then we can obviously take it to a greater level of sophistication and build a full statistical model.
And that brings us to the profiler, which I think is just such a great way of understanding design of experiments and statistical models. It's a very powerful, compelling way to understand the effects of each factor and interactions between factors as well. If we look at day seven, we can see there's an interaction between light conditions and our substrate. And we can take it to an even greater level of sophistication because this is actual functional data.
If you're interested in Functional Data Explorer, well, this is a great example data set because we're collecting the height data as a function of time for each of the runs of our experiment. We can use Functional Data Explorer and Functional DOE to understand how the factors affect the shape of this growth curve.
We can see the rapid growth with soil versus cotton wool. We can see the rapid growth, increased rate of growth in the dark, and actually the fact that it's starting to die off towards the end of the experiment here. I was really delighted with how the experiment went. It was very simple to do, very compelling, really accurate results. It's so hard to find experiments that people can do at home where they can get an accurate, continuous quantitative response out that they can measure just with a plastic ruler in this case.
We went ahead and did this as a competition. I wrote a blog post about it, and we'll provide the link to that as well. We ran this last summer, summer of 2022. I'm going to introduce next our competition winner, Weronika. I've also done some visuals of Weronika's results in JMP Public.
We'll share the link to that as well so you can actually see Weronika's data, download the data yourself if you log into JMP Public and see the results for yourself. But now, Weronika is going to show you what she found in this cress experiment and the impressive results that she got that meant she was the competition winner. I think you're on mute, Weronika.
Thank you, Phil, for introducing me. I would like to share with you my experience in the competition, my experience regarding design of experiments and regarding the planting of the cress. The main aim of the challenge was to introduce design of experiments to researchers, to engineers, to students, to anyone. But also in that experiment, we had to check with design of experiments which factors have an influence on the height of the garden cress.
The factors were defined by the organizer, Phil. There were three factors. The first was the surface: we used cotton wool and garden soil. The second factor was light conditions, so we planted garden cress in sunlight and in the dark. And we also had to check what influence soaking has on the height.
What was my first impression? As Phil said, they wanted the experiment to be simple enough for everybody. But I was not so convinced at the beginning because, looking at my previous experiments with planting, they had not gone so well. So I didn't expect my garden cress to behave any differently. And I wasn't mistaken, I wasn't wrong. My first results were not good.
First of all, I put so many seeds in one spot that the pre-soaked samples became a shell, some kind of shell. They didn't germinate, so I didn't get any plants. Moreover, my egg box was broken by the water, which can be seen here. Also, some of the marker writing was destroyed, so I could not see the numbers of the spots. And the soil migrated from one hole to the adjacent one. It got mixed with the cotton, especially when I poured water on the soil. It was not good.
After my first failure, I drew some conclusions about why I failed. First of all, I decided to use plastic espresso cups instead of the paper cups because plastic holds up better against water. Second, use fewer seeds in each hole: don't put in as many as I can, but do it smartly. And at this moment, I also came up with the idea of adding a fourth factor to my experiment, the density of the seeds. So I wanted to check not only how surface, light condition, and soaking influenced the height, but also the density of seeds. I set two levels, low and high.
In low density, I used 20 seeds and spread them evenly in a cup. In high density, I took 40 seeds and tried to put them all in the middle of the cup, so it's [inaudible 00:16:09]. My design had four factors. Each factor had two levels. I used a full factorial design as the design type. That gave me 16 treatments, 2 to the power of 4.
I decided to replicate eight treatments in order to capture variability and be able to estimate the standard deviation and so on. In total, I had 24 test runs. The experiment was done in August when it was very warm, so it was nice weather for planting and being a gardener. Okay, those are my results. Here we can see the design table with all factors and the 24 test runs. Here I put the height after three, five, and seven days.
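The run count, and what the replicates buy, can be sketched in a few lines of Python. The talk does not say which eight treatments were replicated, so taking the half fraction whose coded levels multiply to +1 is an assumption for illustration. The seed and the variable names are also hypothetical.

```python
import random
from itertools import product
from math import prod

FACTOR_NAMES = ("surface", "light", "soak", "density")

# Full 2^4 factorial in coded -1/+1 units: 16 treatments.
treatments = list(product((-1, 1), repeat=len(FACTOR_NAMES)))

# ASSUMPTION: the 8 replicated treatments are one half fraction of the design.
replicated = [t for t in treatments if prod(t) == 1]

runs = treatments + replicated

# Randomize the run order; the fixed seed only makes this sketch reproducible.
rng = random.Random(2022)
rng.shuffle(runs)

# A full model has an intercept plus every main effect and interaction,
# which is 2^4 = 16 terms. The runs left over estimate the random variation.
n_model_terms = 2 ** len(FACTOR_NAMES)
error_df = len(runs) - n_model_terms

print(len(runs), "runs,", n_model_terms, "model terms,", error_df, "degrees of freedom for error")
```

Without the eight replicates, the 16 runs would be used up entirely by the 16 model terms, and there would be nothing left over to estimate the standard deviation.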
In that table, you can see the factor effect estimates after seven days. With the bold font, I marked the factors which I found to be statistically significant: surface, light, and density. Soak turned out not to be important, but only as a main effect on its own.
It turned out to be important in two-factor interactions. The interactions between soak and surface and between soak and light turned out to be significant, so we cannot assume that the soaking is not important. Also, two three-way interactions were significant, and the four-way interaction was not significant. Now I would like to present the pipeline, the steps which I used in my designed experiment. I think that it's quite a good approach which everyone can use in their experiments.
First of all, we have to generate the design. As a first step, we define which factors we want to check and at what levels. When we have set that, we have to choose the design type, because usually the choice of type depends on the factors: how many factors we have, whether it is only two or three or maybe more factors, and how many levels. We define the number of replicates we have to include, and then we can generate the design table, which in JMP is very quick and convenient. When we have a table, we can run the experiment, collect the data, and put it in the table.
When we have everything, we can go to the next step, estimation of the factor effects. We formulate the full regression model and estimate the factor effects. So we check which factors are important. Here you can see the main effects plots, two-way interaction plots, and three-way interaction plots after seven days. What is worth mentioning is the interactions. This is what Phil said, that the interactions are important. They happen in the real world.
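As a rough sketch of what estimating factor effects means for a balanced two-level factorial, each effect is the mean response at the high level minus the mean response at the low level, where the level of an interaction is the product of the coded levels of its factors. The Python below applies this to a 2^4 design. The response is generated from an invented formula so that the answer is known in advance; it is not the measured data, and JMP fits the equivalent regression model rather than running this code.

```python
from itertools import combinations, product
from math import prod

NAMES = ("surface", "light", "soak", "density")

# Full 2^4 factorial in coded -1/+1 units.
design = list(product((-1, 1), repeat=len(NAMES)))


def synthetic_height(run):
    """Invented response, NOT real data: a light effect, a surface effect,
    and a surface-by-soak interaction, with no noise."""
    surface, light, soak, density = run
    return 50 + 10 * light - 4 * surface + 3 * surface * soak


heights = [synthetic_height(run) for run in design]


def effect(columns):
    """Mean response where the contrast is +1 minus mean where it is -1."""
    signs = [prod(run[c] for c in columns) for run in design]
    high = [y for s, y in zip(signs, heights) if s == 1]
    low = [y for s, y in zip(signs, heights) if s == -1]
    return sum(high) / len(high) - sum(low) / len(low)


# Estimate every main effect and every interaction up to four-way.
effects = {}
for order in range(1, len(NAMES) + 1):
    for columns in combinations(range(len(NAMES)), order):
        label = "*".join(NAMES[c] for c in columns)
        effects[label] = effect(columns)

for label, value in sorted(effects.items(), key=lambda item: -abs(item[1])):
    print(f"{label:30s} {value:6.1f}")
```

Each effect comes out as twice the corresponding coefficient in the invented formula, because going from -1 to +1 is a change of two coded units. Note that soak has no main effect here but still matters through its interaction with surface, which is the same pattern described in the talk.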
And here is a good example. When we have cotton as the surface, it's better to use no soaking. If you use soil, it's better to pre-soak the samples. Suppose we checked only one factor at a time and, for example, took soil. With soil, we would find that pre-soaking is better. Then with cotton, we would also use pre-soaking, but in that case it's not true. This is the beauty of the interactions. And that's why we have to take them into consideration.
Then, the statistical tests: checking which effects are important. In JMP, we can also see the parameter estimates and the effect tests, and conclude which are significant. When we see which are not significant, we should redefine the model after dropping the non-significant effects and calculate the estimates one more time with linear regression.
But we cannot finish there. We also have to check the assumption that our model is statistically correct. So we have to, for example, check the residuals for normal distribution. It can be done with the normal probability plot of residuals in JMP. When we see that the residuals follow the straight line and stay within the border range, it means that the normal distribution is a reasonable assumption. We can also check it with a numerical test like the Shapiro-Wilk test, to check if the residuals follow a normal distribution, and then a test of the mean to check if the mean value is equal to zero.
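The two residual checks can be sketched with the Python standard library alone. Note what is swapped in here: instead of the Shapiro-Wilk test, this sketch computes the correlation behind the normal probability plot (how close the sorted residuals are to a straight line against normal quantiles), plus the t statistic for the hypothesis that the mean residual is zero. The residual values are invented for illustration. If SciPy is available, `scipy.stats.shapiro` and `scipy.stats.ttest_1samp` give the actual tests with p-values.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def normal_plot_correlation(residuals):
    """Correlation between sorted residuals and theoretical normal quantiles.
    Values close to 1 mean the points lie close to a straight line."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Blom plotting positions, a common choice for normal probability plots.
    quantiles = [
        NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)
    ]
    mx, my = mean(quantiles), mean(ordered)
    sxy = sum((x - mx) * (y - my) for x, y in zip(quantiles, ordered))
    sxx = sum((x - mx) ** 2 for x in quantiles)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / sqrt(sxx * syy)


def t_statistic_mean_zero(residuals):
    """One-sample t statistic for the hypothesis that the mean is zero."""
    return mean(residuals) / (stdev(residuals) / sqrt(len(residuals)))


# Made-up residuals in mm, for illustration only.
residuals = [-2.1, -1.3, -0.8, -0.4, -0.1, 0.2, 0.5, 0.9, 1.2, 1.9]

r = normal_plot_correlation(residuals)
t = t_statistic_mean_zero(residuals)

print(f"normal plot correlation: {r:.3f}")
print(f"t statistic for mean = 0: {t:.3f}")
```

A correlation well below 1 or a large absolute t statistic would be a signal to look again at the model before drawing conclusions.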
When we have finished that, we can draw the conclusions. The conclusion in my experiment was that the most important factor was light, and its effect was about eight times higher than the effect of the second most important. Plants cultivated in the dark grow higher than those in the sun. The other significant factor was surface, and I obtained the result that the garden soil is better. In garden soil, the plants grow higher. Also, the fourth factor which I added, sowing density, turned out to be important, and its significance increased over time.
After three days, sowing density was not significant, but after five days it was significant, and after seven days it was even more significant. So it increased with time. Also, in general, during the seven days, three different three-way interactions were significant, which suggests that all factors really interact together and we cannot interpret them separately. Everything, sun, soil, water, everything in nature is combined and has some inner dialogue.
Apart from that, I also checked different physical things, let's say. Cress cultivated in sunlight became green and developed big leaves. Whereas in the dark, the plants were very yellowish and fragile. When I touched them, they broke. They were taller, but they were, I would say, not healthy. And also, roots for plants cultivated in sunlight grow longer. Here you can see… it's very difficult to see because the roots are white and the cotton is white. But you can see somehow that here the roots are winding around, and here there is just plain cotton. With soil, it's easier to visualize because the roots are easier to discern against the soil. And we can see that in sunlight, we have longer roots, whereas in the dark, they are very short.
And to maximize the height after seven days, we should use soil, we should pre-soak the seeds, we should put them in the dark, and we should use high density. Those are the pictures of my results. We can see that throughout the experiment, the samples in the dark were yellow and thin all the time, whereas in the sunlight they were healthy, green, and thicker.
Now my conclusions regarding design of experiments, from my experience. Design of experiments is a great tool which can be used to optimize any process. Even something like cultivating garden cress can be fitted to design of experiments. It helps to incrementally gain knowledge about the process. For example, at the beginning I had no idea how the density influenced the height, but when I put in so many seeds, I gained the knowledge that it has an influence, and that I have to do something about it and also consider it.
We can also increase our confidence about our results, and that our results will indeed be statistically significant. So we will have no biases. We know that interactions are involved. Of course, some factors can be aliased with others, for example, in a fractional design.
But the advantage of design of experiments is that we are aware of which ones are confounded, and we can draw proper conclusions based on that. So if, for example, one pair of confounded factors appears to be significant, but we don't know exactly which one, we know what we have to focus on. And also, do not be afraid or discouraged if the first try is not successful. Treat it as a lesson and draw a conclusion about why it happened.
Don't give up, but sit and think: why did I fail, what can I do in another way, what can I improve? Then do it and try one more time. And design of experiments can bring fun with the proper attitude, because this experiment really, really was fun. And as I said, it was August, it was very sunny, so it was nice weather, a nice time to spend time on the [inaudible 00:27:04]. Thank you for your attention.
Yes, thanks very much and thanks, Weronika.