Peter van Bruggen, Statistician, Unilever R&D
At Unilever, thousands of experiments are done before products are (re)launched on the market. The hundreds of researchers, engineers and technicians involved globally in these experiments are educated in different disciplines and have varying levels of statistical knowledge. Applying DoE is often not common practice, and statisticians are not always around for support. For non-statisticians, DoE poses some difficulties: 1) the overwhelming number of design types, 2) the level of detail required to fill in the necessary information and 3) the difficulty of understanding the results of the statistical analysis. An incorrect design could easily be chosen, or results could be interpreted erroneously, and that can make the difference between success and failure. Instead of training all scientists in statistics and DoE, Unilever deployed a tool to overcome these difficulties. This tool – called Plyos – was developed in-house and runs in JMP. Plyos simplifies the process of setting up a DoE and guides the user through it with extended help. Plyos also gives detailed explanations to help interpret the results of the statistical analysis. The presentation highlights the ways of working with the tool, as well as its advantages and limitations.
Please see a separate PDF for the presented slides
“Thinking is one of the most important weapons in dealing with problems”
Mandela said this many years ago in an interview referring to his imprisonment of 27 years.
However, these wise words could be valid for Design of Experiments as well. They have become part of our slogan for DoE-related courses.
Unilever is a company producing “fast-moving consumer goods” in a diverse range of foods and home and personal care products. It has many sites all around the world.
Simplified, the process of launching products on the market can be split into three stages: products are shaped in the Research & Development locations, then tested on a small scale in pilot plants, and finally produced for the market in our factories.
The cost of experimenting in factories is relatively high because of the expensive equipment, the often larger batch sizes and the fact that the resulting goods cannot be sold on the market.
In R&D we can do many more experiments, because the smaller scale results in much lower costs per sample. The pilot plant is literally in the middle.
The desire for DoE techniques is, of course, directly related to the number of experiments you do.
Unfortunately, both the general knowledge of statistics and DoE and the level of statistical support are much lower in pilot plants than in R&D, and lower still in factories.
With the introduction of DoE in our pilot plants we faced several challenges.
These challenges start with the fact that we have over 40 pilot plants around the world, with more than 500 colleagues involved. The level of education differs quite a lot, and with it the knowledge of statistics. Using statistical techniques is not always daily practice.
The people working in pilot plants are highly qualified and appreciated in their own area of expertise, but when they start a DoE tool they face many options and choices. Some of these choices are “easy” but others might be “very difficult”. Most of the issues they face concern the type of design and the analysis and interpretation of the results. A lack of knowledge in this area might stop them from using DoE.
There are several ways to overcome a possible lack of knowledge, of which the most obvious are listed here. However, both “training” and “increasing statistical support” are very expensive due to the large number of sites and colleagues involved. And forcing the use of a technique will not work, as we all know.
2 Introducing DoE in pilot plants
Unilever has opted for the route to simplify the process of applying DoE. This is not only an option to reduce costs, but might also be a more sustainable option for Unilever with so many colleagues and where people change jobs from time to time.
To do so we have set certain demands to simplify the process of setting up a design of experiments and its analysis.
We have tried to avoid statistical language as much as possible. We skipped the terms for the type of design and the analysis techniques, and focus instead on the information that needs to be harvested after the trial and the knowledge the user has before the trial.
Since avoiding statistical language is not always possible – especially when analysing data – we chose to provide help at every stage of the process.
3 Plyos tool
To make such a simplified process successful you inevitably need a simple, useful and easy-to-use software tool.
We have developed such a tool. The tool is called “Plyos” and is, of course, available throughout Unilever globally.
This tool runs in JMP and asks only for the information that is needed: what are the responses and factors? It asks for additional information only when that is strictly needed. For example, the question whether interactions need to be included in the design is raised only when interactions can be included in the design.
All steps are explained and possible decisions are communicated to the user. The tool checks the input and suggests improvements based on Unilever knowledge.
Finally, the tool provides one effective design, which is presented randomised, or blocked if circumstances require.
The tool also analyses the responses using a minimum number of statistical techniques, depending on the type of response and the number of factors, and it explains how every technique – individually and in combination – should be interpreted.
4 Plyos program flow
The given examples give a rough idea of the flow of the program, with some screenshots alongside. Only some highlights are presented.
At start-up Plyos gives options to change the default folder for storage of the data. You can read the help or open a more detailed manual. There are also options to test the application using example files.
More important, of course, are the options to design a new set of experiments or to open an existing design in order to add response data and analyse the results.
When the user starts to create a new design, the tool first asks for details to identify the trial at a later stage. Then the user needs to fill in the details of the responses, of which an example screen is given. The level of detail requested here is greater than you would expect when setting up a design in JMP. This extra information is specific to Unilever and is used in the background to optimise the final design. For example, Plyos will ask whether ordinal or nominal responses could be replaced by continuous responses. Continuous responses are easier to analyse and the results will give fewer problems in interpretation by non-statisticians.
A separate help screen is available to explain the different fields and their impact.
After closing this response screen, the next step in the process deals – in a similar way – with the factors that need to be investigated.
If the Plyos knowledge system decides that the user needs to make a choice – e.g. whether interactions need to be included or not – the system asks the question and supplies help to clarify this question.
The next step, “Decisions & consequences”, refers to a decision forced by Plyos.
If Plyos has forced a decision – for example when factor levels need to be limited – a separate window with explanations is shown with – if applicable – options to select.
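The distinction between questions put to the user and decisions forced by the tool can be illustrated with a small sketch. Plyos itself is scripted in JMP; the Python below is only an illustration, and the rules in it are assumptions: the level limit `MAX_LEVELS`, the condition "at least two factors" for raising the interaction question, and the names `Factor`, `Dialogue` and `plan_dialogue` are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical limit on levels per factor; the real Plyos rules are not given here.
MAX_LEVELS = 3


@dataclass
class Factor:
    name: str
    levels: list


@dataclass
class Dialogue:
    questions: list = field(default_factory=list)  # choices left to the user
    forced: list = field(default_factory=list)     # decisions taken by the tool


def plan_dialogue(factors):
    """Return the questions to ask and the decisions to force for these factors."""
    dialogue = Dialogue()
    # Assumed rule: an interaction needs at least two factors,
    # so the question is raised only in that case.
    if len(factors) >= 2:
        dialogue.questions.append("Include interactions between factors?")
    # Assumed rule: too many levels is not a question but a forced decision,
    # reported to the user with an explanation.
    for factor in factors:
        if len(factor.levels) > MAX_LEVELS:
            dialogue.forced.append(
                f"Factor '{factor.name}' limited to {MAX_LEVELS} levels "
                f"(was {len(factor.levels)})."
            )
    return dialogue


if __name__ == "__main__":
    plan = plan_dialogue([
        Factor("Temperature", [60, 70, 80, 90]),
        Factor("Mixer", ["A", "B"]),
    ])
    print(plan.questions)
    print(plan.forced)
```

The point of the sketch is that the user only sees a question when the answer can actually change the design; everything else is decided and explained.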
When everything has been filled in and answered, a randomised and ready-to-use design is provided.
This JMP data table – again supported by help – can be used directly as a recipe for the trials in the pilot plant. Response results can be filled in and, of course, the table can be saved for later processing.
The table corresponds with the information provided and is more detailed than the data table produced by the standard JMP DoE process. For example, JMP DoE makes all categorical factors nominal, while Plyos makes them ordinal or nominal depending on the input.
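What "randomised" or "blocked" means for the run order can be shown with a minimal sketch. This is not the Plyos code, which runs in JMP; it is a Python illustration in which the factor names, the `Day` block column and the functions `full_factorial` and `randomise` are assumptions made for the example.

```python
import itertools
import random


def full_factorial(factors):
    """All combinations of factor levels, as a list of dicts (one per run)."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in itertools.product(*(factors[n] for n in names))]


def randomise(runs, seed=None, block_on=None):
    """Shuffle the run order; if block_on is given, shuffle within each block."""
    rng = random.Random(seed)
    if block_on is None:
        shuffled = list(runs)
        rng.shuffle(shuffled)
        return shuffled
    # Group the runs by the value in the block column.
    blocks = {}
    for run in runs:
        blocks.setdefault(run[block_on], []).append(run)
    ordered = []
    for level in blocks:          # blocks stay in their original order
        block = blocks[level]
        rng.shuffle(block)        # only the order inside a block is random
        ordered.extend(block)
    return ordered


if __name__ == "__main__":
    runs = full_factorial({
        "Day": [1, 2],            # hypothetical block column
        "Temperature": [60, 80],
        "Speed": ["low", "high"],
    })
    for run in randomise(runs, seed=2012, block_on="Day"):
        print(run)
```

Fixing the seed makes the run order reproducible, which is convenient when the same table has to be regenerated later.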
A good design of experiments is useless without proper analysis of the response data.
In the given example you see on the left hand side a JMP report window presenting the predefined statistical techniques which are chosen for the model under investigation and on the right hand side you see the explanation of the same techniques.
The explanations will only show those techniques that are actually used in the analysis and vice versa.
Together with general descriptions of the model and the most important assumptions, the detailed explanations of the analysis techniques – individually and in combination – should be enough in most situations to draw solid and sound conclusions from the data, even when the user has only limited knowledge of statistics.
5 Design challenges
During the design of the tool we faced several challenges. Some were rather big in impact, others were small.
The first concerns which types of design would be needed for specific combinations of factors, and whether we would be able to create a flow in which the same design is chosen when you start the same process all over again.
Fortunately we had considerable expert knowledge about the target group and their specific needs. The different situations were defined by the number of factors, the type of the factors, the number of levels per (categorical) factor, whether or not to include interactions and/or non-linearity, and some other constraints.
All information is stored in a large data table, which also includes information on how the data should be analysed.
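The idea of a table that maps each situation to one design and one analysis can be sketched as a simple lookup. The rows below are invented for illustration and do not reproduce the actual Plyos table; `DESIGN_TABLE` and `choose_design` are hypothetical names, and the Python is only a stand-in for the JMP data table.

```python
# Hypothetical rows; the actual Plyos table and its design choices are not shown here.
# Key: (number of factors, all factors continuous, interactions, non-linearity)
DESIGN_TABLE = {
    (2, True, False, False): ("full factorial, 2 levels", "multiple linear regression"),
    (2, True, True, False): ("full factorial, 2 levels", "regression with interaction term"),
    (3, True, True, True): ("central composite", "response surface model"),
    (3, False, False, False): ("full factorial", "analysis of variance"),
}


def choose_design(n_factors, all_continuous, interactions, non_linearity):
    """Look up the design and analysis for one situation; None if not covered."""
    key = (n_factors, all_continuous, interactions, non_linearity)
    return DESIGN_TABLE.get(key)


if __name__ == "__main__":
    print(choose_design(3, True, True, True))
    print(choose_design(9, False, True, True))  # not in the table
```

Because the choice is a pure lookup, the same input always gives the same design. A situation that is missing from the table returns `None`, which in this sketch stands for a case that needs a statistician.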
At an early stage of development it was decided to use mainly classical designs. These are easy to understand and can be repeated. However, these designs are not easy to code, especially when you want to prevent the user from ending up in JMP’s standard DoE dialogue screens.
Therefore most designs are hard-coded in the scripts via loops. Others are stored in data tables and used on demand.
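As an illustration of a classical design coded with loops, the sketch below builds a half-fraction of a two-level design for three factors, with the third factor set by the generator C = A × B. It is written in Python rather than in JMP's scripting language, and the choice of this particular design and the name `half_fraction_2_3` are assumptions for the example, not a description of the Plyos scripts.

```python
def half_fraction_2_3():
    """2^(3-1) fractional factorial: loop over A and B, derive C = A * B."""
    runs = []
    for a in (-1, 1):          # low and high level of factor A
        for b in (-1, 1):      # low and high level of factor B
            runs.append({"A": a, "B": b, "C": a * b})
    return runs


if __name__ == "__main__":
    for run in half_fraction_2_3():
        print(run)
```

The design has four runs instead of the eight of the full factorial, and each factor is tested twice at each level. The price is that the main effect of C cannot be separated from the A × B interaction, which is exactly the kind of consequence a tool has to explain to a non-statistician.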
Many types of help were needed, called at different stages of the design creation process. To make maintenance easy, the text should not be buried too deep in the scripts.
We put text in different sources, including scripts. However, the majority is stored in data tables and journals. In particular, the way we found to store lots of text in a journal and to present parts of the text in a flexible way gave us great satisfaction.
We also encountered several smaller issues:
Certain created column names turned out to be language dependent (e.g. “Residuals”). We chose the simple solution: the tool now tests for the language and asks the user to change it if needed.
Some “Platform Preferences” change the way data is presented during analysis. Some of these cannot be overridden by hard coding. We did not find ways to overcome some of these issues, so we placed warnings in our text.
We started development in 2011 in JMP 9. Later JMP 11 was introduced, causing problems for certain routines. We built version-dependent subroutines to overcome these. However, JMP 9 is not available anymore.
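The idea of version-dependent subroutines can be sketched as a small dispatch table. This is a Python illustration only: the routine names `fit_report_v9` and `fit_report_v11`, the table `ROUTINES` and the function `select_routine` are hypothetical, and how the host version is obtained inside JMP is not shown.

```python
def fit_report_v9(data):
    """Placeholder for the routine written for the older version."""
    return f"v9 routine on {len(data)} rows"


def fit_report_v11(data):
    """Placeholder for the routine adapted to the newer version."""
    return f"v11 routine on {len(data)} rows"


# Lowest major version for which each routine is valid.
ROUTINES = {9: fit_report_v9, 11: fit_report_v11}


def select_routine(version):
    """Pick the newest routine whose minimum version does not exceed `version`."""
    major = int(str(version).split(".")[0])
    eligible = [v for v in ROUTINES if v <= major]
    if not eligible:
        raise ValueError(f"No routine available for version {version}")
    return ROUTINES[max(eligible)]


if __name__ == "__main__":
    routine = select_routine("11.1")
    print(routine([1, 2, 3]))
```

Keeping the version check in one place means the rest of the script can call `select_routine` without knowing which version it runs on.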
6 Tool availability and roll out
The tool is rolled out over the entire Unilever world. It is delivered via a shared drive.
Although the use of the tool is self-explanatory by means of the help and the included manual, the tool is also part of a training program.
During this training the focus is not on the background of design of experiments or on complicated statistics, but much more on practising with the tool and on the preparation necessary to set up a design. Our motto during the training is “Think hard before you start”, in fact repeating the words of Mr Mandela.
It is our experience that this thinking process is even more important than the actual process of setting up a DoE.
7 Tool usage
Plyos has been available since 2012.
We see the use of the tool spread over the entire Unilever business, with coverage of over 20 countries and over 500 users. About 100 users have created several of the 600 relevant cases that we have seen over the past 2 to 3 years. Plyos proved applicable in about 95% of these cases, resulting in a valid and useful design. For about 5% of these cases the input of a statistician was needed.
One point is however extremely important: All aspects need to be driven and supported by senior management. Without such support there is a high risk that the use of the tool – and thus the application of DoE – will decline. We are very lucky to have this support.
It remains difficult to draw a solid conclusion – especially as a designer of the tool.
However, from the statistics we just saw and the comments and questions we received and still receive from the users, I think we have been successful in simplifying the process of setting up designs of experiments.
Of course we need to do everything we can to keep the user focussed on creating a design of experiments whenever it is demanded by the work in the pilot plant. We do so partly through management, but also by delivering dedicated help from statisticians on demand.
Also the tool itself needs to be updated regularly to keep it in line with the growing demands of the user.
It also turned out that the focus on “Think Hard Before You Start” is beneficial. Users are much more aware of what they would like to investigate and see Plyos as a useful tool to get that done without the help of a statistician.
9 Team & Closure
Finally I would like to stress that this development would not have been possible without a great team.
Unfortunately, Winfried Theis left Unilever long before the tool was scripted.
The two other guys – Frits Quadt from Quadt Consultancy for technical support and Hans de Roos from Octoplus Information Solutions for scripting – are still in our team.
Of course many others have been involved in the background making it possible that Plyos could be rolled out globally. Their help was and is greatly appreciated.
Peter van Bruggen
Unilever Research & Development, Vlaardingen, The Netherlands