Solved: Re: What might be the most appropriate approach to modeling an ordinal response v...

Julianveda · Aug 13, 2024 02:28 PM

Hi Community,

I have a DoE with several response variables. One of them is ordinal, with levels -3, -2, -1, 0, 1, 2 and 3.

My question is: what is the most suitable way to model this response variable? Should I model it as an ordinal variable using logistic regression? Or could I treat it as continuous and use ordinary least squares instead? Are there any caveats with either approach?

In addition, should I somehow tell JMP that this is a finite scale and that no result is expected below -3 or above 3?

Thank you for any insight on this subject.

Julian

Victor_G · Aug 14, 2024 2:55 AM

Hi @Julianveda,

It looks like your ordinal response might be a sensory rating, very similar to a Likert scale (see the Wikipedia article "Likert scale").

It is difficult to give concrete help on your use case, as the answer depends on several factors. Some are linked to measurement system capability (agreement between operators, sensitivity, ...), and some to the characteristics of the data sample and the choice of algorithm. Since @statman already mentioned measurement system capability, I will focus on the data sample and the choice of algorithm.

About the characteristics of the data sample:

Seven points is indeed a commonly recommended number of levels for this type of scale, but its effectiveness depends on how balanced and representative the data are across the levels, as well as on data quantity and quality (signal/noise). If there is a strong imbalance, or if some ratings are never or rarely used by operators, it can be very difficult to analyze the data "as is". In that case it may help to pre-process the data, for example by binning/grouping some ratings together. Reducing the number of classes can help logistic models or other algorithms "figure out" the rules that best separate the classes. But don't over-simplify your problem: you risk ending up either with a "useless" model that yields only trivial conclusions, or with very noisy classes that each mix many different "realities"/levels.
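To illustrate the binning idea, here is a minimal Python sketch (outside JMP). The ratings and the grouping are hypothetical: it assumes the extreme levels are rarely used and merges -3 with -2 and 3 with 2, which keeps five ordered classes. The right grouping for real data depends on the observed counts.

```python
from collections import Counter

def bin_ratings(ratings, mapping):
    """Collapse ordinal ratings using an explicit old-level -> new-level mapping."""
    return [mapping[r] for r in ratings]

# Hypothetical ratings on the 7-point scale; the extremes are rarely used.
ratings = [-3, -1, 0, 0, 1, 1, 2, 0, -1, 1, 3, 0, -2, 1, 0]

# Assumed grouping: merge -3 into -2 and 3 into 2 (order is preserved).
mapping = {-3: -2, -2: -2, -1: -1, 0: 0, 1: 1, 2: 2, 3: 2}

before = Counter(ratings)
after = Counter(bin_ratings(ratings, mapping))

# Compare class counts before and after binning.
for level in sorted(before):
    print("before", level, before[level])
for level in sorted(after):
    print("after ", level, after[level])
```

Checking the counts before and after is a quick way to see whether the grouping actually removed the sparse classes without merging levels that were well populated.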

Also keep in mind that treating your ratings as an ordinal response means estimating more coefficients than with a continuous numerical response: you need to estimate n-1 intercepts (n being the number of classes) in addition to the parameter estimates for your factors (main effects, interactions, ... depending on your assumed model). With a DoE, you may not have enough runs to estimate all these coefficients, so you may have to simplify your classes (by grouping some), use a continuous response, and/or use a different type of model/algorithm (next section).
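The parameter count can be sketched with a few lines of Python. This is only a rough illustration under stated assumptions: continuous or two-level factors, a model with main effects and two-factor interactions, and an ordinal model with a single set of slopes shared across the n-1 intercepts (a cumulative logit form). The function names, the factor count, and the run count are hypothetical.

```python
from math import comb

def n_model_terms(n_factors, max_order=2):
    """Number of main effects and interactions up to max_order."""
    return sum(comb(n_factors, k) for k in range(1, max_order + 1))

def n_params_ordinal(n_levels, n_factors, max_order=2):
    """(n_levels - 1) intercepts plus one slope per model term (assumed shared slopes)."""
    return (n_levels - 1) + n_model_terms(n_factors, max_order)

def n_params_continuous(n_factors, max_order=2):
    """One intercept plus one coefficient per model term."""
    return 1 + n_model_terms(n_factors, max_order)

n_runs = 16  # hypothetical design size
for n_levels in (7, 5):
    p = n_params_ordinal(n_levels, n_factors=4)
    print(f"ordinal, {n_levels} levels: {p} parameters for {n_runs} runs")
print(f"continuous: {n_params_continuous(4)} coefficients for {n_runs} runs")
```

With four factors, the 7-level ordinal model already needs as many parameters as this hypothetical 16-run design has runs, which is why grouping classes or switching to a continuous response can become necessary.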

The choice between an ordinal and a continuous response may also depend on the number of classes and on whether the spacing between classes is uniform, from both a numerical and a sensory point of view. Even if neighboring classes differ by exactly one (uniformly spaced numerically), does the operator really "feel" this linearity and uniformity when rating? Or are there "gaps" or differences in perception (for example little or no perceived difference between the intermediate classes -1, 0 and 1, or between the extreme classes -3 and -2 or 2 and 3)? Does the evaluation involve a benchmark (rated 0), so that each experiment is rated relative to a reference sample?

About the choice of the algorithm:

Depending on how linear the separation between classes/ratings is (and on the assumption of linearity between response and predictors), logistic regression may run into problems. It can be worth trying other types of models/algorithms that can separate the classes in a non-linear way. Some Machine Learning algorithms do this effectively with a reduced risk of overfitting, for example Bootstrap Forest and Support Vector Machines. These algorithms can also be used if you decide to treat your response as continuous.

If your classes are linearly separable and your inputs are continuous, you could also try Discriminant Analysis and see how the different methods agree or differ in their results.

In any case, I would recommend plotting the data and visualizing the trends before doing any analysis. Can you spot trends or patterns? Does the separation between classes look easy or hard? To explore your data, visualize or analyze the response against each predictor, one at a time.

I hope this complementary answer makes sense and helps you,
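If you want a quick numerical companion to the plots, a simple cross-tabulation of the rating against one factor at a time can show whether the classes separate. The sketch below is plain Python with hypothetical data; the factor name "temperature" and its levels are made up for illustration.

```python
from collections import defaultdict

def crosstab(factor_values, ratings):
    """Count how often each rating occurs at each factor level."""
    table = defaultdict(lambda: defaultdict(int))
    for f, r in zip(factor_values, ratings):
        table[f][r] += 1
    return table

# Hypothetical two-level factor and ratings from eight runs.
temperature = ["low", "low", "low", "low", "high", "high", "high", "high"]
rating = [-2, -1, -1, 0, 0, 1, 2, 2]

table = crosstab(temperature, rating)
levels = sorted(set(rating))

# Print one row per factor level, one column per rating level.
print("level".ljust(8) + "".join(str(l).rjust(4) for l in levels))
for f in sorted(table):
    print(f.ljust(8) + "".join(str(table[f].get(l, 0)).rjust(4) for l in levels))
```

If the counts for the two factor levels sit in clearly different columns, as in this made-up example, the factor is likely to matter whichever modeling approach is used.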

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

statman · Aug 13, 2024 03:49 PM

Not enough context to give specific advice, but here are my thoughts/questions. What is the ordinal scale? Is it based on human categorization / sensory perception? Have you evaluated the measurement variation in this response?

I would look to see if the ordinal response correlates with the other response variables (make sure the ordinal response is set to continuous). This will be less than perfect, but look at the scatter plots. Then I would try both methods of analysis. Do they agree?
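As a rough sketch of the correlation check, here is a plain-Python Pearson correlation between the rating (treated as continuous) and another response. The data and the name "other_response" are hypothetical, and the scatter plot remains the main tool; the coefficient is only a summary.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: ordinal rating coded as numbers, plus a continuous response.
rating = [-3, -2, -1, 0, 1, 2, 3]
other_response = [1.0, 2.1, 2.9, 4.2, 5.0, 6.1, 6.8]

r = pearson(rating, other_response)
print(f"correlation between rating and other response: {r:.3f}")
```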

"All models are wrong, some are useful" G.E.P. Box

Julianveda · Aug 28, 2024 08:00 AM

Thank you @statman for your help

Julianveda · Aug 28, 2024 08:55 AM

Thanks a lot @Victor_G for the numerous and detailed tips.

I have tested binning, but have not yet tested most of the other alternatives. Unfortunately I don't have the Pro version, so for the ML alternatives I either have to test them elsewhere or stick to the less advanced options.
