Level I

Regression with Categorical (Character) Variables

Hi all,

I recently finished a test that measured the pressure inside the combustion chamber just after firing/igniting a potato gun. I ran this test for three very different combustion chamber geometries. I gave each combustion chamber geometry a name: Alpha, Bravo, and Charlie. I elected not to use a nominal or ordinal characterization for this factor.

I understand that in all types of modeling, it's best to flush out categorical variables wherever I can. Hypothetically speaking though, if I used the names of each geometry as a categorical factor (character type) in my regression development (Fit Model), how does JMP treat this type of factor? Are there any assumptions it makes? Is there a particular report I can look at that tells me the "goodness" or "wrongness" of using this type of categorical variable?

I am fairly new to model development and advanced regression analysis and appreciate any help or guidance.

Thank You!

-Matt

3 REPLIES 3
Level VI

Re: Regression with Categorical (Character) Variables

Matt,

You can choose which data type to assign to the column containing the 3 different geometries. It would, of course, be incorrect to treat these as continuous; most likely the nominal type is correct (unless you can actually assign the geometries some order). You will still be able to create a model with the nominal x, but those models are indeed limited.

I can't answer all of your questions, but do want to provide some high level thoughts:

1. Regarding your 3 categories of different geometries, can you quantify those better?  Perhaps volume, angle, some other measure (or even a set of measures) of the geometry?  This might provide for better understanding of which aspects of the geometries impact pressure.

2. I'm curious as to how and where the pressure measurements inside the chamber are taken. Is the internal pressure at equilibrium? Can the pressure vary within the chamber? (I'm particularly curious because the geometries may affect this.) Have you evaluated the measurement system capability?

I also suggest you start by looking at the data graphically, then run different models and even change data types to see the effect the data type has on the analysis outputs. It's a good way to learn the software.

Level I

Re: Regression with Categorical (Character) Variables

I was originally thinking nominal could work and I would reference Alpha, Bravo, and Charlie, as 1, 2, and 3, respectively, but was advised not to do this by a statistics SME from work.
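To make the concern with the 1, 2, 3 labels concrete, here is a minimal sketch in Python (not JMP). The pressure values are made up purely for illustration, and the variable names are hypothetical. It compares treating the labels as a continuous number against treating them as nominal categories:

```python
import numpy as np

# Hypothetical peak pressures (psi) per geometry; invented numbers, not test data.
pressure = {
    "Alpha": [30.0, 32.0, 31.0],
    "Bravo": [50.0, 49.0, 51.0],
    "Charlie": [40.0, 41.0, 39.0],
}
levels = list(pressure)
y = np.array([v for name in levels for v in pressure[name]])
labels = [name for name in levels for _ in pressure[name]]

# Continuous treatment: Alpha=1, Bravo=2, Charlie=3, model y = b0 + b1 * code.
# This forces the three fitted values onto one straight line with equal spacing.
code = np.array([levels.index(n) + 1 for n in labels], dtype=float)
X_num = np.column_stack([np.ones_like(code), code])
beta_num, *_ = np.linalg.lstsq(X_num, y, rcond=None)
fitted_num = {name: beta_num[0] + beta_num[1] * (i + 1)
              for i, name in enumerate(levels)}

# Nominal treatment: one indicator column per level (no intercept),
# so each coefficient is simply that level's mean response.
X_nom = np.array([[1.0 if n == name else 0.0 for name in levels]
                  for n in labels])
beta_nom, *_ = np.linalg.lstsq(X_nom, y, rcond=None)
fitted_nom = dict(zip(levels, beta_nom))

for name in levels:
    print(name, round(fitted_num[name], 2), round(fitted_nom[name], 2))
```

With these invented numbers, the continuous fit cannot reproduce Bravo's high mean because it assumes Bravo sits exactly halfway between Alpha and Charlie. The nominal fit makes no such ordering or spacing assumption.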

With regard to quantifying the three categories, I'm currently working to represent each geometry by its ellipticity (fitting an ellipsoid through all the edge points and calculating ratios of the major and minor axes). I can't use volume because I'm more interested in how the shape affects the combustion pressure wave, and ultimately the pressure magnitude I'm measuring. I could have the same volume in a drastically different chamber shape, so volume alone won't capture the effect of geometry. I'm also already using amount of fuel (measured in percent of chamber volume) as a factor, so using volume as an additional standalone factor would be double-counting.

To answer your last question, 6 pressure measurements were taken from the top and sides of each chamber, and I made sure to measure near the barrel entrance as well as near the igniter. Pressure between the chamber and ambient was fairly equal (+/- 0.5 psi).
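As a rough sketch of the axis-ratio idea, the Python snippet below uses a principal-component proxy rather than a true ellipsoid fit: it takes the square roots of the eigenvalues of the point covariance as stand-ins for the relative axis lengths. That substitution is an assumption on my part, and the function name `axis_ratios` and the six sample points are hypothetical.

```python
import numpy as np

def axis_ratios(points):
    """Return (major/intermediate, major/minor) spread ratios of a 3D point cloud.

    Uses principal-component spreads as a proxy for ellipsoid axis lengths;
    this is not a least-squares ellipsoid fit.
    """
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)      # ascending order
    spreads = np.sqrt(eigvals[::-1])       # largest spread first
    return spreads[0] / spreads[1], spreads[0] / spreads[2]

# Hypothetical edge points: the six axis tips of an ellipsoid with
# semi-axes 3, 2, and 1 (arbitrary units).
edge_points = [(3, 0, 0), (-3, 0, 0), (0, 2, 0), (0, -2, 0), (0, 0, 1), (0, 0, -1)]
major_to_mid, major_to_minor = axis_ratios(edge_points)
print(major_to_mid, major_to_minor)
```

For real edge points that are unevenly sampled, this proxy can be biased toward densely sampled regions, so a proper ellipsoid fit may still be preferable.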

All in all, I know that using Alpha, Bravo, and Charlie as categorical factors of the character type in a least squares fit model is NOT the right approach. Yet JMP allows me to do it. I just want to know what it's doing in the background so I can point to it and say in my report, "This is what it does, and it's not good."

Level VI

Re: Regression with Categorical (Character) Variables

@mbdahl There is nothing inherently wrong with articulating the three geometries as categorical variables in a least squares regression model, as long as you are willing to use the hypotheses that are specific to that type of model and categorical factors. See the Statistical Details listed here for model parameterization and other details:

https://www.jmp.com/support/help/en/15.1/#page/jmp/nominal-factors.shtml#ww65535

Here's an example in the JMP documentation that seems to fit your problem. It's called One Way Analysis of Variance, since that's the way the model was parameterized, but it's still essentially a linear model with respect to the parameters and the continuous response.

https://www.jmp.com/support/help/en/15.1/#page/jmp/one-way-analysis-of-variance.shtml#ww229925
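As a hedged illustration of the kind of parameterization the nominal-factors page describes, the Python sketch below (not JMP) builds effect-coded columns for a three-level factor. Each of the first two levels gets its own column, and the last level is coded -1 in every column. The response values are invented, and the helper name `effect_code` is hypothetical.

```python
import numpy as np

levels = ["Alpha", "Bravo", "Charlie"]
labels = ["Alpha"] * 3 + ["Bravo"] * 3 + ["Charlie"] * 3
# Invented pressures (psi), for illustration only.
y = np.array([30.0, 32.0, 31.0, 50.0, 49.0, 51.0, 40.0, 41.0, 39.0])

def effect_code(label, levels):
    """Effect (sum-to-zero) coding: k-1 columns, last level coded -1 in each."""
    if label == levels[-1]:
        return [-1.0] * (len(levels) - 1)
    return [1.0 if label == lv else 0.0 for lv in levels[:-1]]

# Design matrix: intercept column plus the two effect-coded columns.
X = np.array([[1.0] + effect_code(lab, levels) for lab in labels])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, a_alpha, a_bravo = beta

# The last level's effect is implied by the sum-to-zero constraint.
a_charlie = -(a_alpha + a_bravo)

print("intercept (mean of level means):", round(intercept, 3))
print("Alpha:", round(a_alpha, 3), "Bravo:", round(a_bravo, 3),
      "Charlie:", round(a_charlie, 3))
```

Under this coding the intercept is the average of the three level means, and each coefficient is how far that geometry's mean sits from the average. The fitted values are just the per-geometry means, which is why the model is equivalent to a one-way analysis of variance.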
