
Best way to analyze subjective response data

Hello -

I am analyzing an experiment I ran in which subjects (n = 12) scaled 4 responses while watching movies composed of combinations of 3 categorical factors:

sound stimuli (5-level)
visual stimuli (4-level)
position (4-level)

I built the experiment in the DOE platform such that I could analyze main effects and second-order interactions, which yielded 44 "scenes". Subjects saw each scene 3 times, and the mean of each of the 4 responses (let's call them R1, R2, R3, R4 - all continuous) was taken for analysis.

Currently I am using the fit model platform where I put in:

Sound Stimuli
Visual Stimuli
Position
SS x VS
SS x P
VS x P

using the factorial to degree 2 macro.
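
(For readers outside JMP: the same fixed-effects specification can be written as a model formula. Below is a minimal sketch in Python/statsmodels rather than JMP, assuming a long-format table with hypothetical column names Sound, Visual, Position, and R1.)

```python
# Minimal sketch (Python/statsmodels, not JMP) of the fixed-effects model:
# three categorical main effects plus all two-way interactions
# ("factorial to degree 2"). Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("responses.csv")  # one row per subject x scene, R1..R4 averaged

fixed = smf.ols(
    "R1 ~ (C(Sound) + C(Visual) + C(Position)) ** 2",
    data=df,
).fit()
print(fixed.summary())
```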

The resulting fit using "effect leverage" shows significant main effects of:
VS and Pos for R1
VS and Pos for R2
VS, Pos, and SS for R3
Pos for R4

When I include "Subject" as a random effect using REML, the R-squared of the model increases quite a bit, from about 0.17 to 0.64 on average.
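
To make that concrete, here is a minimal sketch of the same model with Subject as a random intercept, fit by REML, in Python/statsmodels rather than JMP (roughly what adding Subject & Random does in Fit Model); the column names are hypothetical, as above.

```python
# Minimal sketch (Python/statsmodels, not JMP): same fixed effects, with
# Subject added as a random intercept and the model fit by REML.
# Column names (Subject, Sound, Visual, Position, R1) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("responses.csv")  # one row per subject x scene, responses averaged

mixed = smf.mixedlm(
    "R1 ~ (C(Sound) + C(Visual) + C(Position)) ** 2",
    data=df,
    groups=df["Subject"],   # one random intercept per subject
)
result = mixed.fit(reml=True)
print(result.summary())

# Variance components: a large between-subject variance relative to the
# residual variance is why R-squared jumps once Subject is in the model.
print(result.cov_re)   # random-intercept (between-subject) variance
print(result.scale)    # residual variance
```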

1) Is using Subject as a random effect a legitimate way of analyzing these data?
2) How would one report the findings in terms of the descriptive stats in a paper,
i.e.

A significant main effect of Visual Stimuli was found for R1 [F(i,j) = X, p = .0001].
I see the DF for my effect, but not for the total model. How does this work with a REML model?

I did not test for a three-way interaction (which would have required 80 scenes), but since none of my two-way interactions are significant, is there any reason to expect a three-way interaction?

I know this is a lot but I appreciate any advice!

Best - Dan
3 REPLIES
statman


I will offer some comments.

First, what is your objective? What are you trying to accomplish? Are you looking for clues about some hypotheses you have, or are you trying to write some predictive equation? ...etc.

Second, since the response is "subjective", wouldn't it be better to plan an experiment where the factors are set at two levels (with bold level settings)? Why would you want to understand complex curvature when the linear effects are not known?

Third, you are analyzing only the mean of the responses? I suggest you first establish that the means are reasonable (estimable) by looking at the variation of the 12 respondents. Are they consistent? Are they similar? Any trends if time ordered? (I would be suspicious of the response variable trending with the nature of the visual observation.) Provided the variation estimates are OK, I would also analyze the experiment with the variation estimate as a response variable.

It seems you have written a saturated model (accounting for all degrees of freedom). When "playing" with other terms in the model, you should be using adjusted R-squared. How did you add Subject? What terms did you drop? Or did you supersaturate the model?

You certainly can't claim that a three-way interaction is not present just because no two-way interactions are significant, but I would not focus on the three-way given that you don't yet understand the main effects.

In summary, I think you would be better off running several DOE studies (iterative) using fractional factorials and two-level designs. I would also understand the measurement system before proceeding.

Good luck.
I will offer some comments.

***First, what is your objective?***

The objective is to determine the factors that influence scaling acoustical stimuli, taken from measurements in a real space, to a visual representation of that space: do subjects have clear expectations about how a room will sound given its visual appearance? Basically, how do their expectations align with results taken from measurements?

***What are you trying to accomplish? Are you looking for clues about some hypotheses you have or are you trying to write some predictive equation?...etc***

The hypothesis is that the nature of the sound-emitting source, position in the room, and other visual cues will affect a participant's scaling of an acoustical model in a way that does not align with measured results in that room. Also the perceptual ranking of source width and envelopment will be affected by the visual make-up of the sound-emitting source and position in the room.

***Second, since the response is "subjective", wouldn't it be better to plan an experiment where the factors are set at two levels (with bold level settings)? Why would you want to understand complex curvature when the linear effects are not known?***

This is a follow-up to an exploratory experiment in which subjects were shown images of rooms of increasingly large volumes and asked to set levels of reverberation in the room. A monotonic increase in reverberation was seen as a function of presented room volume, but the slope of the increase was not parallel to measured results in the room. That experiment used a fixed image of the sound-emitting source, i.e., just a video recording of the musician playing in the room. This follow-up experiment aimed to assess how changing the make-up of the source could further sway this experience.

Since all the factors were categorical, I wasn't really aiming to show any curved effects, just to test at different levels.

For the first factor there were 5 different types of musical performance: 2 percussion, 2 sustained, and 1 speech.
For the second factor we looked at 4 different measurement locations in the room, i.e., distance and angle of view from the performer.
For the third factor we looked at how the sound source was presented: you either saw no sound source, an image of a loudspeaker, a video recording of the source, or a video recording of the source plus two loudspeakers to the left and right.

The experiment is complete and the methodology has been accepted by the reviewers of the paper; they just had issues with the way I conducted the statistical analysis, which is why I approached this forum. My background is in architectural acoustics and engineering, and unfortunately not statistics...


***Third, you are analyzing only the mean of the responses?***


44 conditions x 3 = 132 videos were randomly presented in a 90-minute experimental session, and the mean of the replications was taken over each of the 12 subjects' 3 viewings of a particular set of conditions.
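
(If it helps make the data layout clear, that averaging step looks something like the following in Python/pandas; the column names are hypothetical.)

```python
# Minimal sketch (Python/pandas) of the averaging step: collapse the 3
# interleaved viewings of each scene to one mean per subject per scene.
# Column names are hypothetical.
import pandas as pd

raw = pd.read_csv("raw_trials.csv")  # 12 subjects x 44 scenes x 3 viewings

means = (
    raw.groupby(["Subject", "Sound", "Visual", "Position"], as_index=False)
       [["R1", "R2", "R3", "R4"]]
       .mean()
)
# 'means' has one row per subject x scene (12 x 44 = 528 rows); this is the
# table that goes into Fit Model.
```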

Had I run the scenes in three separate blocks of 44, I could have added a time variable and treated this as repeated measures, but since the replicates were interleaved within the same session, it wouldn't really be repeated measures, would it?

After each scene, there was a black-out for 10 seconds to minimize learning effects, etc.

***I suggest you first establish the means are reasonable (estimable) by looking at the variation of the 12 respondents. Are they consistent? Are they similar?***


Therein lies the problem: some subjects were quite variable in how they rated the extremes. R3 and R4 were scales from 1 to 7; some subjects used the full range, some the top, some the bottom, etc.
R1 and R2 were level values from -90 to 0 dBFS (acoustical power); these were much less variable, i.e., on average subjects had a clearer expectation about how loud a source should be (a spoken voice should have less sound power than a snare drum, for instance...).


***Any trends if time ordered (I would be suspicious of the response variable trending with the nature of the visual observation). Provided the variation estimates are OK, I would also analyze the experiment with the variation estimate as a response variable.***


So calculate the variation of all responses for a subject and throw that into the mix? That sounds reasonable...
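
Something like the following, I assume (a minimal Python/pandas sketch with hypothetical column names): compute a spread statistic and analyze it as its own response.

```python
# Minimal sketch (Python/pandas) of using a variation estimate as a response.
# Column names are hypothetical.
import pandas as pd

raw = pd.read_csv("raw_trials.csv")

# Spread across the 12 respondents for each treatment combination -- "how much
# do subjects disagree about this scene?"
between_subject_sd = (
    raw.groupby(["Sound", "Visual", "Position"], as_index=False)
       [["R1", "R2", "R3", "R4"]]
       .std()
)

# Spread across a subject's own 3 viewings of the same scene -- "how repeatable
# is each respondent?"
within_subject_sd = (
    raw.groupby(["Subject", "Sound", "Visual", "Position"], as_index=False)
       [["R1", "R2", "R3", "R4"]]
       .std()
)
```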




***It seems you have written a saturated model (accounting for all degrees of freedom). When "playing" with other terms in the model, you should be using adjusted R-squared. How did you add Subject? What terms did you drop? Or did you supersaturate the model?***


I am not familiar with the concept of a saturated model, or a supersaturated one.

I ended up using main effects only, plus Subject & Random, to yield a REML mixed model, and then looked at the effect leverage reports in Fit Model to assess significance.

***You certainly can't claim that a three-way interaction is not present just because no two-way interactions are significant, but I would not focus on the three-way given that you don't yet understand the main effects.***

The main effects make sense given the experiment, i.e., the trends fit the hypothesis; it is just that there is so much variability in subjects' responses... some are giving me the forest and some the trees for the same scene.

In the future, I would look at a paired-comparison or conjoint design, because the range of responses is too large in this experiment. That being said, I can't go back and collect more data, just analyze what I have...



***In summary, I think you would be better off running several DOE studies (iterative) using fractional factorials and two-level designs.***


I agree, this would have been a better approach, and I would certainly make the step sizes much larger, i.e., use a bathroom and a concert hall as opposed to 4 positions within the same large space...

Thank you for your response.



***I would also understand the measurement system before proceeding.***
statman


Thank you for your thorough response. I will add to the discussion, although it has become apparent there is much more to the study than originally presented. My first comment is that I believe you have repeats and replicates confused: repeats occur when multiple responses (same Y) are collected without changing treatment combinations; in replicates, treatments change between responses. For repeats you get no added DF; for replicates you do (oversimplifying).

I am sorry to comment on your experiment since it is already completed...of course I think it is ALWAYS good to critique the experimental work afterwards for learning.

As you seem to indicate, if the measurement system has that much variation, playing statistical "games" with the average (which may not be representative of anything) is well...meaningless.

I need more detail on how you wrote the model for analysis. I thought you said you had the 3 main effects and the three 2-way interactions (that is 43 degrees of freedom). Your original post said nothing of replicates, so you have quite a few additional degrees of freedom currently unassigned (131 total DF). You could write a saturated model adding the term Replicate (2 DF) and all of the interactions with Replicate, then do Fit Model (least squares) and go from there.
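
(One way to read that, as a hedged sketch in Python/statsmodels rather than JMP, using the raw un-averaged trials and hypothetical column names: Replicate as a 3-level factor crossed with every term of the 43-DF model, i.e., 43 + 2 + 86 = 131 DF.)

```python
# Minimal sketch (Python/statsmodels, not JMP) of the saturated least-squares
# model described above: the three factors, their two-way interactions,
# Replicate, and Replicate crossed with each of those terms.
# Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

raw = pd.read_csv("raw_trials.csv")  # one row per trial, Replicate in {1, 2, 3}

saturated = smf.ols(
    "R1 ~ C(Replicate) * ((C(Sound) + C(Visual) + C(Position)) ** 2)",
    data=raw,
).fit()
print(saturated.summary())
```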

Perhaps some simple graphical analysis would be effective. What complaints did the reviewers have about how you did the analysis?