All right.
Well, thanks, everyone, for joining us.
The title of our project
is Design a Digital Music Melody Hearing Test.
I'm Patrick Giuliano,
and my co-presenters are Charles Chen and Mason Chen
who couldn't be here today.
So I'm going to be presenting on their behalf.
And this is a project,
a high-school STEM project inspired by ESTEEM's methodology,
which is basically STEM but with AI, math, and statistics well-integrated.
Okay, so just to introduce this project
in a project-management flavor, here is the project charter.
The purpose of the project, in effect, is to design a test
to test the hearing capability of a musician.
The experimental design philosophy or methodology we use
is JMP's powerful definitive screening design (DSD) capability.
And we designed the test based on six music melody variables
in order to test hearing capability,
where each question starts with a short melody
followed by four choices, and where only one is repeated
and the other three melodies are similar but not identical.
From this test, each listener has to pick their best choice
among the options available.
Once we designed this test, we analyze the test survey results.
We build a sensitivity model
in consideration of six music hearing variables,
and then screen the listeners to determine which ones performed the best
in the music hearing test.
And in doing so, in the screening process,
we analyze the strengths and weaknesses of their hearing capability
in the service of ultimately creating an orchestra,
with a group of graded listeners who are highly capable of evaluating it.
Okay, so in the service of science,
we have an introduction to the mechanism of hearing
where the ear is just basically a frequency-receiving apparatus
that collects sound, and the vibration of the ossicles in the ear
causes the mechanical vibration to be converted
into an electrical stimulus, which is carried
by the auditory nerve and ultimately interpreted by the brain.
All right, so before we get into the experiment
and the variables that we analyzed,
let's talk a little bit about the frequency range of hearing
among individuals depending on their age.
So people of all ages without hearing impairment
should be able to hear at a frequency of approximately 8,000 Hertz,
and gradual loss of sensitivity to higher frequencies with age
is a normal occurrence.
And so what the science tells us
is that the auditory structures of younger people
are typically more capable of picking up and interpreting
higher frequency sounds, which is, of course,
relevant in terms of which instruments people are playing,
where the violin has a higher pitch than the cello,
so perhaps a younger person might be more suited
for playing the violin than an older person.
And so this just gives you an idea that basically
people that are in their fifties may only be able to hear
up to 12 kilohertz, or 12,000 Hertz,
whereas people in their 20s can hear up to perhaps 18 kilohertz.
And just to give some context, the average frequency range
for the sounds that we hear most often every day
is between 250 Hertz and 6,000 Hertz.
Okay, so what are some challenges associated with hearing
in the context of sounds of different frequency?
So people typically miss high frequency sounds
more often than low frequency ones.
And people with high frequency hearing loss
have trouble hearing higher-pitched sounds, of course, right?
And so higher-pitched sounds usually come from women or children
and are in the upper range of two to eight kilohertz.
And what's also typical
with high frequency hearing loss in many people
is the presence of a phantom sound, which is the condition called tinnitus,
and that competing sensation of sound can also inhibit a person's ability
to distinguish other high frequency sounds.
So clearly, age is an important factor in terms of designing
an effective hearing test and developing an effective panel of listeners
who are attuned to music.
Although we didn't explicitly consider age in our experiment,
as you'll see in the subsequent slides,
it definitely could be a factor that we could explore further
in our sampling strategy in terms of the survey respondents that we choose.
Okay, so the basic measure of hearing performance
is called an audiogram.
And what you see in the graph on the right
is just a plot of hearing threshold level in decibels on the vertical axis
versus frequency on the horizontal.
And you can clearly see that as hearing loss progresses,
the threshold level of sound in decibels starts to increase,
and the degradation in performance is shown as the plot,
splitting the performance by year, moving down and to the right.
That's the trajectory of the line that's connecting the points
moving down and to the right.
Okay, so just a little bit more background
before we launch into the design of the survey and the analysis.
The intent here is just to emphasize
that frequency interference can be a problem
in producing a melodious harmony in an orchestra in particular
or in any sort of musical composition.
And what we're basically showing here
is the difference between what's called fundamental frequencies and harmonics
in the context of a piano,
at least at the note scale indicated at the bottom.
Okay, so what do we know about the music note frequency spectrum?
Well, each note has, not surprisingly, based on the introduction so far,
each note has a particular frequency.
As an example, middle C is at around 262 Hertz,
and higher notes, of course, are going to have higher frequency
and lower notes have lower frequency.
And this slide just gives you a context
for what frequency the notes correspond to.
So note A, at around 440, is much higher
than note C at 261 in the second set on the right,
in the lower portion of the slide.
Okay, so there's some relationship between frequency and the number of notes.
Frequency needs to double every 12 notes,
and we have 12 notes in each octave, seven white and five black.
And so you can see this relationship,
that frequency, as a function of the note number n,
follows a power-of-two, exponential-type relationship.
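As a rough illustration of the doubling rule just described, here is a minimal Python sketch. It assumes standard twelve-tone equal temperament with the note A at 440 Hertz as the reference, so the frequency n notes away from A is 440 times 2 to the power n/12. The function name `note_frequency` is made up for this example and is not something from the project.

```python
A4_HZ = 440.0  # reference pitch: the note A above middle C (assumed tuning)

def note_frequency(semitones_from_a4: int) -> float:
    """Frequency of the note n semitones above (+) or below (-) the reference A."""
    # Twelve equal steps per octave, so each step multiplies by 2 ** (1 / 12).
    return A4_HZ * 2 ** (semitones_from_a4 / 12)

middle_c = note_frequency(-9)   # middle C sits 9 semitones below the reference A
octave_up = note_frequency(12)  # moving up 12 notes doubles the frequency
print(round(middle_c, 2), octave_up)
```

With these assumptions, middle C comes out close to the 261 to 262 Hertz quoted on the slide.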
All right, so taking us back now to the project
and the implementation and the analysis.
So the project plan has three phases.
The first phase is what I'm going to cover,
it's the analysis that I'm going to discuss today.
The first phase is effectively the process of identifying
the best hearing performers from the survey results
that we collect, based on the survey that we designed and sent out.
The second phase is identifying the best hearing performers
from the survey results in order to serve as judges.
In this phase, basically, we try to work on forming the orchestra
prior to phase three where we're actually doing the forming.
But in this instance, we're thinking about things
like which instruments have any potential limitation.
And we may give the same melody to different test instruments,
and not every instrument can play every melody, obviously.
And so the idea is,
how do we know that the individuals that are playing
are playing these instruments accurately?
Well, we need judges who have good listening capability.
So the judges that we curate from phase one
will provide that excellent evaluation in phase two.
So once we have that in place, in phase three,
we can actually really form the digital orchestra.
And we'll think about things like how many players should be involved,
who should play where, obviously.
We'll have a good understanding of how the melodies could be difficult
for certain instruments.
And this is why we need phase two in the middle.
Okay, so here's our design or survey question design.
So we've identified six variables for this hearing test
related to music, the parameters in music:
step, speed, notes changed, notes level, a repeat variable,
and a difficulty variable, a categorical variable, easy or difficult.
For the experiment, as I mentioned before, we're using JMP's DSD.
We generate a default DSD,
and then in effect, we augment the design
by adding two more center points.
So we're doing an 18-run DSD,
which includes one center point, which is row number three in this table,
which I have indicated with a zero and an arrow highlighting row three.
And then we're adding two more center points
at row 10 and row 20 respectively.
And the idea here, in terms of where we're placing these center points,
is we want to get an idea of how consistent the results are
throughout the experiment.
So we try to put a center point roughly at the beginning, in the middle,
and the end of the experiment.
And this is analogous to understanding whether a measurement process is stable,
if you're in a manufacturing environment, getting a sense for that.
And then the other important thing about our design here
is that we're randomizing the test sequence,
and that's something that we can do in JMP through the generation of the design.
And I'll show a little bit about that briefly
when we come to the next few slides.
And that randomization is really important because it helps eliminate any bias
due to factors that aren't in the experiment
when we run the test.
And that bias is referred to sometimes as lurking variation
or variation due to lurking variables.
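To make the run-order idea concrete, here is a small Python sketch of randomizing the non-center runs while keeping center points at rows 3, 10, and 20. This is only an illustration of the layout described above, not how JMP builds the table; the function `build_run_order` and the run labels are hypothetical.

```python
import random

def build_run_order(design_runs, center_positions, seed=None):
    """Shuffle the non-center runs and place center points at fixed 1-based rows."""
    rng = random.Random(seed)
    shuffled = list(design_runs)
    rng.shuffle(shuffled)  # randomizing guards against lurking variables
    total = len(shuffled) + len(center_positions)
    remaining = iter(shuffled)
    return ["center" if row in center_positions else next(remaining)
            for row in range(1, total + 1)]

# 17 hypothetical non-center runs plus three center points gives 20 rows.
runs = [f"run_{i}" for i in range(1, 18)]
order = build_run_order(runs, center_positions={3, 10, 20}, seed=1)
for row, run in enumerate(order, start=1):
    print(row, run)
```

Comparing the responses at the three center rows then gives a quick read on whether results drift from the beginning to the end of the test.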
Okay, so there's another consideration that I touched on.
It's in the context of randomization, but it's slightly different context,
which is a little bit more unique
to this particular application and experiment.
And so basically, what we did is generated an initial random variable
and assigned a random sequence, one, two, three, four, and randomized.
But we did a recoding on that.
So we labeled one A, two B, three C, and four D,
and that's what we see
in terms of identifying the correct answer.
So in the two columns at the right in this table,
in this 20- row table,
we're identifying what the correct answer should be
in terms of the letter,
which is associated with a random variable of one, two, three, four,
where one corresponds to A, two to B, three to C, and four to D.
And we're doing this to ensure a uniform distribution
of the location variable.
And basically what that means in practical terms is that A, B, C, and D
all have an equal chance of being the correct answer.
And this is to avoid the biasing situation
where a student may pick the same answer over and over again
in order to possibly increase his or her chances of performing well,
or perhaps because the survey respondent isn't paying attention
or isn't engaged in the survey.
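One way to get that uniform spread of correct-answer positions is sketched below in Python. It assumes a balanced shuffle, where each letter is used the same number of times and then the order is randomized; the talk only says a random 1 to 4 variable was generated and recoded, so the balancing step and the name `balanced_answer_key` are assumptions for illustration.

```python
import random
from collections import Counter

LETTERS = {1: "A", 2: "B", 3: "C", 4: "D"}  # recoding of the random variable

def balanced_answer_key(n_questions, seed=None):
    """Assign a correct-answer position to each question, as evenly as possible."""
    rng = random.Random(seed)
    repeats = n_questions // len(LETTERS)
    codes = [code for code in LETTERS for _ in range(repeats)]
    # If n_questions is not a multiple of 4, fill the remainder at random.
    codes += rng.sample(list(LETTERS), n_questions - len(codes))
    rng.shuffle(codes)
    return [LETTERS[code] for code in codes]

key = balanced_answer_key(20, seed=7)
print(key)
print(Counter(key))  # each letter should appear five times for 20 questions
```

With a key like this, a respondent who always picks the same letter scores 25 percent, which is no better than guessing.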
All right, so here is where we come to an evaluation
of the performance of the design, of the DSD design.
The way we approach this
is through the evaluation of the statistical power of the experiment
which is shown on the left, on the panel on the left
through an evaluation of the confounding pattern
or the extent to which factors are correlated in the experimental design,
and that's shown in the panel in the middle,
and the uniformity, what we call the uniformity of the design,
which is simply, what does the structure of the design look like
in a multivariate space?
Have we covered all of the design points in an approximately uniform way
so that we're able to predict across the entire range of the experiment
with the same degree of precision?
And so what these three indicate, and going back over to the left,
is that the overall power for each of the factors in the experiment
is greater than 90 percent, which is good.
And it shows us that we have good sensitivity to detect effects,
if they're actually there in the population.
The panel in the middle shows
that the risk of what we call multicollinearity,
or excessive correlation among the experimental factors, is low
because most of the pairwise correlations in this correlation matrix
are blue,
where a more bluish square corresponds to a lower correlation,
and solid blue indicates zero correlation.
And the squares that are closer to a red shading
indicate a higher extent of correlation among factors or terms in the experiment.
And so overall, what we see is that we look for correlations
that don't exceed 0.3,
and that's typically all the squares in this plot
with the exception of those slightly reddish squares
where the correlation is a little bit higher.
And that's because we have categorical factors, right?
We have at least one categorical factor in this experiment.
And if we didn't have the presence of a categorical factor,
this plot would look even bluer.
So we say that in DSD, we don't recommend
adding too many categorical variables into the experiment,
because if we do, then we increase this correlation problem,
which affects our ability to produce estimates in our model that are precise
and leads to inflation of variance in our estimates.
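The kind of check behind that color map can be sketched in plain Python. The example below is not the 20-run design from the talk: it builds a small four-factor toy DSD (an order-4 conference matrix, its fold-over, and one center run) with continuous factors only, and then computes pairwise correlations among main effects and two-factor interactions. The names `pearson`, `mains`, and `inters` are made up for this sketch.

```python
from itertools import combinations
from math import sqrt

# Order-4 conference matrix: zero diagonal, +/-1 elsewhere, orthogonal columns.
C = [
    [0, 1, 1, 1],
    [1, 0, -1, 1],
    [1, 1, 0, -1],
    [1, -1, 1, 0],
]
# Fold-over plus one center run gives a 9-run, 4-factor toy DSD.
design = C + [[-x for x in row] for row in C] + [[0, 0, 0, 0]]

def pearson(u, v):
    """Pearson correlation between two equal-length columns."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sqrt(sum((a - mu) ** 2 for a in u))
    sv = sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

mains = {f"X{j + 1}": [row[j] for row in design] for j in range(4)}
inters = {f"{a}*{b}": [p * q for p, q in zip(mains[a], mains[b])]
          for a, b in combinations(mains, 2)}

max_main_main = max(abs(pearson(mains[a], mains[b]))
                    for a, b in combinations(mains, 2))
max_main_inter = max(abs(pearson(m, t))
                     for m in mains.values() for t in inters.values())
max_inter_inter = max(abs(pearson(inters[a], inters[b]))
                      for a, b in combinations(inters, 2))
print(max_main_main, max_main_inter, max_inter_inter)
```

In this toy case the main effects are uncorrelated with each other and with the interactions, while some interaction pairs are partly correlated with one another. Adding a categorical factor, as in the talk's design, is what pushes some of those squares away from solid blue.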
And the final plot on the right, on the far right,
which is an indication of the uniformity of this design,
is a scatter plot matrix in JMP,
and it shows each variable versus every other variable.
And what we're looking for
is for white space to be minimal in this plot.
What I've drawn is a little circle here which your eye can easily pick out.
There's a little bit of extra white space there
at the intersection of Repeat and Step.
And that, again, is because we have a categorical variable in our experiment.
And so truthfully,
there's no perfect zero in the main effects,
no true center point in the main effects
due to the presence of that difficulty variable,
the categorical variable.
And that's reflected in the slightly non-symmetric pattern
of the scatter plot matrix on the right,
where that asymmetry is indicated in that white space
and with the circle that I've drawn.
Okay, so before I discuss this slide, I just want to quickly show you
how I got to these design diagnostics.
So what you're seeing here is the table that I just showed you.
And I've generated this design using the DSD platform under the DOE menu,
under Definitive Screening and Definitive Screening Design.
And after I complete the design table generation process
and fill in the results, JMP generates a DOE Dialog script
and saves it to the data table,
so I can actually relaunch the DOE dialog,
and I can also evaluate the design.
So I'm going to go ahead and quickly click on Design Evaluation.
And this is just an overview of the design.
And right here under Design Evaluation
is where I get the diagnostics related to power,
which I showed you on the left panel on that slide,
and the diagnostics that indicate
the extent to which factors or terms in the experiment are correlated.
And that's shown here in the color map on correlations.
And to generate the plot
looking at the uniformity among the factors,
I actually have to go in and do that in a scatter plot matrix
under the Graph menu, Scatterplot Matrix.
So that's just some context for you.
And now, I'm just going to quickly bring up the next slide
and then come back to JMP here
just to dynamically show you what we're doing.
So here's probably the most interesting part
of this experiment.
How do we increase the survey test difficulty
and do it in a smart way?
Well, we can use hierarchical clustering analysis to do that.
Now, we already know the correct answer.
It's indicated here in the corresponding column.
The choice columns,
the four columns on the right,
indicate the choices corresponding to the 20 melodies.
So we know, for example, that in the first row,
the correct answer is C, which corresponds to melody one,
where the C ID number is one.
So we already know the correct answer
where we've assigned it in terms of row order based on a random number,
but how do we pick the other three answers?
Well, based on hierarchical clustering,
we can get a sense of how close each of the other three answers are
to the correct answer.
And in this way, we can make the test a little more difficult.
So all the answer choices are from the 20 melodies.
How do we pick the closer four melodies for each question,
or the closest four melodies, if you will,
or even maybe melodies that are relatively close together
based on the clustering criterion, but not honoring that criterion strictly,
right?
So this might seem a little bit nebulous,
but in effect, all we're really doing
is telling JMP to assign a clustering scheme by row
and based on some clustering criterion that we specify.
And by default, that criterion is Ward.
So I'm just going to show that dynamically here.
So I have the table open.
All I did here is run Hierarchical Cluster under the Clustering menu.
And once I ran this, I went ahead and invoked Cluster Summaries,
which I turned on here.
And then watch what happens here when I click on each of these clusters.
You can see the rows that belong to each cluster.
So seven and 18 are associated with each other,
14 and 17, eight and nine, row two and 13, and so on.
So this is the idea.
We're using the power of JMP to identify rows
that are associated with each other.
And in this way, by choosing answer choices
that are relatively close to each other, following some schema like this,
we make the test more difficult.
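The same idea of choosing wrong answers that sit close to the correct melody can be sketched without JMP. The Python example below swaps in a simpler technique than Ward clustering: for each melody it just picks the three nearest other melodies by Euclidean distance on the coded settings. The six melodies, their three coded variables, and the function `pick_distractors` are all hypothetical.

```python
from math import dist

# Hypothetical coded settings (three of the melody variables) for six melodies.
melodies = {
    "M1": (-1, -1, -1),
    "M2": (-1, -1, 0),
    "M3": (-1, 0, 0),
    "M4": (1, 1, 1),
    "M5": (1, 0, 0),
    "M6": (0, 1, 1),
}

def pick_distractors(target, settings, n_wrong=3):
    """Return the n_wrong melodies closest to the target in coded units."""
    others = [name for name in settings if name != target]
    others.sort(key=lambda name: dist(settings[name], settings[target]))
    return others[:n_wrong]

for name in melodies:
    print(name, pick_distractors(name, melodies))
```

Rows that a clustering dendrogram would join early tend to be each other's nearest neighbors, so this gives similar distractors, but the two methods are not guaranteed to agree on every row.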
All right, so just launch back into slides here again.
Okay.
All right, so basically the last step here in terms of completing this experiment
is, in addition to using a passive criterion for increasing the difficulty of the test,
we want an active criterion.
So we want to be able to separate,
in effect, the beginner level from the advanced level.
So think of it like this.
If every question was super difficult
or if all the choices were very hard to discriminate among,
then you wouldn't be able to distinguish between an advanced-level respondent
and a beginner-level respondent
because everybody would miss all of the questions.
Similarly, if you made all the questions too easy,
then you'd have all experts and no beginners,
and so you have no differentiation.
So based on the science, we have a hypothesis that step and speed
are the most important factors for performance,
for hearing performance, for discriminating between a good melody,
a good musical composition,
and a bad one.
So are we sure about that?
Well, one thing we can do is we can recode the step and speed
by a 50 percent reduction
if the difficulty level is equal to difficult.
And by doing that, in effect, we still have five variables,
and those are indicated in the shaded columns, right?
So the recoded step and recoded speed are the two columns that are shaded.
And then we have the notes changed, the notes level, and the repeat.
So the DSD is still orthogonal.
We still have three levels.
We have five variables,
but actually we could incorporate up to six in the DSD design.
So how do we increase our value in effect by increasing that variable number to six?
Well, we can add the difficulty variable or the categorical variable
which indicates either easy or difficult.
So we decided to use step and speed, combined with these other three variables,
and the total sample size is still 18 plus 2 or 20
with the two center points and the one center point by default.
But now, we get five levels for speed and step, not three.
So by doing this little transformation,
we smartly create five levels on two variables
instead of just having three levels,
which is typically what we would have in a DSD.
So I think this is a unique approach
that's also quite specific to this problem context
and gives us more levels in our design.
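Here is a minimal Python sketch of how that recoding produces five levels. It assumes the 50 percent reduction is applied to the coded level (minus one, zero, plus one) whenever the difficulty flag is "difficult"; the function name `recode` and the choice to work in coded units are assumptions for illustration.

```python
def recode(coded_level, difficulty):
    """Halve the coded level when the question is flagged as difficult."""
    if difficulty == "difficult":
        return coded_level * 0.5
    return float(coded_level)

base_levels = (-1, 0, 1)  # the usual three DSD levels in coded units
recoded = sorted({recode(level, flag)
                  for level in base_levels
                  for flag in ("easy", "difficult")})
print(recoded)
```

The three original levels combined with the two difficulty settings collapse to five distinct values, because the center level stays at zero either way.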
Okay, so this is our design.
How are we going to create the...
What software are we going to use to basically generate the hearing tests?
Okay, well, this is just an overview of Music Software Synthesizer,
which is what we use, soft synth.
And we utilize it to create
24 multiple choice music melody hearing tests.
It's obviously convenient and portable and fast.
All right, so how do we distribute this survey smartly?
Well, our approach is...
Many people do one sampling method.
But here, our approach is to integrate all the different sampling methodologies,
cluster sampling, stratified, and some additional clustering within
in order to distribute the survey to the right audience
to make the survey the most useful.
So when you're ready to send out the quiz, how do you do it?
Well, I have some examples here.
Who should play the music?
Well, there are people who know the music and people who don't.
So we only want to send the surveys
to people who are already familiar with the music, right?
Because ultimately, we want to use these people
to evaluate the performance of an orchestra.
In the stratified sampling sense, we have different kinds of instruments.
We may have five students in a particular pool
that know how to play piano,
we may have two that know how to play violin,
and we may want to sample smartly
so that we only pick a certain number within each stratum of players,
people who play particular instruments.
So we may pick randomly within each of these strata
at a certain sampling rate.
And again, with respect to clustering,
we can think of location in terms of practice location or geography
as a selection from many different geographies.
In a sense, we cluster and limit our selection criteria
to only the San Francisco Bay area,
because practicing in person is much easier than practicing virtually.
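The combination of a cluster step and a stratified step can be sketched in Python as follows. Everything in this example is hypothetical: the roster, the names, the region labels, and the function `sample_judges`. It first keeps only one geography and then draws at random within each instrument group.

```python
import random
from collections import defaultdict

# Hypothetical roster: (name, instrument, region).
roster = [
    ("Ana", "piano", "Bay Area"), ("Ben", "piano", "Bay Area"),
    ("Cy", "piano", "Bay Area"), ("Dee", "piano", "Bay Area"),
    ("Eli", "piano", "Bay Area"), ("Hana", "violin", "Bay Area"),
    ("Ivo", "violin", "Bay Area"), ("Jo", "cello", "Elsewhere"),
]

def sample_judges(people, region, per_instrument, seed=None):
    """Keep one region (cluster step), then sample within each instrument (strata)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for name, instrument, where in people:
        if where == region:
            strata[instrument].append(name)
    return {instrument: rng.sample(names, min(per_instrument, len(names)))
            for instrument, names in strata.items()}

picked = sample_judges(roster, region="Bay Area", per_instrument=2, seed=3)
print(picked)
```

A fixed count per instrument is used here for simplicity; a sampling rate proportional to the size of each stratum would be an equally reasonable choice.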
Okay, so really the point is that this survey dissemination
and survey data collection process is very holistic
and increases our chances of producing an effective test set, if you will,
of evaluators to help us form the highest-performing orchestra.
Okay, so quickly, to wrap everything up,
we studied the human hearing frequency range,
the instrument frequency spectrum, the music frequency formula,
and we designed an innovative music melody hearing test using DSD.
We also implemented two interesting approaches
to increase the difficulty of the test, hierarchical clustering,
as well as rescaling the levels
of the most important predictors on our responses for the test answers.
And we use the music synthesizer software
to basically disseminate the hearing test across the six music melody variables.
And in our strategy for dissemination, we use the holistic sampling methodology.
So in closing, some of the approaches that we use
and the science that we developed could be used to develop a hearing aid,
a music melody hearing aid.
And in our current market that we're aware of,
hearing aids are really specially designed for people with hearing loss,
but the idea here would be, how about making a hearing aid
that's about amplifying a certain signal from noise, right?
And that would, in effect,
increase music melody hearing and detection, right?
And so the main objective here
would be to block out noise that's extraneous,
for example, noise from the audience, and then amplify the signal portion
for the particular frequencies that are important
for playing a particular instrument, or even using this type of technology
to even out the pitch, to amplify the transition between melodies.
And so in future work, a similar DSD design can be implemented
in terms of developing this kind of technology.
So thank you very much for listening and let us know if you have any questions.