This presentation showcases the design of a music hearing test that measures a musician's ability to hear melodies. The Definitive Screening Design (DSD) platform in JMP was used to study six music script input variables (step, speed, notes changed, note level, repeat, difficulty), and two extra center points were added to evaluate Gage R&R performance. Each DSD run is a multiple-choice question that lets respondents pick their response from four available choices.

 

The JMP Hierarchical Clustering platform was used to group similar music scripts from the 20 scripts provided by the DSD runs and to assign similar scripts as the three incorrect choices. The correct choice was then added, which makes each hearing question more challenging. Next, a hybrid stratified cluster sampling method was adopted to select 30 candidates to participate in the survey. Once the scripts were determined, a commercial music synthesizer software program was used to create this DSD melody hearing test. After the survey results were collected, the Fit Definitive Screening platform in JMP was used to analyze them. The goal was to determine the best rater (the one with the highest propensity for accurately rating musical melodies) to serve as the judge for the next project phase.

 

 


     

    All right.

    Well, thanks, everyone, for joining us.

    The title of our project

    is Design a Digital Music Melody Hearing Test.

    I'm Patrick Giuliano,

    and my co-presenters are Charles Chen and Mason Chen

    who couldn't be here today.

    So I'm going to be presenting on their behalf.

    And this is a project,

    a high-school STEM project inspired by ESTEEM's methodology,

    which is basically STEM but with AI, math, and statistics well-integrated.

    Okay, so just to introduce this project

    in a project-management flavor, here is the project charter.

    The purpose of the project, in effect, is to design a test

    to test the hearing capability of a musician.

    The experimental design philosophy, or methodology, we use

    is JMP's powerful Definitive Screening Design capability.

    And we designed the test based on six music melody variables

    in order to test hearing capability,

    where each question starts with a short melody

    followed by four choices, and where only one is repeated

    and the other three melodies are similar but not identical.

    From this test, each listener has to pick their best choice

    among the options available.

    Once we've designed this test, we analyze the test survey results.

    We build a sensitivity model

    in consideration of six music hearing variables,

    and then screen the listeners to determine which ones performed the best

    in the music hearing test.

    And in doing so, in the screening process,

    we analyze the strengths and weaknesses of their hearing capability

    in the service of ultimately creating an orchestra

    with a group of listeners who are highly capable of evaluating it.

    Okay, so in the service of science,

    we have an introduction to the mechanism of hearing

    where the ear is just basically a frequency-receiving apparatus

    that collects sound, and the vibration of the ossicles in the ear

    causes the mechanical vibration to be converted

    into an electrical stimulus, which is carried

    by the auditory nerve and ultimately interpreted by the brain.

    All right, so before we get into the experiment

    and the variables that we analyzed,

    let's talk a little bit about the frequency range of hearing

    among individuals depending on their age.

    So people of all ages without hearing impairment

    should be able to hear at a frequency of approximately 8,000 Hertz,

    and gradual loss of sensitivity to higher frequencies with age

    is a normal occurrence.

    And so what the science tells us

    is that the auditory structures of younger people

    are typically more capable of absorbing or interpreting

    higher-frequency sounds, which is, of course,

    relevant in terms of which instruments people are playing,

    where the violin has a higher pitch than the cello,

    so perhaps a younger person might be more suited

    for playing the violin than an older person.

    And so this just gives you an idea that basically

    people in their fifties may only be able to hear

    up to 12 kilohertz, or 12,000 Hertz,

    whereas people in their 20s can hear up to perhaps 18 kilohertz.

    And just to give some context, the average frequency range

    for the sounds that we hear most often every day

    is between 250 Hertz and 6,000 Hertz.

    Okay, so what are some challenges associated with hearing

    in the context of sounds of different frequency?

    So people typically miss high frequency sounds

    more often than low frequency ones.

    And people with high-frequency hearing loss

    have trouble hearing higher-pitched sounds, of course, right?

    And so higher pitch sounds can usually come from women or children

    and are in the upper two to eight kilohertz range.

    And what's also typical

    with high frequency hearing loss in many people

    is the presence of a phantom sound, which is the condition called tinnitus,

    and that competing sensation of sound can also inhibit a person's ability

    to distinguish other high frequency sounds.

    So clearly, age is an important factor in terms of designing

    an effective hearing test and developing an effective panel of listeners

    who are attuned to music.

    Although we didn't explicitly consider age in our experiment,

    as you'll see in the subsequent slides,

    it definitely could be a factor that we could explore further

    in our sampling strategy in terms of the survey respondents that we choose.

    Okay, so the basic measure of hearing performance

    is called an audiogram.

    And what you see in the graph on the right

    is just a plot of hearing threshold level in decibels on the vertical axis

    versus frequency on the horizontal.

    And you can clearly see that as hearing loss progresses,

    the threshold level of sound in decibels starts to increase,

    and the degradation in performance is shown as the plot,

    split by year, moving down and to the right.

    That's the trajectory of the line that's connecting the points

    moving down and to the right.

    Okay, so just a little bit more background

    before we launch into the design of the survey and the analysis.

    The intent here is just to emphasize

    that frequency interference can be a problem

    in producing a melodious harmony in an orchestra in particular

    or in any sort of musical composition.

    And what we're basically showing here

    is the difference between what's called fundamental frequencies and harmonics

    in the context of a piano,

    at least at the note scale indicated at the bottom.

    Okay, so what do we know about the music note frequency spectrum?

    Well, each note has, not surprisingly, based on the introduction so far,

    each note has a particular frequency.

    As an example, middle C is at around 262 Hertz,

    and higher notes, of course, are going to have higher frequency

    and lower notes have lower frequency.

    And this slide just gives you a context

    for what frequency the notes correspond to.

    So note A, at around 440, is much higher

    than note C at 261, in the second set on the right,

    in the lower portion of the slide.

    Okay, so there's some relationship between frequency and the number of notes.

    Frequency needs to double every 12 notes,

    and we have 12 notes in each octave, seven white and five black.

    And so you can see the relationship

    that frequency follows as a function of the note number, n:

    it's an exponential, power-of-two type relationship.
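
The doubling rule described above can be written as f(n) = f0 * 2^(n/12). What follows is a minimal sketch, assuming the common A = 440 Hz reference mentioned on the slide; the function name and its argument are hypothetical and are not taken from the talk.

```python
# Equal-temperament note frequency: the frequency doubles every 12 notes.
# Assumption: A at 440 Hz is the reference pitch. `note_frequency` and
# `notes_from_a` are hypothetical names chosen for this illustration.
A_REFERENCE_HZ = 440.0

def note_frequency(notes_from_a: int) -> float:
    """Return the frequency in Hz of the note `notes_from_a` notes away from A."""
    return A_REFERENCE_HZ * 2 ** (notes_from_a / 12)

middle_c = note_frequency(-9)    # middle C sits 9 notes below the reference A
next_a = note_frequency(12)      # one octave (12 notes) above the reference A
print(round(middle_c, 1), next_a)
```

Under this assumption, middle C comes out near the 262 Hertz quoted in the talk, and moving up 12 notes doubles the frequency.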

    All right, so taking us back now to the project

    and the implementation and the analysis.

    So the project plan has three phases.

    The first phase is what I'm going to cover,

    it's the analysis that I'm going to discuss today.

    The first phase is effectively the process of identifying

    which people are the best hearing performers,

    using the results of the survey that we designed and sent out.

    The second phase is identifying the best hearing performers

    from the survey results in order to serve as judges.

    In this phase, basically, we try to work on forming the orchestra

    prior to phase three where we're actually doing the forming.

    But in this instance, we're thinking about things

    like which instruments have any potential limitation.

    And we may give the same melody to different test instruments,

    and not every instrument can play every melody, obviously.

    And so the idea is,

    how do we know that the individuals that are playing

    are playing these instruments accurately?

    Well, we need judges who have good listening capability.

    So the judges that we curate from phase one

    will provide that excellent evaluation in phase two.

    So once we have that in place, in phase three,

    we can actually really form the digital orchestra.

    And we'll think about things like how many players should be involved,

    who should play where, obviously.

    We'll have a good understanding of how the melodies could be difficult

    for certain instruments.

    And this is why we need phase two in the middle.

    Okay, so here's our design or survey question design.

    So we've identified six variables for this hearing test

    related to music, the parameters in music:

    step, speed, notes changed, notes level, a repeat variable,

    and a difficulty variable, a categorical variable, easy or difficult.

    For the experiment, as I mentioned before, we're using JMP's DSD.

    We're generating a default DSD,

    and then, in effect, we're augmenting the design

    by adding two more center points.

    So we're doing an 18-run DSD,

    which includes one center point, which is row number three in this table,

    which I have indicated with a zero and an arrow highlighting row three.

    And then we're adding two more center points

    at row 10 and row 20 respectively.

    And the idea here, in terms of where we're placing these center points,

    is we want to get an idea of how consistent the results are

    throughout the experiment.

    So we try to put a center point roughly at the beginning, in the middle,

    and the end of the experiment.

    And this is analogous to understanding whether a measurement process is stable,

    if you're in a manufacturing environment, getting a sense for that.
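
One simple way to look at that consistency is to compare a respondent's outcome on the three replicated center-point questions. This is a rough sketch only, not a full Gage R&R study and not the analysis used in the project; the function name and the respondent data are hypothetical.

```python
# Rows 3, 10, and 20 are the replicated center-point questions in the talk.
CENTER_ROWS = (3, 10, 20)

def center_point_agreement(correct_by_row: dict) -> float:
    """Fraction of center-point questions matching the respondent's most common outcome."""
    outcomes = [correct_by_row[row] for row in CENTER_ROWS]
    most_common = max(outcomes.count(True), outcomes.count(False))
    return most_common / len(outcomes)

# Hypothetical respondent: correct everywhere except the last center point.
respondent = {row: True for row in range(1, 21)}
respondent[20] = False
agreement = center_point_agreement(respondent)
print(agreement)
```

A value of 1.0 means the respondent treated the three identical questions the same way at the beginning, middle, and end of the test; lower values hint at drift or guessing.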

    And then the other important thing about our design here

    is that we're randomizing the test sequence,

    and that's something that we can do in JMP through the generation of the design.

    And I'll show a little bit about that briefly

    when we come to the next few slides.

    And that randomization is really important because it helps eliminate any bias

    due to factors that aren't in the experiment

    when we run the test.

    And that bias is referred to sometimes as lurking variation

    or variation due to lurking variables.

    Okay, so there's another consideration that I touched on.

    It's in the context of randomization, but it's slightly different context,

    which is a little bit more unique

    to this particular application and experiment.

    And so basically, what we did is generated an initial random variable

    and assigned a random sequence, one, two, three, four, and randomized.

    But we did a recoding on that.

    So we labeled one A, two B, three C, and four D,

    and that's what we see

    in terms of identifying the correct answer.

    So in the two columns at the right in this table,

    in this 20-row table,

    we're identifying what the correct answer should be

    in terms of the letter,

    which is associated with a random variable of one, two, three, four,

    where one corresponds to A, two to B, three to C, and four to D.

    And we're doing this to ensure a uniform distribution

    of the location variable.

    And basically what that means in practical terms is that A, B, C, and D

    all have an equal probability of being selected at random.

    And this is to avoid the biasing situation

    where a student may pick the same answer over and over again

    in order to possibly increase his or her chances of performing well,

    or perhaps because the survey respondent isn't paying attention

    or isn't engaged in the survey.
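
The recoding and balancing idea can be sketched as follows. This is one possible way to guarantee that each letter appears equally often, under the assumption that the number of questions is a multiple of four; it is not the JMP procedure from the talk, and the function name is hypothetical.

```python
import random

def balanced_answer_positions(n_questions: int, seed: int = 0) -> list:
    """Return a shuffled answer key in which A, B, C, and D appear equally often."""
    letters = "ABCD"
    if n_questions % len(letters) != 0:
        raise ValueError("n_questions must be a multiple of 4 for exact balance")
    codes = [1, 2, 3, 4] * (n_questions // 4)     # the random variable 1..4
    random.Random(seed).shuffle(codes)            # randomize the sequence
    return [letters[code - 1] for code in codes]  # recode 1->A, 2->B, 3->C, 4->D

answer_key = balanced_answer_positions(20)
print(answer_key)
```

With 20 questions, each letter is the correct answer exactly five times, so always picking the same letter scores no better than chance.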

    All right, so here is where we come to an evaluation

    of the performance of the DSD.

    The way we approach this

    is through the evaluation of the statistical power of the experiment,

    which is shown in the panel on the left;

    through an evaluation of the confounding pattern,

    or the extent to which factors are correlated in the experimental design,

    and that's shown in the panel in the middle,

    and the uniformity, what we call the uniformity of the design,

    which is simply, what does the structure of the design look like

    in a multivariate space?

    Have we covered all of the design points in an approximately uniform way

    so that we're able to predict across the entire range of the experiment

    with the same degree of precision?

    And so what these three indicate, and going back over to the left,

    is that the overall power for each of the factors in the experiment

    is greater than 90 percent, which is good.

    And it shows us that we have good sensitivity to detect effects,

    if they're actually there in the population.

    The panel in the middle shows

    that the risk of what we call multicollinearity,

    or excessive correlation among the experimental factors, is low

    because most of the pairwise correlations in this correlation matrix

    are blue,

    where a more bluish correlation corresponds to a lower correlation,

    where solid blue indicates zero correlation.

    And the squares that are closer to a red shading

    indicate a higher extent of correlation among factors or terms in the experiment.

    And so overall, what we see is that we look for correlations

    that don't exceed 0.3,

    and that's typically all the squares in this plot

    with the exception of those slightly reddish squares

    where the correlation is a little bit higher.

    And that's because we have categorical factors, right?

    We have at least one categorical factor in this experiment.

    And if we didn't have the presence of a categorical factor,

    this plot would look even bluer.

    So we say that in DSD, we don't recommend

    adding too many categorical variables into the experiment,

    because if we do, then we increase this correlation problem,

    which affects our ability to produce estimates in our model that are precise

    and leads to inflation of variance in our estimates.
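
The 0.3 rule of thumb can be checked numerically on any design table. The sketch below is an illustration under stated assumptions: it uses a small 13-run, six-factor, all-continuous screening design built by folding over an order-6 conference matrix, not the 20-run design from the talk, and it looks at main-effect columns only.

```python
import math

# Order-6 conference matrix: zero diagonal, +/-1 elsewhere, orthogonal columns.
C = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]
# Fold-over plus one center run gives a 13-run, three-level design.
design = C + [[-x for x in row] for row in C] + [[0] * 6]

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

columns = list(zip(*design))  # one tuple per factor
max_abs_corr = max(
    abs(pearson(columns[i], columns[j]))
    for i in range(6) for j in range(i + 1, 6)
)
print(max_abs_corr <= 0.3)
```

With only continuous factors the main-effect columns are uncorrelated; adding a categorical factor, as in the talk, is what pushes some correlations above zero.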

    And the final plot on the right, on the far right,

    which is an indication of the uniformity of this design,

    is a scatter plot matrix in JMP,

    and it shows each variable versus every other variable.

    And what we're looking for

    is for white space to be minimal in this plot.

    What I've drawn is a little circle here which your eye can easily pick out.

    There's a little bit of extra white space there

    at the intersection of Repeat and Step.

    And that, again, is because we have a categorical variable in our experiment.

    And so truthfully,

    there's no perfect zero in the main effects,

    no true center point in the main effects

    due to the presence of that difficulty variable,

    the categorical variable.

    And that's reflected in the non-symmetric pattern

    of the scatter plot matrix on the right, slightly non-symmetric,

    where that asymmetry is indicated in that white space

    and with the circle that I've drawn.

    Okay, so before I discuss this slide, I just want to quickly show you

    how I got to these design diagnostics.

    So what you're seeing here is the table that I just showed you.

    And I've generated this design using the DSD platform under the DOE menu,

    under Definitive Screening and Definitive Screening Design.

    And after I complete the design table generation process

    and fill in the results, JMP generates a DOE Dialog script,

    saves it to the data table,

    and I can actually relaunch the DOE dialog,

    and I can also evaluate the design.

    So I'm going to go ahead and quickly click on Design Evaluation.

    And this is just an overview of the design.

    And right here under Design Evaluation

    is where I get the diagnostics related to power,

    which I showed you on the left panel on that slide,

    and the diagnostics that indicate

    the extent to which factors or terms in the experiment are correlated.

    And that's shown here in the color map on the correlation.

    And to generate the plot

    looking at the uniformity among the factors,

    I actually have to go in and do that in Scatterplot Matrix

    under the Graph menu, Scatterplot Matrix.

    So that's just some context for you.

    And now, I'm just going to quickly bring up the next slide

    and then come back to JMP here

    just to dynamically show you what we're doing.

    So here's probably the most interesting part

    of this experiment.

    How do we increase the survey test difficulty

    and do it in a smart way?

    Well, we can use hierarchical clustering analysis to do that.

    Now, we already know the correct answer.

    It's indicated here in the corresponding column.

    The Choice column,

    the columns of the four variables on the right

    which indicate the choices corresponding to the 20 melody choices

    are indicated there.

    So we know, for example, in the first row,

    the correct answer is C, which corresponds to melody one,

    where the C ID number is one.

    So we already know the correct answer

    where we've assigned it in terms of row order based on a random number,

    but how do we pick the other three answers?

    Well, based on hierarchical clustering,

    we can get a sense of how close each of the other three answers are

    to the correct answer.

    And in this way, we can make the test a little more difficult.

    So all the answer choices are from the 20 melodies.

    How do we pick the closer four melodies for each question,

    or the closest four melodies, if you will,

    or even maybe melodies that are relatively close together

    based on the clustering criterion, but not honoring that criterion strictly,

    right?

    So this might seem a little bit nebulous,

    but in effect, all we're really doing

    is telling JMP to assign a clustering scheme by row

    and based on some clustering criterion that we specify.

    And by default, that criterion is Ward.

    So I'm just going to show that dynamically here.

    So I have the table open.

    All I did here is run Hierarchical Cluster under the Clustering menu.

    And once I ran this, I went ahead and invoked Cluster Summaries,

    which I turned on here.

    And then watch what happens here when I click on each of these clusters.

    These are the clusters.

    So seven and 18 are associated with each other,

    14 and 17, eight and nine, row two and 13, and so on.

    So this is the idea.

    We're using the power of JMP to identify rows

    that are associated with each other.

    And in this way, by picking answer choices that are relatively close to each other,

    following some schema like this,

    we make the test more difficult.
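
The idea of picking wrong answers that sit close to the correct melody can be sketched without JMP. Note the swap: this example uses a plain nearest-neighbor search on Euclidean distance instead of Ward hierarchical clustering, and the coded melody settings are made up for illustration.

```python
import math

def nearest_distractors(scripts, target, k=3):
    """Return indices of the k scripts closest to scripts[target] (Euclidean distance)."""
    others = [i for i in range(len(scripts)) if i != target]
    others.sort(key=lambda i: math.dist(scripts[i], scripts[target]))
    return others[:k]

# Hypothetical coded settings (step, speed, notes changed) for five melodies.
melodies = [
    (-1.0, -1.0, 0.0),    # melody 0: the correct answer for this question
    (-1.0, -0.5, 0.0),
    (1.0, 1.0, 1.0),
    (-0.25, -1.0, 0.0),
    (0.0, 0.0, 0.0),
]
distractors = nearest_distractors(melodies, target=0)
print(distractors)
```

The melody that differs most from the correct answer (index 2 here) is left out, so the three remaining choices are the ones hardest to tell apart from it.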

    All right, so just launch back into slides here again.

    Okay.

    All right, so basically the last step here in terms of completing this experiment

    is, in addition to using a passive criterion for increasing the difficulty of the test,

    we want an active criterion.

    So we want to be able to separate,

    in effect, the beginner level from the advanced level.

    So think of it like this.

    If every question was super difficult

    or if all the choices were very hard to discriminate between,

    then you wouldn't be able to distinguish between an advanced-level respondent

    and a beginner-level respondent

    because everybody would miss all of the questions.

    Similarly, if you made all the questions too easy,

    then you'd have all experts and no beginners,

    and so you have no differentiation.

    So based on the science, we have a hypothesis that step and speed

    are the most important factors for performance,

    for hearing performance, for discriminating between a good melody,

    a good musical composition,

    and a bad one.

    So are we sure about that?

    Well, one thing we can do is we can recode the step and speed

    by a 50 percent reduction

    if it's at difficulty level equal to difficult.

    And by doing that, in effect, we still have five variables,

    and those are indicated in the shaded columns, right?

    So the recoded step, recoded speed are the two columns that are shaded.

    And then we have the notes changed, the notes level, and the repeat.

    So the DSD is still orthogonal.

    We still have three levels.

    We have five variables,

    but actually we could incorporate up to six in the DSD design.

    So how do we increase our value in effect by increasing that variable number to six?

    Well, we can add the difficulty variable or the categorical variable

    which indicates either easy or difficult.

    So we decided to use step and speed, combined with these other three variables,

    and the total sample size is still 18 plus 2, or 20,

    with the two added center points plus the one center point included by default.

    But now, we get five levels for speed and step, not three.

    So by doing this little transformation,

    we smartly create five levels on two variables

    instead of just having three levels,

    which is typically what we would have in a DSD.

    So I think this is a unique approach

    that's also quite specific to this problem context

    and gives us more levels in our design.
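
To see where the five levels come from, here is a minimal sketch. It assumes the 50 percent reduction is applied to the coded levels -1, 0, and +1; the talk does not say whether the recoding was done on coded or actual values, and the function name is hypothetical.

```python
def recode(level, difficulty):
    """Halve a coded step or speed level when the question is marked difficult."""
    return level * 0.5 if difficulty == "difficult" else level

coded_levels = (-1, 0, 1)               # the three usual DSD levels
difficulties = ("easy", "difficult")    # the two-level categorical factor

# Every combination of level and difficulty, with duplicates removed.
recoded = sorted({recode(level, d) for level in coded_levels for d in difficulties})
print(recoded)
```

The center level stays at zero under either difficulty, so the six combinations collapse to five distinct values: -1, -0.5, 0, 0.5, and 1.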

    Okay, so this is our design.

    How are we going to create the...

    What software are we going to use to basically generate the hearing tests?

    Okay, well, this is just an overview of Music Software Synthesizer,

    which is what we use, a soft synth.

    And we utilize it to create

    24 multiple choice music melody hearing tests.

    It's obviously convenient and portable and fast.

    All right, so how do we distribute this survey smartly?

    Well, our approach is...

    Many people do one sampling method.

    But here, our approach is to integrate all the different sampling methodologies,

    cluster sampling, stratified, and some additional clustering within

    in order to distribute the survey to the right audience

    to make the survey the most useful.

    So when you're ready to send out the quiz, how do you do it?

    Well, I have some examples here.

    Who should play the music?

    Well, there are people who know the music and people who don't.

    So we only want to send the surveys

    to people who are already familiar with the music, right?

    Because ultimately, we want to use these people

    to evaluate the performance of an orchestra.

    In the stratified sampling sense, we have different kinds of instruments.

    We may have five students in a particular pool

    that know how to play piano,

    we may have two that know how to play violin,

    and we may want to sample smartly

    so that we only pick a certain number within each stratum of players,

    people who play particular instruments.

    So we may pick randomly within each of these strata

    at a certain sampling rate.

    And again, with respect to clustering,

    we can think of location in terms of practice location or geography

    as a selection from many different geographies.

    In a sense, we cluster and limit our selection criteria

    to only the San Francisco Bay area,

    because practicing in person is much easier than practicing virtually.
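
The combined cluster and stratified selection can be sketched as below. The candidate pool, the field names, and the 40 percent sampling rate are all hypothetical; the talk only says that sampling is limited to one region and done randomly within each instrument group.

```python
import random
from collections import defaultdict

def stratified_sample(candidates, rate, seed=0):
    """Randomly pick a fixed fraction of candidates within each instrument stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in candidates:
        strata[person["instrument"]].append(person)
    chosen = []
    for members in strata.values():
        n = max(1, round(len(members) * rate))  # keep at least one per stratum
        chosen.extend(rng.sample(members, n))
    return chosen

# Hypothetical pool: five local pianists, two local violinists, one remote pianist.
pool = (
    [{"name": f"pianist{i}", "instrument": "piano", "region": "Bay Area"} for i in range(5)]
    + [{"name": f"violinist{i}", "instrument": "violin", "region": "Bay Area"} for i in range(2)]
    + [{"name": "remote", "instrument": "piano", "region": "Elsewhere"}]
)
local = [p for p in pool if p["region"] == "Bay Area"]  # cluster step: one region only
sample = stratified_sample(local, rate=0.4)
print([p["name"] for p in sample])
```

Filtering by region first and then sampling within each instrument keeps small groups, such as the two violinists, represented in the final list.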

    Okay, so really the point is that this survey dissemination

    and survey data collection process is very holistic

    and increases our chances of producing an effective test set, if you will,

    of evaluators to help us form the highest-performing orchestra.

    Okay, so quickly, to wrap everything up,

    we studied the human hearing frequency range,

    the instrument frequency spectrum, the music frequency formula,

    and we designed an innovative music melody hearing test using DSD.

    We also implemented two interesting approaches

    to increase the difficulty of the test, hierarchical clustering,

    as well as rescaling the levels

    of the most important predictors on our responses for the test answers.

    And we use the music synthesizer software

    to basically disseminate the hearing test across the six music melody variables.

    And in our strategy for dissemination, we use the holistic sampling methodology.

    So in closing, some of the approaches that we use

    and the science that we developed could be used to develop a hearing aid,

    a music melody hearing aid.

    And in our current market that we're aware of,

    hearing aids are really specially designed for people with hearing loss,

    but the idea here would be, how about making a hearing aid

    that's about amplifying a certain signal from noise, right?

    And that would, in effect,

    increase music melody hearing and detection, right?

    And so the main objective here

    would be to block out noise that's extraneous,

    for example, noise from the audience, and then amplify the signal portion

    for the particular frequencies that are important

    for playing a particular instrument, or even using this type of technology

    to even out the pitch, to amplify the transition between melodies.

    And so in future work, a similar DSD design can be implemented

    in terms of developing this kind of technology.

    So thank you very much for listening and let us know if you have any questions.

    Published on ‎05-17-2024 02:08 PM by | Updated on ‎05-17-2024 02:26 PM

    This presentation showcases designing a special music hearing test to test a musician’s ability to hear melodies. The Definitive Screen Design (DSD) platform in JMP was utilized to consider six music script input variables (step, speed, notes changed, note level, repeat, difficulty) and then added two more center points for evaluating the Gage R&R performance. Each DSD run is a multiple-choice test allowing respondents to pick their response from four available choices.

     

    JMP Hierarchical Clustering platform was used to group similar music scripts from the 20 scripts provided by DSD runs and assign the similar scripts for the other three non-correct choices. The correct choices were then added to make each hearing question more challenging. Next, a stratified cluster hybrid sampling method was adopted to select 30 candidates to participate in the survey. Once the scripts were determined, a commercial music synthetic software program was used to create this DSD melody hearing test. After collecting the survey results, the Fit Definitive Screening platform in JMP was used to analyze the DSD survey results. The goal was to determine the best rater (higher propensity for accurate rating of musical melodies) to serve as the judge for next project phase.

     

     

    Video Player is loading.
    Current Time 0:00
    Duration 0:00
    Loaded: 0%
    Stream Type LIVE
    Remaining Time 0:00
     
    1x
    • Chapters
    • descriptions off, selected
    • captions off, selected

       

      All right.

      Well, thanks, everyone, for joining us.

      The title of our project

      is Design a Digital Music Melody Hearing Test.

      I'm Patrick Giuliano,

      and my co- presenters are Charles Chen and Mason Chen

      who couldn't be here today.

      So I'm going to be presenting on their behalf.

      And this is a project,

      a high- school STEM project inspired by ESTEEM's methodology,

      which is basically STEM but with AI, math, and statistics well- integrated.

      Okay, so just to introduce this project

      in the project management flavor with the project charter.

      The purpose of the project, in effect, is to design a test

      to test the hearing capability of a musician.

      The experimental design, philosophy or methodology we use

      is JMP's powerful, definitive screening design capability.

      And we designed the test based on six music melody variables

      in order to test hearing capability,

      where each question starts with a short melody

      followed by four choices, and where only one is repeated

      and the other three melodies are similar but not identical.

      From this test, each listener has to pick their best choice

      among the options available.

      Once we designed this test, we analyze the test survey results.

      We build a sensitivity model

      in consideration of six music hearing variables,

      and then screen the listeners to determine which ones performed the best

      in the music hearing test.

      And in doing so, in the screening process,

      we analyze the strengths and weaknesses of their hearing capability

      in the service of ultimately creating an orchestra

      with a grading of listeners who are highly capable to evaluate them.

      Okay, so in the service of science,

      we have an introduction to the mechanism of hearing

      where the ear is just basically a frequency- receiving apparatus

      that collects sound and vibration of the ossicles in the ear

      and cause the mechanical vibration to be converted

      into an electrical stimulus, which is interpreted by the brain

      by the auditory nerve and ultimately by the brain.

      All right, so before we get into the experiment

      and the variables that we analyzed,

      let's talk a little bit about the frequency range of hearing

      among individuals depending on their age.

      So people of all ages without hearing impairment

      should be able to hear at a frequency of approximately 8,000 Hertz,

      and gradual loss of sensitivity to higher frequencies with age

      is a normal occurrence.

      And so what the science tells us

      is that the auditory structures of younger people

      are typically more capable of absorbing or interpreting

      hearing higher frequency sounds, which is, of course,

      relevant in terms of which instruments people are playing,

      where the violin has a higher pitch than the cello,

      so perhaps a younger person might be more suited

      for playing the violin than an older person.

      And so this just gives you an idea that basically

      people that are in their fifties maybe may only be able to hear

      at 12,000 kilohertz... or 12 kilohertz rather, 12,000 Hertz,

      whereas people in their 20s can hear up to perhaps 18 kilohertz.

      And just to give some context, the average frequency range

      for the sounds that we hear most often every day

      is between 250 Hertz and 6,000 Hertz.

      Okay, so what are some challenges associated with hearing

      in the context of sounds of different frequency?

      So people typically miss high frequency sounds

      more often than low frequency ones.

      And people with high frequency hearing loss,

      they have trouble hearing higher-pitched sounds, of course, right?

      And so higher pitch sounds can usually come from women or children

      and are in the upper two to eight kilohertz range.

      And what's also typical

      with high frequency hearing loss in many people

      is the presence of a phantom sound, which is the condition called tinnitus,

      and that competing sensation of sound can also inhibit a person's ability

      to distinguish other high frequency sounds.

      So clearly, age is an important factor in terms of designing

      an effective hearing test and developing an effective panel of listeners

      who are attuned to music.

      Although we didn't explicitly consider age in our experiment,

      as you'll see in the subsequent slides,

      it definitely could be a factor that we could explore further

      in our sampling strategy in terms of the survey respondents that we choose.

      Okay, so the basic measure of hearing performance

      is called an audiogram.

      And what you see in the graph on the right

      is just a plot of hearing threshold level in decibels on the vertical axis

      versus frequency on the horizontal.

      And you can clearly see that as hearing loss progresses,

      the threshold level of sound in decibels starts to increase,

      and the degradation in performance is shown as the plot,

      split by year, moving down and to the right.

      That's the trajectory of the line that's connecting the points

      moving down and to the right.

      Okay, so just a little bit more background

      before we launch into the design of the survey and the analysis.

      The intent here is just to emphasize

      that frequency interference can be a problem

      in producing a melodious harmony in an orchestra in particular

      or in any sort of musical composition.

      And what we're basically showing here

      is the difference between what's called fundamental frequencies and harmonics

      in the context of a piano,

      at least at the note scale indicated at the bottom.

      Okay, so what do we know about the music note frequency spectrum?

      Well, each note has, not surprisingly, based on the introduction so far,

      each note has a particular frequency.

      As an example, middle C is at around 262 Hertz,

      and higher notes, of course, are going to have higher frequency

      and lower notes have lower frequency.

      And this slide just gives you a context

      for what frequency the notes correspond to.

      So note A, at around 440, is much higher

      than note C at 261 in the second set on the right,

      in the lower portion of the slide.

      Okay, so there's some relationship between frequency and the number of notes.

      Frequency needs to double every 12 notes,

      and we have 12 notes in each octave, seven white and five black.

      And so you can see the relationship

      that frequency follows as a function of the note number n,

      and it's an exponential, power-of-two type relationship.
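
The doubling rule just described can be written as f(n) = f0 * 2^(n/12). Here is a minimal sketch of that formula, assuming the A above middle C (440 Hertz) as the reference note f0; the function name `note_frequency` is made up for illustration.

```python
# Equal-temperament sketch: frequency doubles every 12 notes (one octave).
# Assumption: the reference note is A above middle C at 440 Hz.
A4_HZ = 440.0


def note_frequency(n: int, reference_hz: float = A4_HZ) -> float:
    """Frequency of the note n semitones above (n > 0) or below (n < 0) the reference."""
    return reference_hz * 2 ** (n / 12)


middle_c = note_frequency(-9)   # middle C sits 9 semitones below the 440 Hz A
octave_up = note_frequency(12)  # 12 notes up doubles the frequency
print(round(middle_c, 1), octave_up)
```

Moving nine notes down from 440 Hertz lands at roughly 262 Hertz, which matches the middle C value quoted on the slide.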

      All right, so taking us back now to the project

      and the implementation and the analysis.

      So the project plan has three phases.

      The first phase is what I'm going to cover,

      it's the analysis that I'm going to discuss today.

      The first phase is effectively the process of identifying

      which people are best hearing performers from a collection of survey results

      that we send out based on the survey that we designed.

      The second phase is identifying the best hearing performers

      from the survey results in order to serve as judges.

      In this phase, basically, we try to work on forming the orchestra

      prior to phase three where we're actually doing the forming.

      But in this instance, we're thinking about things

      like which instruments have any potential limitation.

      And we may give the same melody to different test instruments,

      and not every instrument can play every melody, obviously.

      And so the idea is,

      how do we know that the individuals that are playing

      are playing these instruments accurately?

      Well, we need judges who have good listening capability.

      So the judges that we curate from phase one

      will provide that excellent evaluation in phase two.

      So once we have that in place, in phase three,

      we can actually really form the digital orchestra.

      And we'll think about things like how many players should be involved,

      who should play where, obviously.

      We'll have a good understanding of how the melodies could be difficult

      for certain instruments.

      And this is why we need phase two in the middle.

      Okay, so here's our design or survey question design.

      So we've identified six variables for this hearing test

      related to music, the parameters in music:

      step, speed, notes changed, notes level, a repeat variable,

      and a difficulty variable, a categorical variable, easy or difficult.

      For the experiment, as I mentioned before, we're using JMP's DSD.

      We're generating a default DSD,

      and then, in effect, we're augmenting the design

      by adding two more center points.

      So we're doing an 18-run DSD,

      which includes one center point, which is row number three in this table,

      which I have indicated with a zero and an arrow highlighting row three.

      And then we're adding two more center points

      at row 10 and row 20 respectively.

      And the idea here, in terms of where we're placing these center points,

      is we want to get an idea of how consistent the results are

      throughout the experiment.

      So we try to put a center point roughly at the beginning, in the middle,

      and the end of the experiment.

      And this is analogous to understanding whether a measurement process is stable,

      if you're in a manufacturing environment, getting a sense for that.

      And then the other important thing about our design here

      is that we're randomizing the test sequence,

      and that's something that we can do in JMP through the generation of the design.

      And I'll show a little bit about that briefly

      when we come to the next few slides.

      And that randomization is really important because it helps eliminate any bias

      due to factors that aren't in the experiment

      when we run the test.

      And that bias is referred to sometimes as lurking variation

      or variation due to lurking variables.
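
As a rough illustration of that run order, here is a sketch that shuffles the 17 non-center runs and pins center points at rows 3, 10, and 20. JMP does this inside the design generator; the function name, the run labels, and the seed below are placeholders, not anything JMP produces.

```python
import random


def build_run_order(n_design_runs=17, center_rows=(3, 10, 20), seed=1):
    """Return a run order: shuffled design runs with center points at fixed rows."""
    rng = random.Random(seed)
    # Placeholder labels for the non-center DSD runs (hypothetical names).
    design_runs = [f"run_{i + 1}" for i in range(n_design_runs)]
    rng.shuffle(design_runs)  # randomize the test sequence
    remaining = iter(design_runs)
    total = n_design_runs + len(center_rows)
    order = []
    for row in range(1, total + 1):  # rows are 1-indexed, as in the table
        order.append("center" if row in center_rows else next(remaining))
    return order


run_order = build_run_order()
for row, label in enumerate(run_order, start=1):
    print(row, label)
```

Comparing the responses at the three center rows then gives a quick read on whether listeners answer consistently from the start to the end of the test.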

      Okay, so there's another consideration that I touched on.

      It's in the context of randomization, but it's a slightly different context,

      which is a little bit more unique

      to this particular application and experiment.

      And so basically, what we did is generated an initial random variable

      and assigned a random sequence, one, two, three, four, and randomized.

      But we did a recoding on that.

      So we labeled one A, two B, three C, and four D,

      and that's what we see

      in terms of identifying the correct answer.

      So in the two columns at the right in this table,

      in this 20-row table,

      we're identifying what the correct answer should be

      in terms of the letter,

      which is associated with a random variable of one, two, three, four,

      where one corresponds to A, two to B, three to C, and four to D.

      And we're doing this to ensure a uniform distribution

      of the location variable.

      And basically what that means in practical terms is that A, B, C, and D

      all have an equal chance of being selected at random.

      And this is to avoid the biasing situation

      where a student may pick the same answer over and over again

      in order to possibly increase his or her chances of performing well,

      or perhaps because the survey respondent isn't paying attention

      or isn't engaged in the survey.
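
Here is a small sketch of that recoding idea: draw the codes one through four, shuffle them, and map them to A through D. It assumes an exactly balanced key (each letter correct five times across 20 questions), which is one way to get a uniform distribution and may not be exactly what was done in JMP; the function name and seed are illustrative.

```python
import random
from collections import Counter


def balanced_answer_key(n_questions=20, letters="ABCD", seed=7):
    """Assign the correct-answer position so each letter is used equally often."""
    rng = random.Random(seed)
    repeats = n_questions // len(letters)
    # Codes 1..4, each repeated the same number of times, then shuffled.
    codes = [code for code in range(1, len(letters) + 1) for _ in range(repeats)]
    rng.shuffle(codes)
    # Recode 1 -> A, 2 -> B, 3 -> C, 4 -> D.
    return [letters[code - 1] for code in codes]


answer_key = balanced_answer_key()
print(answer_key)
print(Counter(answer_key))
```

With a balanced key, a respondent who picks the same letter every time gets exactly a quarter of the questions right, so that strategy earns no advantage.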

      All right, so here is where we come to an evaluation

      of the performance of the design, of the DSD design.

      The way we approach this

      is through the evaluation of the statistical power of the experiment

      which is shown in the panel on the left,

      through an evaluation of the confounding pattern

      or the extent to which factors are correlated in the experimental design,

      and that's shown in the panel in the middle,

      and the uniformity, what we call the uniformity of the design,

      which is simply, what does the structure of the design look like

      in a multivariate space?

      Have we covered all of the design points in an approximately uniform way

      so that we're able to predict across the entire range of the experiment

      with the same degree of precision?

      And so what these three indicate, and going back over to the left,

      is that the overall power for each of the factors in the experiment

      is greater than 90 percent, which is good.

      And it shows us that we have good sensitivity to detect effects,

      if they're actually there in the population.

      The panel in the middle shows

      that the risk of what we call multicollinearity

      or excessive correlation among the experimental factors is low

      because all of the pairwise correlations in this correlation matrix,

      most of them are blue,

      where a more bluish correlation corresponds to a lower correlation,

      where solid blue indicates zero correlation.

      And the squares that are closer to a red shading

      indicate a higher extent of correlation among factors or terms in the experiment.

      And so overall, what we see is that we look for correlations

      that don't exceed 0.3,

      and that's typically all the squares in this plot

      with the exception of those slightly reddish squares

      where the correlation is a little bit higher.

      And that's because we have categorical factors, right?

      We have at least one categorical factor in this experiment.

      And if we didn't have the presence of a categorical factor,

      this plot would look even bluer.

      So we say that in DSD, we don't recommend

      adding too many categorical variables into the experiment,

      because if we do, then we increase this correlation problem,

      which affects our ability to produce estimates in our model that are precise

      and leads to inflation of variance in our estimates.
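
To make the correlation check concrete, here is a sketch on a textbook all-continuous DSD, not the 18-run design with a categorical factor from this talk. It assumes the usual construction: a 6 x 6 conference matrix, its fold-over, and one center run (13 runs). It then computes the largest absolute correlation among main effects, and between main effects and two-factor interactions, which is what the blue squares in the color map summarize.

```python
from itertools import combinations

# A symmetric conference matrix of order 6 (zero diagonal, +/-1 elsewhere).
C = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]
# Fold over and add one center run: 13 runs, six three-level factors.
design = C + [[-x for x in row] for row in C] + [[0] * 6]


def pearson(a, b):
    """Plain Pearson correlation between two equal-length columns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


columns = [list(col) for col in zip(*design)]
pairs = list(combinations(range(6), 2))
interactions = [[a * b for a, b in zip(columns[i], columns[j])] for i, j in pairs]

main_vs_main = max(abs(pearson(columns[i], columns[j])) for i, j in pairs)
main_vs_2fi = max(abs(pearson(col, inter)) for col in columns for inter in interactions)
print(main_vs_main, main_vs_2fi)
```

In this all-continuous case both maxima come out at zero, which is the "solid blue" situation; the slightly reddish squares on the slide are the departure from this ideal that the categorical factor introduces.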

      And the final plot on the right, on the far right,

      which is an indication of the uniformity of this design,

      is a scatter plot matrix in JMP,

      and it shows each variable versus every other variable.

      And what we're looking for

      is for white space to be minimal in this plot.

      What I've drawn is a little circle here which your eye can easily pick out.

      There's a little bit of extra white space there

      at the intersection of Repeat and Step.

      And that, again, is because we have a categorical variable in our experiment.

      And so truthfully,

      there's no perfect zero in the main effects,

      no true center point in the main effects

      due to the presence of that difficulty variable,

      the categorical variable.

      And that's reflected in the slightly non-symmetric pattern

      of the scatter plot matrix on the right,

      where that asymmetry is indicated in that white space

      and with the circle that I've drawn.

      Okay, so before I discuss this slide, I just want to quickly show you

      how I got to these design diagnostics.

      So what you're seeing here is the table that I just showed you.

      And I've generated this design using the DSD platform under the DOE menu,

      under Definitive Screening and Definitive Screening Design.

      And after I do that,

      after I complete the design table generation process

      and fill in the results, JMP generates a DOE Dialog script,

      saves it to the data table,

      and I can actually relaunch the DOE dialog,

      and I can also evaluate the design.

      So I'm going to go ahead and quickly click on Design Evaluation.

      And this is just an overview of the design.

      And right here under Design Evaluation

      is where I get the diagnostics related to power,

      which I showed you on the left panel on that slide,

      and the diagnostics that indicate

      the extent to which factors or terms in the experiment are correlated.

      And that's shown here in the Color Map on Correlations.

      And to generate the plot,

      looking at the uniformity among the factors,

      I actually have to go in and do that in a scatter plot matrix

      under the Graph menu, Scatterplot Matrix.

      So that's just some context for you.

      And now, I'm just going to quickly bring up the next slide

      and then come back to JMP here

      just to dynamically show you what we're doing.

      So here's probably the most interesting part

      of this experiment.

      How do we increase the survey test difficulty

      and do it in a smart way?

      Well, we can use hierarchical clustering analysis to do that.

      Now, we already know the correct answer.

      It's indicated here in the corresponding column.

      The four Choice columns on the right

      indicate the answer choices, which are drawn from the 20 melodies.

      So we know, for example, in the first row,

      the correct answer is C, which corresponds to melody one,

      where the C ID number is one.

      So we already know the correct answer

      where we've assigned it in terms of row order based on a random number,

      but how do we pick the other three answers?

      Well, based on hierarchical clustering,

      we can get a sense of how close each of the other three answers are

      to the correct answer.

      And in this way, we can make the test a little more difficult.

      So all the answer choices are from the 20 melodies.

      How do we pick the closer four melodies for each question,

      or the closest four melodies, if you will,

      or even maybe melodies that are relatively close together

      based on the clustering criterion, but not honoring that criterion strictly,

      right?

      So this might seem a little bit nebulous,

      but in effect, all we're really doing

      is telling JMP to assign a clustering scheme by row

      and based on some clustering criterion that we specify.

      And by default, that criterion is Ward.

      So I'm just going to show that dynamically here.

      So I have the table open.

      All I did here is run Hierarchical Cluster under the Clustering menu.

      And once I ran this, I went ahead and invoked Cluster Summaries,

      which I turned on here.

      And then watch what happens here when I click on each of these clusters.

      So you can see, when I click on each of these clusters,

      these are the clusters.

      So seven and 18 are associated with each other,

      14 and 17, eight and nine, row two and 13, and so on.

      So this is the idea.

      We're using the power of JMP to identify rows

      that are associated with each other.

      And in this way, by picking answer choices

      that are relatively close to each other, following some schema like this,

      we make the test more difficult.
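
For a feel of what "close to each other" means here, the sketch below uses a simpler stand-in than Ward clustering: for a given correct melody it picks the three other melodies with the smallest Euclidean distance in the coded factor settings. The six melodies and their three coded settings are invented for illustration; the real table has 20 melodies and six variables.

```python
import math

# Hypothetical coded settings (-1, 0, +1) for a handful of melodies.
melodies = {
    1: (-1, -1, 0),
    2: (-1, 1, 1),
    3: (0, 0, 0),
    4: (1, -1, 1),
    5: (1, 1, -1),
    6: (-1, -1, 1),
}


def closest_distractors(target_id, table, k=3):
    """Return the k melody IDs nearest to the target (ties broken by lower ID)."""
    target = table[target_id]
    others = [
        (math.dist(target, settings), melody_id)
        for melody_id, settings in table.items()
        if melody_id != target_id
    ]
    return [melody_id for _, melody_id in sorted(others)[:k]]


distractors = closest_distractors(1, melodies)
print(distractors)
```

A dendrogram from Ward's method groups rows in a similar spirit, but it merges clusters to minimize within-cluster variance, so its neighbors will not always match a plain nearest-distance ranking.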

      All right, so just launch back into slides here again.

      Okay.

      All right, so basically the last step here in terms of completing this experiment

      is in addition to using a passive criterion for increasing the difficulty of the test,

      we want an active criterion.

      So we want to be able to separate,

      in effect, the beginner level from the advanced level.

      So think of it like this.

      If every question was super difficult

      or if all the choices were very hard to discriminate between,

      then you wouldn't be able to distinguish between an advanced-level respondent

      and a beginner-level respondent

      because everybody would miss all of the questions.

      Similarly, if you made all the questions too easy,

      then you'd have all experts and no beginners,

      and so you have no differentiation.

      So based on the science, we have a hypothesis that step and speed

      are the most important factors for performance,

      for hearing performance, for discriminating between a good melody,

      a good musical composition,

      and a bad one.

      So are we sure about that?

      Well, one thing we can do is we can recode the step and speed

      by a 50 percent reduction

      if it's at difficulty level equal to difficult.

      And by doing that, in effect, we still have five variables

      and those are indicated in the shaded, right?

      So the recoded step, recoded speed are the two columns that are shaded.

      And then we have the notes changed, the notes level, and the repeat.

      So the DSD is still orthogonal.

      We still have three levels.

      We have five variables,

      but actually we could incorporate up to six in the DSD design.

      So how do we increase our value in effect by increasing that variable number to six?

      Well, we can add the difficulty variable or the categorical variable

      which indicates either easy or difficult.

      So we decided to use step and speed, combined with these other three variables,

      and the total sample size is still 18 plus 2, or 20,

      with the two added center points and the one center point by default.

      But now, we get five levels for speed and step, not three.

      So by doing this little transformation,

      we smartly create five levels on two variables

      instead of just having three levels,

      which is typically what we would have in a DSD.

      So I think this is a unique approach

      that's also quite specific to this problem context

      and gives us more levels in our design.
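
Here is a sketch of how that recoding turns three levels into five. It assumes the 50 percent reduction is applied to coded levels of -1, 0, and +1, so the center stays at zero; the actual units and level values in the study may differ, and the function name is made up.

```python
def recode(level, difficulty, reduction=0.5):
    """Shrink a coded level by `reduction` when the question is 'difficult'."""
    if difficulty == "difficult":
        return level * (1 - reduction)
    return float(level)


coded_levels = (-1, 0, 1)  # assumed three-level DSD coding
recoded_levels = sorted(
    {recode(level, difficulty)
     for level in coded_levels
     for difficulty in ("easy", "difficult")}
)
print(recoded_levels)
```

The easy questions keep the original three levels, the difficult ones contribute the two half-way levels, and the zero level is shared, which is where the count of five comes from under this assumption.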

      Okay, so this is our design.

      How are we going to create the...

      What software are we going to use to basically generate the hearing tests?

      Okay, well, this is just an overview of Music Software Synthesizer,

      which is what we use, soft synth.

      And we utilize it to create

      24 multiple choice music melody hearing tests.

      It's obviously convenient and portable and fast.

      All right, so how do we distribute this survey smartly?

      Well, our approach is...

      Many people do one sampling method.

      But here, our approach is to integrate all the different sampling methodologies,

      cluster sampling, stratified, and some additional clustering within

      in order to distribute the survey to the right audience

      to make the survey the most useful.

      So when you're ready to send out the quiz, how do you do it?

      Well, I have some examples here.

      Who should play the music?

      Well, there are people who know the music and people who don't.

      So we only want to send the surveys

      to people who are already familiar with the music, right?

      Because ultimately, we want to use these people

      to evaluate the performance of an orchestra.

      In the stratified sampling sense, we have different kinds of instruments.

      We may have five students in a particular pool

      that know how to play piano,

      we may have two that know how to play violin,

      and we may want to sample smartly

      so that we only pick a certain number within each strata of players,

      people who play particular instruments.

      So we may pick randomly within each of these strata

      in a certain sampling rate.
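
As an illustration of picking randomly within each stratum, here is a sketch using the five-pianist, two-violinist example. The 50 percent sampling rate, the rounding-up rule, the student labels, and the seed are all assumptions for the sake of the example.

```python
import math
import random


def stratified_sample(pool, rate, seed=11):
    """Randomly pick ceil(rate * size) members from each stratum."""
    rng = random.Random(seed)
    picked = {}
    for stratum, members in pool.items():
        k = math.ceil(len(members) * rate)  # round up so no stratum is left empty
        picked[stratum] = rng.sample(members, k)
    return picked


# Hypothetical pool: five piano students and two violin students.
pool = {
    "piano": [f"piano_{i}" for i in range(1, 6)],
    "violin": ["violin_1", "violin_2"],
}
sample = stratified_sample(pool, rate=0.5)
print(sample)
```

Sampling at the same rate inside every stratum keeps the instrument mix of the survey group close to the mix in the original pool.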

      And again, with respect to clustering,

      we can think of location in terms of practice location or geography

      as a selection from many different geographies.

      In a sense, we cluster and limit our selection criteria

      to only the San Francisco Bay area,

      because practicing in person is much easier than practicing virtually.

      Okay, so really the point is that this survey dissemination

      and survey data collection process is very holistic

      and increases our chances of producing an effective test set, if you will,

      of evaluators to help us form the most high-performing orchestra.

      Okay, so quickly, to wrap everything up,

      we studied the human hearing frequency range,

      the instrument frequency spectrum, the music frequency formula,

      and we designed an innovative music melody hearing test using DSD.

      We also implemented two interesting approaches

      to increase the difficulty of the test, hierarchical clustering,

      as well as rescaling the levels

      of the most important predictors on our responses for the test answers.

      And we use the music synthesizer software

      to basically disseminate the hearing test across the six music melody variables.

      And in our strategy for dissemination, we use a holistic sampling methodology.

      So in closing, some of the approaches that we use

      and the science that we developed could be used to develop a hearing aid,

      a music melody hearing aid.

      And in our current market that we're aware of,

      hearing aids are really specially designed for people with hearing loss,

      but the idea here would be, how about making a hearing aid

      that's about amplifying a certain signal from noise, right?

      And that would, in effect,

      increase music melody hearing and detection, right?

      And so the main objective here

      would be to block out noise that's extraneous,

      for example, noise from the audience, and then amplify the signal portion

      for the particular frequencies that are important

      for playing a particular instrument, or even using this type of technology

      to even out the pitch, to amplify the transition between melodies.

      And so in future work, a similar DSD design can be implemented

      in terms of developing this kind of technology.

      So thank you very much for listening and let us know if you have any questions.


