Solved: How to get the confidence interval (CI ) and to make t-test with multi distribut...

Kate · Jun 10, 2023 4:43 PM

The input data is already a histogram, with frequencies in %. There are 5 samples in total. Each sample includes 2 distributions (bottom and top position). The goal is to calculate the difference between the 2 means (physically, it's a height). The raw data ("CI_t test_data_Q12345.jmp") is attached.

Questions:

How to obtain the statistics for each distribution (bottom and top position)?
How to deduce the statistics for height of each sample (1 to 5)?
How to obtain the t-test and equivalent test to compare the deduced 5 heights?

Many thanks for your help!

SDF1 · Jan 26, 2022 02:36 PM

Hi @Kate ,

To me, it sounds like you need to do an ANOVA test of the data. Use the Fit Y by X platform, and use um as Y and sample number as X. This will show a picture like this:

Then, go to the red hot button next to Oneway and select Means/Anova. You should get this:

Notice that "Prob > F" is <.0001, which is strong evidence that the mean um differs between sample numbers. Before continuing, you will want to test for unequal variances to see whether the variance is the same from sample number to sample number. Again, red hot button, then select Unequal Variances. You should get this:

All of the unequal variance tests have Prob > F <.0001, meaning that the variances are NOT equal. But the Welch test, which does not assume equal variances, also has Prob > F <.0001, so even though the variances are not equal, you can continue with the comparison of means. Next, you CAN do a Student's t test, but since you have more than two levels (Sample number has 5 levels), you'll actually want to do a Tukey HSD test. Student's t does not adjust for multiple comparisons, so with 10 pairwise comparisons you are more likely to declare a difference when there actually is none (a Type I error). Again, go red hot button, then Compare Means > All Pairs, Tukey HSD. You should get this:
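To see why an adjusted test such as Tukey HSD is usually preferred with five levels, here is a rough sketch in Python (not JMP output). It estimates the chance of at least one false positive when every pair is compared with an unadjusted test at alpha = 0.05. It assumes the pairwise tests are independent, which is a simplification, and the function name is only illustrative.

```python
from math import comb


def familywise_error_rate(n_groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive over all pairwise tests.

    Assumes the pairwise tests are independent (a simplification).
    """
    n_pairs = comb(n_groups, 2)
    return 1.0 - (1.0 - alpha) ** n_pairs


n_pairs = comb(5, 2)             # 5 samples give 10 pairwise comparisons
fwer = familywise_error_rate(5)  # about 0.40 under the independence assumption
print(n_pairs, round(fwer, 3))
```

So with 10 unadjusted comparisons, the overall false-positive risk is far above the nominal 5%, which is what Tukey HSD is designed to control.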

You can see there is some overlap between levels 2 and 5, but otherwise there are distinct differences in the means. Within the Oneway platform, you can also do an equivalence test. Again, red hot button and then select Equivalence Test. For the variance assumption, you'll need to select Unequal Variances since the tests all came back with very low p-values. You'll then need to enter the largest difference that you would still consider practically equivalent. You'll then get this (I put in 0.2):

What this tells you (given my value for equivalence) is that 3 out of the 10 pair-wise comparisons are different while the other 7 are practically equivalent.
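As a rough illustration of what an equivalence test with a practical margin decides, the Python sketch below uses the confidence-interval form of the two one-sided tests: a pair counts as practically equivalent when the 90% interval for the difference lies entirely inside plus or minus the margin. This is not JMP's implementation. It uses a normal approximation instead of the t distribution, and the `diff` and `se` values are made up for the example.

```python
from itertools import combinations
from statistics import NormalDist


def is_equivalent(diff: float, se: float, margin: float, alpha: float = 0.05) -> bool:
    """True if the (1 - 2*alpha) interval for diff lies inside +/- margin."""
    z = NormalDist().inv_cdf(1 - alpha)  # normal approximation, not a t quantile
    lower, upper = diff - z * se, diff + z * se
    return -margin < lower and upper < margin


# All pairwise comparisons among samples 1 to 5
pairs = list(combinations(range(1, 6), 2))
print(len(pairs))

# Hypothetical differences (in um) and standard errors, margin 0.2 as in the post
close_pair = is_equivalent(diff=0.05, se=0.02, margin=0.2)
far_pair = is_equivalent(diff=0.19, se=0.02, margin=0.2)
print(close_pair, far_pair)
```

The second pair fails even though its difference is below 0.2, because its interval crosses the margin.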

To get the statistics on the distributions, click on the Distribution icon in JMP (or go to Analyze > Distribution), then cast um into Y and Sample number into By. If you then click on the red hot button next to Summary Statistics, you can customize what you see. In this case, I have also selected Minimum and Maximum to be in the summary statistics. If you hold down the CTRL key while doing that, it will "broadcast" the choice to all the other distributions (one for each sample) and you'll get all the data you need. If you want to analyze things further, you can right-click the data in the Summary Statistics and select Make Into Combined Data Table. Very useful. Below is an example for sample number 1.

I'm attaching the original data table and scripts so you can see what I did.

Out of curiosity, if you can share, what NanoScope Analysis are you using (instrumentation)? I'm a nano-physicist.

Hope this helps!

DS

Kate · Jan 27, 2022 06:12 AM

Hi, @SDF1 ,

Thank you very much for your help! Thanks for your script and explanations. Very helpful! The tricks you shared are nice! I like them.

Sorry I didn't describe my request clearly, which caused confusion. I have tried to describe it better in my answer to Martin below. Yes, I also work in micro and nano technologies. It's an AFM measurement result. Glad to meet you here!!

martindemel · Jan 26, 2022 02:40 PM

Hi Kate, I'm not sure I understand what you are trying to do. There are some things I need clarified:

1. There are two tables attached which are completely identical (in both name and content; I checked with Compare Data Tables in JMP).

2. You are talking about top and bottom. What do you mean by that? If I look at the distributions of um and %, I see that for each sample um is distributed almost uniformly, and % has many values at or close to 0, with single values up to about 18. I cannot see two different distributions in a sample. What am I missing?

3. What do you mean by histogram in %?

I guess with this information (and a check that the data is correct) we can work on your questions and on how to calculate what you need.

/****NeverStopLearning****/

martindemel · Jan 26, 2022 02:54 PM

OK, I think I didn't look at the picture carefully enough. So % has two peaks over the course of um, and you would like to get the positions of the peaks, or some information about the spread of the data around the first peak (bottom) and the same for the values around the second peak (top).

With height, you want to get the mean um value for the top and bottom distributions and compare the 5 samples' top positions with each other, as well as the 5 bottom positions.

The distribution of um alone looks very uniform, because the measurements across um seem to be equidistant and therefore each bin holds a similar number of values. Distributions are for one variable, whereas you are using two variables in relationship. I'll take a second look at it and come back to you soon.

/****NeverStopLearning****/

martindemel · Jan 26, 2022 03:21 PM

So my best bet for this analysis is Fit Curve with a peak model such as Gaussian Peak. You can find it under Analyze > Specialized Modeling > Fit Curve. Use um as X and % as Y, put the sample number in as Group, and press OK. Then, in the hotspot, choose Gaussian Peak under Peak Models. This fits five peak models, and in the hotspot of the Gaussian Peak outline you then have Compare Parameter Estimates, Equivalence Test and so forth. It will concentrate on the first peak, so if you want to do this for the bottom and the top separately, you first need to divide the um data into two regions; 0.75, for example, seems to be a good separator. Then you can use the new column as a By group.

Hope this helps. I attached the data table with some scripts, which should give you a starting point. From there you can probably carry on by yourself.
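The idea of splitting the um axis at a separator and treating each side on its own can be sketched outside JMP as well. The Python example below is not the Fit Curve Gaussian peak fit. It swaps in a simpler technique, frequency-weighted moments, to get a location and spread for the bottom and top regions. The data values are made up, and the 0.75 separator is the one suggested above.

```python
from math import sqrt


def weighted_stats(z, pct):
    """Frequency-weighted mean and standard deviation of z, with pct as weights."""
    total = sum(pct)
    mean = sum(zi * pi for zi, pi in zip(z, pct)) / total
    var = sum(pi * (zi - mean) ** 2 for zi, pi in zip(z, pct)) / total
    return mean, sqrt(var)


def split_stats(z, pct, separator=0.75):
    """Return ((bottom_mean, bottom_sd), (top_mean, top_sd)) split at separator."""
    bottom = [(zi, pi) for zi, pi in zip(z, pct) if zi < separator]
    top = [(zi, pi) for zi, pi in zip(z, pct) if zi >= separator]
    return weighted_stats(*zip(*bottom)), weighted_stats(*zip(*top))


# Hypothetical histogram for one sample: z position in um, frequency in %
z = [0.2, 0.3, 0.4, 1.1, 1.2, 1.3]
pct = [5.0, 10.0, 5.0, 4.0, 8.0, 4.0]

(bottom_mean, bottom_sd), (top_mean, top_sd) = split_stats(z, pct)
print(bottom_mean, top_mean, top_mean - bottom_mean)
```

Because the weights are percentages rather than counts, the standard deviations describe the spread of each peak only. A standard error for the mean would also need the number of underlying measurements, which the histogram alone does not provide.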

/****NeverStopLearning****/

Kate · Jan 27, 2022 05:46 AM

Hi, Martin,

It was very nice to learn how to use Fit Curve and the peak model! It automatically takes care of the headache of noise, which is very helpful. Many thanks! I'm sorry that I didn't describe my request clearly and caused confusion.

I still face difficulties, because in the end I need to compare the heights and the statistics of the height in µm. The height of each sample can be defined as the difference between the z position of the bottom peak (in um) and the z position of the top peak (in um). The height should also have a confidence interval, which is deduced from its corresponding top and bottom distributions, correct? I don't know how to do this.
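One common way to read this definition is simple propagation of uncertainty: if each peak location comes with a standard error (for example from a fitted peak model), the standard error of the height is the root sum of squares of the two. The Python sketch below assumes the two location estimates are independent and approximately normal. All numbers and names are hypothetical, and this is not taken from JMP.

```python
from math import sqrt
from statistics import NormalDist


def height_ci(bottom_loc, bottom_se, top_loc, top_se, level=0.95):
    """Height = top - bottom, with a normal-approximation confidence interval.

    Assumes the two location estimates are independent.
    """
    height = top_loc - bottom_loc
    se = sqrt(bottom_se ** 2 + top_se ** 2)    # uncertainties add in quadrature
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return height, height - z * se, height + z * se


# Hypothetical peak locations (um) and their standard errors for one sample
height, lower, upper = height_ci(0.30, 0.004, 1.20, 0.003)
print(round(height, 4), round(lower, 4), round(upper, 4))
```

If the top and bottom estimates are correlated, for instance because they come from one joint fit, a covariance term would have to be added and this interval would no longer be right.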

I think the difficulty in the problem is that the input data is a histogram which includes the z position (in um) and the frequency in %. The frequency indicates how the z positions are distributed. The corresponding physical task is to analyze and compare the heights of binary (step) samples. That's why the height is calculated as the z position difference between top and bottom.

Putting the % column into Freq in Analyze > Distribution seems to work, but I still can't get the height statistics or compare the sample heights in JMP.

One of the outputs I need is a parameter comparison plot like the one below (example copied from your solution), but comparing the height in µm instead of the Peak Value (I know you just gave that as an example). Thanks for your help!

martindemel · Jan 27, 2022 5:13 AM

So you want to get the um locations of the peaks and compare them instead of the peak values? You probably need to use formula columns to extract the peak locations first and then do what has been described before. I'm not sure if that is exactly what you are looking for, but I added two columns to the data table.

Other scripts for detecting peaks in a distribution can be found by searching the community for "peak" and looking through the resulting threads, e.g. this one: https://community.jmp.com/t5/Discussions/find-multiple-peak-values-in-a-column/m-p/62919#M33840
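For readers who want to see the logic of extracting peak locations without JMP formula columns, here is a minimal Python sketch. It is not the script from the linked thread. It simply marks a point as a peak when its % value is higher than its left neighbour, at least as high as its right neighbour, and above a threshold. The `min_height` parameter and the data are assumptions made for the example.

```python
def find_peaks(z, pct, min_height=0.0):
    """Return (z, pct) pairs where pct is a local maximum of at least min_height."""
    peaks = []
    for i in range(1, len(pct) - 1):
        is_local_max = pct[i] > pct[i - 1] and pct[i] >= pct[i + 1]
        if is_local_max and pct[i] >= min_height:
            peaks.append((z[i], pct[i]))
    return peaks


# Hypothetical histogram for one sample: z position in um, frequency in %
z = [0.1, 0.2, 0.3, 0.4, 0.5, 1.0, 1.1, 1.2, 1.3]
pct = [0.5, 4.0, 12.0, 3.0, 0.2, 0.3, 6.0, 15.0, 2.0]

peaks = find_peaks(z, pct, min_height=1.0)
bottom_z, top_z = peaks[0][0], peaks[-1][0]
print(peaks, top_z - bottom_z)
```

On noisy real data this simple rule will pick up spurious maxima, so smoothing or a fitted peak model is usually a safer way to get the locations.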

/****NeverStopLearning****/

Kate · Jan 27, 2022 01:42 PM

@martindemel, Thanks a lot!

Kate · Jan 28, 2022 12:11 PM

@martindemel, After applying your method to more data, I find it provides more information than my initial target asked for. The parameters from the fitted model can serve as metrics for quality control. That's fantastic, thanks!
