
Kolmogorov-Smirnov Test for a given Distribution

Hello,

I want to use the Kolmogorov-Smirnov test to determine which distribution a set of data follows. The idea is to see whether the data follow a Weibull, Gaussian, or Rayleigh distribution.


I thought about creating a column that labels the tested data and the data drawn from a given distribution, but that would mean filling the column by hand.

Do you have any idea how to use the built-in Kolmogorov-Smirnov function to compare the data with a given distribution?

Thanks in advance

5 REPLIES
Victor_G
Super User

Re: Kolmogorov-Smirnov Test for a given Distribution

Hi @estelle-ippon,

 

Welcome to the Community!

 

To identify the distribution of your data, you can use the Distribution platform: in the red triangle menu for the variable, choose "Continuous Fit" and select either all distributions or only specific ones:

Victor_G_0-1752656770600.png

If you have selected several distributions, or all of them, you will first see a comparison of the fitted distributions based on the AICc criterion:

Victor_G_1-1752656870433.png
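For readers who want to see what an AICc comparison involves, here is a minimal sketch in Python. It uses the usual textbook definition, AICc = -2 ln L + 2k + 2k(k+1)/(n-k-1); whether JMP computes its reported value in exactly this way is an assumption. The helper names `aicc` and `normal_loglik` are made up for this illustration.

```python
import math

def aicc(log_likelihood, k, n):
    """Corrected AIC for a model with k fitted parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return -2.0 * log_likelihood + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def normal_loglik(data):
    """Maximized log-likelihood of a normal fit (maximum likelihood mean and variance)."""
    n = len(data)
    mu = sum(data) / n
    var = sum((x - mu) ** 2 for x in data) / n  # ML variance, divides by n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

if __name__ == "__main__":
    heights = [59.0, 61.0, 55.0, 66.0, 52.0, 60.0, 61.0, 63.0]  # made-up values
    # A normal fit has k = 2 parameters (mean and standard deviation).
    print(aicc(normal_loglik(heights), k=2, n=len(heights)))
```

Lower AICc indicates a better trade-off between fit and number of parameters, so the candidate distributions can be ranked by this value.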

If you want a statistical test, select the distribution you want to test, and in that distribution's red triangle menu select "Goodness of Fit":

Victor_G_2-1752656942995.png

For continuous fits, the goodness-of-fit test is the Anderson-Darling test (see: Fit Distributions).
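As a rough illustration of what the Anderson-Darling statistic measures, the sketch below computes A² for a sample against a fully specified CDF using the standard formula. This is not JMP's implementation: the function name `anderson_darling` is hypothetical, and JMP's reported p-values depend on the fitted distribution, which this sketch does not attempt to reproduce.

```python
import math
from statistics import NormalDist

def anderson_darling(data, cdf):
    """Anderson-Darling statistic A^2 of `data` against the callable `cdf`."""
    xs = sorted(data)
    n = len(xs)
    eps = 1e-12  # clamp to avoid log(0) for extreme observations
    f = [min(max(cdf(x), eps), 1.0 - eps) for x in xs]
    # With 0-based i, the weight (2i - 1) of the textbook formula becomes (2i + 1).
    s = sum((2 * i + 1) * (math.log(f[i]) + math.log(1.0 - f[n - 1 - i]))
            for i in range(n))
    return -n - s / n

if __name__ == "__main__":
    sample = [59.0, 61.0, 55.0, 66.0, 52.0, 60.0, 61.0, 63.0]  # made-up values
    fitted = NormalDist.from_samples(sample)
    print(anderson_darling(sample, fitted.cdf))
```

Compared with Kolmogorov-Smirnov, this statistic gives more weight to the tails of the distribution, where F(x) is close to 0 or 1.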

 

If you want to compare two distributions using the Kolmogorov-Smirnov test, the Oneway Analysis platform in the Fit Y by X menu may help (see: The Oneway Platform Options).

 

Hope this answer will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

Re: Kolmogorov-Smirnov Test for a given Distribution

Hi Victor,

Thank you for your answer.

The thing is, I specifically want to use the K-S test to see whether my data fit a Gaussian, Weibull, or Rayleigh distribution. Could you tell me a little more about the Oneway analysis?

Do you have any idea how to implement such a test?

Victor_G
Super User

Re: Kolmogorov-Smirnov Test for a given Distribution

Hi @estelle-ippon,

 

You could technically do what you want, but it requires some preliminary work, and you will need to specify the distribution quite precisely.

You have to prepare your data table first:

  1. Create a formula column that generates random values from the distribution you want to test (in my case, I am using the Big Class Families dataset and creating a new column with a Weibull random formula to compare with the "Height" column):
    Victor_G_1-1752658574029.png
  2. Stack the generated data with the column you want to test, adding a label column to distinguish "real data" from "simulated data":
    Victor_G_2-1752658605953.png
  3. Use the Fit Y by X platform, with the data as Y and the label (real/simulated) as X:
    Victor_G_3-1752658664119.png
  4. In the red triangle menu, under "Nonparametric", choose "Kolmogorov-Smirnov Test" to compare your simulated data with the real data:
    Victor_G_4-1752658741051.png

    The statistics for this test will then be displayed (see: Nonparametric Test Reports).
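Outside JMP, the comparison in the steps above amounts to a two-sample Kolmogorov-Smirnov test. Here is a minimal Python sketch using only the standard library. The function names, the Weibull parameters, and the stand-in "measured" data are all assumptions made for illustration, and the p-value uses the large-sample Kolmogorov approximation, so it is only rough for small samples.

```python
import math
import random
from bisect import bisect_right

def kolmogorov_sf(lam):
    """Survival function of the limiting Kolmogorov distribution."""
    if lam < 0.2:
        return 1.0  # the series converges poorly here; the true value is ~1
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, 101))
    return min(max(2.0 * total, 0.0), 1.0)

def ks_two_sample(x, y):
    """Return (D, approximate p-value) for the two-sample K-S test."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    # The two empirical CDFs only change at observed values, so check those.
    d = max(abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
            for v in xs + ys)
    return d, kolmogorov_sf(math.sqrt(n * m / (n + m)) * d)

if __name__ == "__main__":
    random.seed(1)
    # Hypothetical stand-in for the measured column (e.g. "Height").
    measured = [random.gauss(62.0, 4.0) for _ in range(40)]
    # Simulated column: weibullvariate(scale, shape), parameters made up.
    simulated = [random.weibullvariate(64.0, 16.0) for _ in range(40)]
    d, p = ks_two_sample(measured, simulated)
    print(f"D = {d:.3f}, approximate p = {p:.3f}")
```

Note that the simulated column is itself random, so the result changes from one simulation to the next; using a large simulated sample reduces that noise.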

 

Hope this answer will help you,


Re: Kolmogorov-Smirnov Test for a given Distribution

Thank you very much for your answer; I agree that it would be a good approach.

On the other hand, do you have any idea how to implement the test if we don't know the parameters of the distribution? I don't think I'll be able to create the column if I don't have them.

Do you think it's feasible? If not, I'll have to create the test by hand, I guess...

Victor_G
Super User

Re: Kolmogorov-Smirnov Test for a given Distribution

In that case, you could use the first option I described: use the Distribution platform to estimate the parameters of the distribution you intend to test. Then create a random formula column using that distribution with the estimated parameters, and follow the procedure from my second reply to run the K-S test comparing the simulated data with the measured data.

 

But I think the Anderson-Darling test available in the Distribution platform is the most direct option, and very close to what you intend to do.
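If the goal is a K-S comparison against a fitted distribution, one alternative to simulating a column is to compare the data directly with the theoretical CDF (a one-sample K-S statistic). The sketch below is an illustration in Python, not a JMP feature; all function names are hypothetical. It relies on the fact that a Rayleigh distribution with parameter sigma is a Weibull distribution with shape 2 and scale sigma * sqrt(2). One caveat: when the parameters are estimated from the same data, the standard K-S critical values no longer apply (the test becomes conservative), which is the issue the Lilliefors variant addresses for the normal case.

```python
import math
from statistics import NormalDist

def ks_one_sample(data, cdf):
    """K-S statistic D between the empirical CDF of `data` and `cdf`."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def weibull_cdf(shape, scale):
    """Return the CDF of a Weibull distribution as a callable."""
    def cdf(x):
        return 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0
    return cdf

def rayleigh_cdf(sigma):
    """Rayleigh(sigma) is Weibull with shape 2 and scale sigma * sqrt(2)."""
    return weibull_cdf(2.0, sigma * math.sqrt(2.0))

if __name__ == "__main__":
    sample = [59.0, 61.0, 55.0, 66.0, 52.0, 60.0, 61.0, 63.0]  # made-up values
    fitted = NormalDist.from_samples(sample)  # parameters estimated from the data
    d = ks_one_sample(sample, fitted.cdf)
    # 1.358 / sqrt(n) is the large-sample 5% critical value for a fully
    # specified distribution; with estimated parameters it is too lenient.
    print(d, 1.358 / math.sqrt(len(sample)))
```

The same `ks_one_sample` call works with `weibull_cdf(shape, scale)` or `rayleigh_cdf(sigma)` once estimates for those parameters are available, for example from the Distribution platform.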

