Share your ideas for the JMP Scripting Unsession at Discovery Summit by September 17th. We hope to see you there!
Choose Language Hide Translation Bar
Highlighted
DS
DS
Level VI

How to save the T2 outlier formula as Python, C, or use in Excel?

Hi JMP Community,

 

  Is there a way to save the full formula for the T^2 outlier analysis formula as Python, C or Excel equivalent formula? I would like to save it for use with other colleagues who need this test, but don't run JMP. I have JMP Pro 15.

 

  Another option would be, what is the full mathematical formula that is contained in the JSL that performs the Vec Quadratic(S, X) calculation. This could then be implemented in other programs, at least this is my thinking.

 

Thanks!,

DS

2 ACCEPTED SOLUTIONS

Accepted Solutions
Highlighted

Re: How to save the T2 outlier formula as Python, C, or use in Excel?

Here is how JMP implements Vec Quadratic in Python as part of its support for scoring code generation. This code fragment is from the jmp_score.py file distributed with JMP Pro.

 

import numpy as np

def vec_diag(M):
    return np.diag(M).reshape(M.shape[0], 1)

def vec_quadratic(S, X):
    if S.shape[1] == X.shape[0]:
        return vec_diag(np.dot(X.T, np.dot(S, X)))
    return vec_diag(np.dot(X, np.dot(S, X.T)))

You can call Python code from Excel using VBA and libraries such as xlwings:

https://www.xlwings.org/

https://towardsdatascience.com/how-to-supercharge-excel-with-python-726b0f8e22c2

 

View solution in original post

Highlighted
DS
DS
Level VI

Re: How to save the T2 outlier formula as Python, C, or use in Excel?

Hi @nascif_jmp,

 

  Thanks for your input and ideas on this. Although I have no doubt that would work, due to IT restrictions and difficulties with implementing the xlwings with Excel, I ended up going about it a different way.

 

  I actually learned quite a bit about how Excel can do matrix math (not so straight forward), but doable.

 

I ended up going with a bit of a hybrid approach:

First, I used the PCA platform in JMP to calculate the T2 values and saved that formula to the data table in order to get the inverse covariance matrix, S, as well as the mean and standard deviation vectors in order to standardize the data in excel.

 

After that, it wasn't too hard to just do the matrix math of X*S*X` in order to get the T2 value for a new entry of the data table. In order to get that to work, I did have to use some temporary arrays in Excel to do the matrix math. Unfortunately, this must be done step-wise in Excel, so it's not very efficient. Nonetheless, it does calculate the correct T2 values for new entries when comparing it to JMP.

 

  So, I learned about the MMULT command in Excel. Step-wise, you have to do: MMULT(X,S), which I called XS, a new 1xm array. Then, to get T2, just need to do the function call MMULT(XS,X`), where X` is the transpose of X.

 

  I chose to do the hybrid approach because the S matrix and the mean and standard deviation arrays won't change that often since they were generated using the training data for the model, and don't need to be updated with each new data row coming in. That makes doing the matrix math a lot easier in Excel. I figure that we can always update the Excel file with the correct matrices whenever we re-train the model and generate a new T2 formula in JMP.

 

  A good learning experience, and it works, but it's not as efficient or easy as JMP!

 

Thanks!,

DS

View solution in original post

5 REPLIES 5
Highlighted
txnelson
Super User

Re: How to save the T2 outlier formula as Python, C, or use in Excel?

Are you asking if one can get from JMP, the code that JMP uses to generate T^2? I assume that SAS Inst. would consider that code as proprietary and intellectual property.
Jim
Highlighted
DS
DS
Level VI

Re: How to save the T2 outlier formula as Python, C, or use in Excel?

I am not sure why the formula would be proprietary as it's based on the Mahalanobis distance, and is just matrix math.

 

The Vec Quadratic(S,X) formula gets the diagonal elements of the matrix computation X*S*X' where S is a symmetric inverse covariance matrix and X is a rectangular matrix with equal number of columns as S, and X' is the transpose of X. Supposedly Vec Quadratic(S,X) is the same calculation as Vec Diag(X*S*X')

 

In my case, the predictor columns I have make up the X matrix -- so let's say I have 8 predictors, so X = [8x1] matrix. S is the inverse covariance matrix that can be calculated from the PCA or Multivariate platforms. The S matrix would be an 8x8 matrix that doesn't change. However, with each new entry (row) for the 8 X-predictors, it would be nice to calculate in Excel the T2 value.

 

So, the math formula of how JMP deals with it is not so hidden. What I'm having a hard time with is how to go from that to an equivalent formula representation in Excel so that other colleagues who aren't running JMP can calculate the T2 statistic for the modeling I'm doing. I've never attempted to use Excel for matrix math, so I don't know how to go about implementing the equivalent calculation in Excel.

 

I think their functional call forms simplify the math, but one should be able to program in JSL the same thing, but explicitly lay out all the steps. Once that's obtained, making the equivalent in excel shouldn't be too hard. At least, this is what I'm thinking.

 

Thanks,

DS

Highlighted
txnelson
Super User

Re: How to save the T2 outlier formula as Python, C, or use in Excel?

I am pretty sure that one could write the code in JSL to do the T^2 calculation, but, JMP's code that produces the T^2 is written, I assume, in C++, or C, not in JSL.
Jim
Highlighted

Re: How to save the T2 outlier formula as Python, C, or use in Excel?

Here is how JMP implements Vec Quadratic in Python as part of its support for scoring code generation. This code fragment is from the jmp_score.py file distributed with JMP Pro.

 

import numpy as np

def vec_diag(M):
    return np.diag(M).reshape(M.shape[0], 1)

def vec_quadratic(S, X):
    if S.shape[1] == X.shape[0]:
        return vec_diag(np.dot(X.T, np.dot(S, X)))
    return vec_diag(np.dot(X, np.dot(S, X.T)))

You can call Python code from Excel using VBA and libraries such as xlwings:

https://www.xlwings.org/

https://towardsdatascience.com/how-to-supercharge-excel-with-python-726b0f8e22c2

 

View solution in original post

Highlighted
DS
DS
Level VI

Re: How to save the T2 outlier formula as Python, C, or use in Excel?

Hi @nascif_jmp,

 

  Thanks for your input and ideas on this. Although I have no doubt that would work, due to IT restrictions and difficulties with implementing the xlwings with Excel, I ended up going about it a different way.

 

  I actually learned quite a bit about how Excel can do matrix math (not so straight forward), but doable.

 

I ended up going with a bit of a hybrid approach:

First, I used the PCA platform in JMP to calculate the T2 values and saved that formula to the data table in order to get the inverse covariance matrix, S, as well as the mean and standard deviation vectors in order to standardize the data in excel.

 

After that, it wasn't too hard to just do the matrix math of X*S*X` in order to get the T2 value for a new entry of the data table. In order to get that to work, I did have to use some temporary arrays in Excel to do the matrix math. Unfortunately, this must be done step-wise in Excel, so it's not very efficient. Nonetheless, it does calculate the correct T2 values for new entries when comparing it to JMP.

 

  So, I learned about the MMULT command in Excel. Step-wise, you have to do: MMULT(X,S), which I called XS, a new 1xm array. Then, to get T2, just need to do the function call MMULT(XS,X`), where X` is the transpose of X.

 

  I chose to do the hybrid approach because the S matrix and the mean and standard deviation arrays won't change that often since they were generated using the training data for the model, and don't need to be updated with each new data row coming in. That makes doing the matrix math a lot easier in Excel. I figure that we can always update the Excel file with the correct matrices whenever we re-train the model and generate a new T2 formula in JMP.

 

  A good learning experience, and it works, but it's not as efficient or easy as JMP!

 

Thanks!,

DS

View solution in original post