Robust Regression in Standard JMP 18, Powered by Python Integration

This post shows how to combine robust regression techniques with JMP 18, which now integrates with Python.

 

Simple Case Study

 

Here's a sample data table.

 

DaeYun_Kim_1-1713317381435.png

 

  • The data table contains 99 rows of simulated data
  • Data was designed to illustrate some types of outliers that are difficult to detect

Can you detect outliers in this sample data table?

Initially, we can examine the data through a univariate approach.

 

DaeYun_Kim_2-1713317856854.png

DaeYun_Kim_3-1713317867928.png

The univariate distributions do not reveal any outliers.

However, once we look at two variables together, outliers become visible.

 

 

DaeYun_Kim_0-1713335974481.png

DaeYun_Kim_1-1713335989387.png

 

 

This case is manageable because the outliers can be spotted graphically.

But what if we are dealing with many variables, say more than 10?
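One common way to screen for outliers that hide in the marginal distributions is the Mahalanobis distance, which measures how far a row lies from the center of the data after accounting for the correlation between columns. The sketch below is only an illustration under stated assumptions: the data are made up (they are not the 99-row table shown above), the helper name `mahalanobis_2d` is hypothetical, and it handles two columns so that it runs with the standard library alone. With more columns the same idea applies with the full covariance matrix.

```python
import statistics

def mahalanobis_2d(points):
    """Return the Mahalanobis distance of each (x, y) point from the sample mean."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(points)
    # Sample variances and covariance (divisor n - 1)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    dists = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2x2 covariance matrix
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        dists.append(d2 ** 0.5)
    return dists

# Made-up data: 20 points close to the line y = x
points = [(float(i), i + (0.3 if i % 2 else -0.3)) for i in range(1, 21)]
# One planted point whose x and y are each unremarkable on their own
points.append((5.0, 15.0))

dists = mahalanobis_2d(points)
flagged = dists.index(max(dists))
print("Row with the largest distance:", flagged, points[flagged])
```

The planted point sits well inside the range of both columns, so a univariate check would not flag it, yet it has the largest distance because it breaks the relationship between the two columns.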

 

Robust Median Regression using Anscombe data table

 

To mitigate the impact of outliers, we can employ robust regression techniques like median regression.

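Median regression fits the conditional median of the response rather than the conditional mean. It minimizes the sum of absolute deviations instead of the sum of squared deviations, which makes it resistant to outliers in the response and to violations of the homoscedasticity and normality assumptions.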
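To see why minimizing absolute deviations gives robustness, it helps to look at the simplest case with no predictors at all. The sketch below uses made-up numbers and hypothetical helper names. It searches a grid of candidate values and shows that the absolute-deviation criterion is minimized at the median, while the squared-deviation criterion is minimized at the mean, which a single extreme value drags away.

```python
import statistics

def total_absolute_deviation(values, center):
    """Sum of |v - center|: the criterion median regression minimizes."""
    return sum(abs(v - center) for v in values)

def total_squared_deviation(values, center):
    """Sum of (v - center)^2: the criterion least squares minimizes."""
    return sum((v - center) ** 2 for v in values)

# Made-up sample with one extreme value
data = [5.0, 6.0, 7.0, 8.0, 90.0]

# Candidate centers from 0.0 to 100.0 in steps of 0.1
grid = [i / 10 for i in range(0, 1001)]

best_abs = min(grid, key=lambda c: total_absolute_deviation(data, c))
best_sq = min(grid, key=lambda c: total_squared_deviation(data, c))

print("Minimizer of absolute deviations:", best_abs)
print("Sample median:", statistics.median(data))
print("Minimizer of squared deviations:", best_sq)
print("Sample mean:", statistics.mean(data))
```

Median regression applies the same absolute-deviation criterion to the residuals of a regression line, so one outlying response value has limited pull on the fitted coefficients.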

 

In JMP Pro, you can fit a median regression from the Fit Model platform: select the "Generalized Regression" personality, choose the "Quantile Regression" distribution, and set the quantile to 0.5.

 

The following shows the result of median regression applied to the "Anscombe" sample data table in JMP.

 

DaeYun_Kim_6-1713319613183.png

DaeYun_Kim_2-1713336036957.png

 

DaeYun_Kim_3-1713336055321.png

 

If you are using standard JMP 18 rather than JMP Pro 18, you can still run median regression through JMP 18's new Python integration.

 

import jmp
import jmputils

# Checking installed packages
# jmputils.jpip('list')

dt = jmp.open(jmp.SAMPLE_DATA + "anscombe.jmp")

# Optional: inspect the table (uncomment to run)
# print(dt.name)
# print(f'Number of columns: {dt.ncols}')
# print(dt['y3'])

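# Install Packages (uncomment and run once if the packages are missing)
# jmp.run_jsl('''
# Python Install Packages("statsmodels")
# ''')
#
# jmp.run_jsl('''
# Python Install Packages("matplotlib")
# ''')
#
# scikit-learn is needed for the LinearRegression import below
# jmp.run_jsl('''
# Python Install Packages("scikit-learn")
# ''')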

# Import Modules

import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Reshape Data
x = np.reshape(dt['x3'],(-1,1))
y = np.reshape(dt['y3'],(-1,1))
print(x)
print(y)


# Mean Regression vs Median Regression

lr = LinearRegression()
lr.fit(x, y)
beta_ls = lr.intercept_, lr.coef_[0]

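q = 0.5  # the 0.5 quantile is the median
# prepend is an argument of add_constant (intercept column first), not of fit()
rm = sm.QuantReg(y, sm.add_constant(x, prepend=True)).fit(q=q)
beta_md = rm.params[0], rm.params[1]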

# Print Mean & Median Regression Coefficient

print(beta_ls)
print(beta_md)

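The beta coefficients are printed to the Log window, as shown below. The first line is the mean (least squares) regression and the second line is the median regression; each is given as (intercept, slope).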

 

(array([3.00245455]), array([0.49972727]))

(4.009997266957748, 0.34500026527269356)
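If you want to cross-check these numbers without JMP or statsmodels, the sketch below recomputes both fits with the standard library only. It rests on two assumptions: the x3 and y3 values are hard-coded from the published Anscombe quartet (set III) and are assumed to match the columns in JMP's sample table, and the median regression is found by brute force rather than by the algorithm statsmodels uses. The brute-force search relies on the fact that, with one predictor, a line minimizing the sum of absolute residuals can always be found among the lines passing through two of the data points. The function names are hypothetical.

```python
from itertools import combinations

# Anscombe set III, hard-coded (assumed to match columns x3 and y3 in JMP)
x3 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]

def least_squares(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def median_regression(x, y):
    """Brute-force least absolute deviations fit; returns (intercept, slope)."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(abs(b - (intercept + slope * a)) for a, b in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

beta_ls = least_squares(x3, y3)
beta_md = median_regression(x3, y3)
print("Mean regression (intercept, slope):  ", beta_ls)
print("Median regression (intercept, slope):", beta_md)
```

The median regression line follows the ten points that lie almost exactly on a straight line, while the least squares line is pulled toward the single outlying point at x3 = 13.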

 

For reference, the report below shows the mean regression coefficients calculated in JMP.

You can compare them with the mean and median regression coefficients printed above.

 

DaeYun_Kim_9-1713320546026.png

 

Concluding Thoughts

 

This demonstration covers the basics of how JMP and Python interact in JMP 18.

Harnessing the power of Python integration within JMP presents an exciting opportunity to elevate your analytical prowess, enabling you to broaden your expertise and achieve more robust insights.

 

 
 
 
 
 
 
 
Last Modified: Apr 17, 2024 9:01 AM
Comments
Paul_Nelson
Staff

Reformatted here to make it easier to read, copy & paste.  

import jmp 
# from jmputils import jpip
# jpip('install', 'statsmodels scikit-learn matplotlib numpy pandas')

dt = jmp.open(jmp.SAMPLE_DATA + "anscombe.jmp") 

# Import Modules 
import pandas as pd 
import statsmodels.api as sm 
import matplotlib.pyplot as plt 
import numpy as np 
from sklearn.linear_model import LinearRegression 

# Reshape Data 
x = np.reshape(dt['x3'],(-1,1)) 
y = np.reshape(dt['y3'],(-1,1)) 
print(x) 
print(y) 

# Mean Regression vs Median Regression 
lr = LinearRegression() 
lr.fit(x, y) 
beta_ls = lr.intercept_, lr.coef_[0] 
q = 0.5 
rm = sm.QuantReg(y, sm.add_constant(x, prepend=True)).fit(q=q)
beta_md = rm.params[0], rm.params[1] 

# Print Mean & Median Regression Coefficient 
print(beta_ls) 
print(beta_md)

 

lala
Level VIII
  • Community still not offering jmp 18 trial?

DaeYun_Kim
Staff

@lala  Apologies for any inconvenience caused. Currently, the JMP 18 Trial isn't accessible. However, rest assured, we're diligently working to make a new trial version of JMP available very soon. Thank you for your patience and understanding.