Steve_Kim
Level IV

How can I perform Machine Learning Modeling using JMP18 Python Script?

Hello,
I am working on creating a LightGBM model using a JMP Python script (JMP18).
My current setup is as follows:
  - X variables: "train1.csv" file, columns D1 through D1776
  - Y variable: "y1.csv" file, "Activity" column

  - Prediction target "Activity": "test1.csv" file, columns D1 through D1776
  - Note: The original file, "train_original.csv" (columns Activity through D1776), comes from the Kaggle.com competition "Predicting a Biological Response".


I would appreciate guidance on the following issue:
- I cannot fit a LightGBM model in the JMP18 Python environment.

I've included my JMP Python script below for reference.
Thank you in advance for your time! : )

 

 

 

import jmp
import jmputils

# Update pip (the Python package installer) and setuptools, then install the required packages
jmputils.jpip('install --upgrade', 'pip setuptools')
jmputils.jpip('install', 'pandas numpy scikit-learn keras lightgbm')

# Import package
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)



# Load data
train1 = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/train1.csv')
y1 = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/y1.csv')
test1 = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/test1.csv')

train1.head() # Shows nothing in JMP, but that's okay; the data table can be opened with 'jmp.open' instead
y1.head() # Shows nothing in JMP, but that's okay; the data table can be opened with 'jmp.open' instead

# train1 = jmp.open('D:/steve.kim/Kaggle/BI Biological Response/train1.csv')
# y1 = jmp.open('D:/steve.kim/Kaggle/BI Biological Response/y1.csv')



# 01 Modeling training - Library import
from lightgbm import LGBMClassifier

# 02 Modeling training - LGBM Baseline model without hyperparameter tuning
lgb = LGBMClassifier()

# 03 Modeling training - define X (factors) and Y (responses) variables
lgb.fit(train1, y1) # Raises lightgbm.basic.LightGBMError: Length of labels differs from the length of #data

# 04 Predict
predslgb = lgb.predict_proba(test1)
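
The LightGBMError on the fit() line says that the number of labels in y1 does not match the number of rows in train1. A quick check before fitting can show which side is off. The following is a minimal sketch, not part of the original script: check_row_counts is a hypothetical helper, and plain lists stand in for the DataFrames so the snippet runs without the CSV files.

```python
def check_row_counts(X, y):
    """Return the shared row count, or raise ValueError if X and y differ."""
    n_x, n_y = len(X), len(y)
    if n_x != n_y:
        raise ValueError(
            f"X has {n_x} rows but y has {n_y}; "
            "fit() needs exactly one label per feature row"
        )
    return n_x


# Stand-in data (hypothetical); the real script would pass train1 and y1.
features = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
labels = [1, 0, 1]

print(check_row_counts(features, labels))
```

Since len() of a pandas DataFrame is its row count, the same call should work as check_row_counts(train1, y1). If the counts differ, comparing how each CSV was exported (for example, whether both files have a header row) may be a reasonable next step.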

 

2 ACCEPTED SOLUTIONS

Steve_Kim
Level IV

Re: How can I perform Machine Learning Modeling (LGBM) using JMP18 Python Script?

Oh... 'head()' works fine once I treat the script editor as a terminal environment rather than a Jupyter notebook!

 

y1 = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/y1.csv')
print(y1.head()) 

With print(), the result shows up in the embedded log!

So I think I can resolve the rest as well.

 

   Activity
0         1
1         1
2         1
3         1
4         1
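
The behaviour described above is general to Python run as a script: a bare expression such as y1.head() is evaluated and then discarded, whereas a notebook cell echoes its last expression. The sketch below illustrates this with the standard library only. The script string and variable names are made up for illustration, and it does not use JMP itself.

```python
import contextlib
import io

# A tiny stand-in script: one bare expression, one explicit print.
script = """
values = [1, 2, 3]
values[:2]          # bare expression: evaluated, then discarded
print(values[:2])   # explicit print: written to standard output
"""

# Run the script the way a script runner would, capturing standard output.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(compile(script, "<script>", "exec"))

captured = buffer.getvalue()
print(repr(captured))  # only the print() call produced output
```

Only the print() line contributes to the captured text, which matches what the embedded log showed for head().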

 


Steve_Kim
Level IV

Re: How can I perform Machine Learning Modeling using JMP18 Python Script?

<Self Answer>

Sorry about this question!

Python pandas and LightGBM work well in a JMP18 Python script! : )
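
The script below calls jmputils.jpip('install', ...) on every run. One possible way to skip that step when everything is already present is to test whether each module can be found first. This is a sketch under assumptions: missing_modules is a hypothetical helper, the module names in the demo are placeholders, and the jpip call is left as a comment because jmputils is only available inside JMP.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Demo with one standard-library module and one placeholder that should not exist.
wanted = ["json", "hypothetical_missing_module_xyz"]
print(missing_modules(wanted))

# Inside JMP, one might then guard the install step (assumption, not tested here):
# if missing_modules(["pandas", "numpy", "sklearn", "lightgbm"]):
#     jmputils.jpip('install', 'pandas numpy scikit-learn lightgbm')
```

Note that the import name can differ from the package name passed to pip: scikit-learn is imported as sklearn.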

 

 

import jmp
import jmputils

# Update pip (the Python package installer) and setuptools, then install the required packages
jmputils.jpip('install --upgrade', 'pip setuptools')
jmputils.jpip('install', 'pandas numpy scikit-learn keras lightgbm')

# List installed packages and their versions
jmputils.jpip('list')


# Import package
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# List the files in the data directory
import os
print(os.listdir("D:/steve.kim/Kaggle/BI Biological Response"))

# Load data
train_x = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/train_x.csv')
print(train_x.head())

train_y = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/train_y.csv')
print(train_y.head())  

test_x = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/test_x.csv')
print(test_x)


# Modeling
from lightgbm import LGBMClassifier  
lgb = LGBMClassifier(colsample_bytree=0.6, subsample=0.8)
lgb.fit(train_x, train_y)
preds_lgb = lgb.predict_proba(test_x)

sub_lgb = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/test_y.csv')
sub_lgb["Activity"] = preds_lgb[:,1]
print(sub_lgb.head())
test_answer = pd.read_csv('D:/steve.kim/Kaggle/BI Biological Response/test_answer.csv')

# Evaluation
from sklearn.metrics import mean_absolute_error, mean_squared_error
print('LightGBM MAE')
print(mean_absolute_error(test_answer, sub_lgb))
print('LightGBM MSE')
print(mean_squared_error(test_answer, sub_lgb))
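
The evaluation above passes whole DataFrames to the metric functions. With their default settings, the scikit-learn metrics average the error uniformly over all columns, so any column other than "Activity" in those files (whether there is one is not known from this post) would be included in the score. As a hedged illustration of what the two numbers measure on a single column, here is a plain-Python sketch. The mae and mse helpers and the sample values are hypothetical, not taken from the competition data.

```python
def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up 0/1 labels and predicted probabilities for the positive class.
activity_true = [1, 0, 1, 1]
activity_prob = [0.9, 0.2, 0.6, 0.4]

print("MAE:", mae(activity_true, activity_prob))
print("MSE:", mse(activity_true, activity_prob))
```

In the script itself, assuming both files carry an "Activity" column, the same restriction could be written as mean_absolute_error(test_answer["Activity"], sub_lgb["Activity"]). When the predictions are probabilities and the labels are 0/1, the MSE is also known as the Brier score.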


