New column based on the studentized residual outliers - if missing then create a copy+paste original column
I saved my studentized residuals, grouped the columns, and ran an outlier screening as below:
Explore Outliers(
Y( Column Group( "Studentized Group" ) ),
Robust Fit Outliers( K Sigma( 2.5 ), Huber( 1 ) )
);
obj << Change to Missing(
Now, if there are any missing values in the studentized columns, I need to create a new column from the corresponding Y variable that keeps all the original data except the rows flagged as outliers in the studentized residual.
There is not just one variable; there can be many, so the solution needs to be general.
I imagine this needs a formula or script that reads the text after "Studentized Resid" in the column name to find the original column with that name, then creates a new column from it with the value set to missing wherever the studentized residual is missing.
Re: New column based on the studentized residual outliers - if missing then create a copy+paste original column
Hi @Bmllnn,
The platform can create a new 'Culled' formula column in which the outlier values are removed.
I'm using the 'Diamond' data set from the sample data folder as an example. If I run the Robust Fit Outliers platform, I can press 'Formula Column' and it will create a new column with the outliers removed.
You can see in the diamonds data set below that the values have been removed:
If you wanted to script it, you can do it like this (I added a menu option to pick the column to make it more flexible):
dt = Current Data Table();
cols = dt << Get Column Names( Numeric, "Continuous" );

// Run the Outlier platform and create the new column.
// Defined before the window so it exists when the button is clicked.
Outlier_Expr = Expr(
	nw << Close Window; // close the initial menu
	obj = dt << Explore Outliers(
		// The list selection is not read here; replace with the studentised column name
		Y( "Studentized Resid Price" )
	);
	obj << Robust Fit Outliers( K Sigma( 2.5 ), Huber( 1 ) );
	Wait( 2 ); // allow time for the platform to run and perform calculations
	obj << Formula Columns( Suffix( "Culled" ) );
);

// Select column to make the script applicable to any column name
nw = New Window( "Test",
	V List Box(
		Text Box( "Select your residual column" ),
		List Box( cols ),
		Button Box( "Ok", Outlier_Expr )
	)
);
Re: New column based on the studentized residual outliers - if missing then create a copy+paste original column
Hello,
Thanks for your answer. What I currently have is this:
obj << Studentized Residuals;
dt << Clear Selection;
colList = dt << Get Column Names( String );
For( i = 1, i <= N Cols( dt ), i++,
If( Contains( colList[i], "Studentized" ),
Column( colList[i] ) << Set Selected( 1 )
)
);
dt << group columns("Studentized Group");
// ANOTHER script
Names Default To Here(1);
dt=Current Data Table();
dt << Explore Outliers(
Y( Column Group("Studentized Group" ) ),
Robust Fit Outliers( K Sigma( 2.5 ), Huber( 1 ) ),
SendToReport(
Dispatch( {"Robust Fit Outliers"}, "", TabListBox, {Set Selected( 2 )} ),
Dispatch( {"Robust Fit Outliers"}, "", Tab Page Box( 2 ),
{Title( "Outliers by Cell" )}
),
Dispatch( {"Robust Fit Outliers"}, "", List Box( 9 ),
{Padding( {Left( 12 ), Top( 0 ), Right( 0 ), Bottom( 0 )} )}
),
Dispatch( {"Robust Fit Outliers"}, "", TextBox,
{Padding( {Left( 2 ), Top( 6 ), Right( 0 ), Bottom( 8 )} )}
),
Dispatch( {"Robust Fit Outliers"}, "", Panel Box( 3 ),
{Padding( {Left( 0 ), Top( 0 ), Right( 12 ), Bottom( 0 )} )}
),
Dispatch( {"Robust Fit Outliers"}, "", Lineup Box( 3 ), {Spacing( 1 )} ),
Dispatch( {"Robust Fit Outliers"}, "", Button Box( 11 ),
{Margin( {Left( 1 ), Top( 1 ), Right( 1 ), Bottom( 1 )} )}
),
Dispatch( {"Robust Fit Outliers"}, "", Button Box( 12 ),
{Margin( {Left( 1 ), Top( 1 ), Right( 1 ), Bottom( 1 )} )}
),
Dispatch( {"Robust Fit Outliers"}, "", Lineup Box( 4 ), {Spacing( 1 )} ),
Dispatch( {"Robust Fit Outliers"}, "", Button Box( 13 ),
{Margin( {Left( 1 ), Top( 1 ), Right( 1 ), Bottom( 1 )} )}
),
Dispatch( {"Robust Fit Outliers"}, "", Button Box( 14 ),
{Margin( {Left( 1 ), Top( 1 ), Right( 1 ), Bottom( 1 )} )}
)
)
);
That is all working. From here I need to select the outliers and apply the If formula in a new column for the studentized residuals (as you mentioned). After that, however, I need to create a new column for each parameter where an outlier was found, so that the real data set has a copy of that column without the outlier.
Re: New column based on the studentized residual outliers - if missing then create a copy+paste original column
@Bmllnn ,
If I'm understanding correctly, you have a group of columns that are studentized residuals from several models. You want to run Robust Fit Outliers and then create a copy of each individual studentised column with the outlier rows removed?
If so, you just need to click 'Formula Column' and it will generate the new columns for you, or use the JSL below:
Names Default To Here( 1 );
dt = Current Data Table();
dt << Clear Selection; // make sure only the studentized columns end up in the group
colList = dt << Get Column Names( String );
For( i = 1, i <= N Cols( dt ), i++,
	If( Contains( colList[i], "Studentized" ),
		Column( colList[i] ) << Set Selected( 1 )
	)
);
dt << Group Columns( "Studentized Group" );

// Run the outliers
obj = Explore Outliers(
	Y( Column Group( "Studentized Group" ) )
);
obj << Robust Fit Outliers( K Sigma( 2.5 ), Huber( 1 ) ); // run the robust fit
Wait( 2 ); // wait to allow time for platform to run and perform calculations
// Create each column from the original studentised, with outliers removed
obj << Formula Columns( Suffix( "Culled" ) );
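If what you need is a culled copy of each original Y column rather than of the studentized columns, something like the sketch below might be a starting point. It is untested against your table and rests on a few assumptions: each residual column is named "Studentized Resid " followed by the exact name of its Y column, the outliers in those residual columns have already been changed to missing, and the " Culled" suffix for the new columns is just a placeholder name. The variables prefix, suffix, origName, residCol, origCol and newCol are my own names, not anything the platform creates.

```jsl
Names Default To Here( 1 );
dt = Current Data Table();

// Assumption: residual columns are named "Studentized Resid <Y name>"
prefix = "Studentized Resid ";
// Placeholder suffix for the new columns
suffix = " Culled";

// Snapshot of the column names before any new columns are added
colList = dt << Get Column Names( String );

For( i = 1, i <= N Items( colList ), i++,
	If( Starts With( colList[i], prefix ),
		// Text after the prefix is taken to be the original column name
		origName = Substr( colList[i], Length( prefix ) + 1 );
		// Skip residual columns that have no matching original column
		If( Contains( colList, origName ) > 0,
			residCol = Column( dt, colList[i] );
			origCol = Column( dt, origName );
			// A new numeric column starts with every row missing
			newCol = dt << New Column( origName || suffix, Numeric, Continuous );
			For( r = 1, r <= N Rows( dt ), r++,
				// Copy the original value unless the residual is missing
				If( !Is Missing( residCol[r] ),
					newCol[r] = origCol[r]
				)
			);
		);
	)
);
```

This writes static values rather than a formula, so the new columns will not update if the residual columns change later. A row whose residual is missing for any other reason (for example, the Y value itself was missing) also ends up missing in the new column.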