I run into problems when I run my model with some parameters, so I want to do some data cleaning before running the model. Is there a quick way to delete columns with the same values? Please advise how to do it with JSL, or any other way that may fit.
I'm not sure I understand you correctly.
For modelling you can select just the columns that you think are important, so there is no need to delete any columns.
Do you want to delete rows where a column has the same value?
If so, the code below does that. Otherwise please describe in more detail what you need; an example would be very helpful.
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
Wait( 3 );
// select every row whose :age value already appeared in an earlier row, then delete them
dt << Select Duplicate Rows( Match( :age ) );
dt << Delete Rows;
This one deletes duplicate columns (columns whose values are identical to those of an earlier column):
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
// make some copies of columns
dt << New Column( "weight_copy", set each value( :weight ) );
dt << New Column( "height_copy", set each value( :height ) );
to_delete_lst = {};
For( i = 1, i < N Col( dt ), i++,
	For( j = i + 1, j <= N Col( dt ), j++,
		// only compare columns of the same data type
		If( (Column( dt, i ) << get data type) == (Column( dt, j ) << get data type),
			// are all cells equal?
			If( Mean( (Column( dt, i ) << get values) == (Column( dt, j ) << get values) ) == 1,
				Print( (Column( dt, i ) << get name) || " equal to " || (Column( dt, j ) << get name) );
				Insert Into( to_delete_lst, j );
			);
		);
	)
);
// show the list of duplicate columns
Show( to_delete_lst );
Wait( 3 );
dt << select columns( to_delete_lst );
dt << delete columns;
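If "columns with the same values" instead means columns in which every row holds one and the same value (constant columns, which often cause trouble in model fitting), a minimal sketch could look like the one below. It assumes that Associative Array() accepts a column reference and keeps one key per distinct value, and that Delete Columns accepts a list of column names; the variable names constant_cols and uniq are made up for illustration. How missing values are counted has not been checked here, so please try it on a copy of your table first.

```jsl
Names Default To Here( 1 );
dt = Current Data Table();
constant_cols = {};
For( i = 1, i <= N Col( dt ), i++,
	// assumption: one key per distinct value in the column
	uniq = Associative Array( Column( dt, i ) );
	If( N Items( uniq ) <= 1,
		Insert Into( constant_cols, Column( dt, i ) << get name )
	);
);
// show which columns were found to be constant
Show( constant_cols );
// delete only if something was found
If( N Items( constant_cols ) > 0,
	dt << delete columns( constant_cols )
);
```

This makes a single pass over the columns instead of comparing every pair, so it should stay quick on wide tables.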
I'm not sure either :)
This means you need to be more specific about what you need.
If you simply want to explore the values of a large table, you may want to use the Columns Viewer:
Cols > Columns Viewer
There you can easily compare the columns of a large table.
Here is a different take on how to do this. It uses RSquare as an indicator of matching data.
Names Default To Here( 1 );
dt = Current Data Table();
colNames = dt << get column names( continuous );
// fit every continuous column against every other continuous column
obj = dt << Response Screening(
	Y( Eval( colNames ) ),
	X( Eval( colNames ) )
);
dtCalc = obj << get PValues;
// drop pairs that are not a perfect fit, and pairs of a column with itself
dtCalc << select where( Round( :RSquare, 4 ) != 1 | :X == :Y );
Try( dtCalc << delete rows );
dtCompare = dtCalc << subset( selected rows( 0 ), columns( "X", "Y" ) );
Close( dtCalc, nosave );
// order each pair so that (A, B) and (B, A) collapse into one row
For Each Row( dtCompare,
	If( :X > :Y,
		hold = :X;
		:X = :Y;
		:Y = hold;
	)
);
dtFinal = dtCompare << Summary(
	Group( :X, :Y ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);
Close( dtCompare, nosave );
Hi nelson,
Thanks for the advice. But Response Screening takes far too much processing time, so it is not ideal. After running for 45 s it was not even halfway through.