Apologies for how basic my question is, I am just starting to use JMP for the first time.
I have a massive dataset (14 columns x 20,000 rows) and I am trying to write a script that will identify and delete all rows that that do not have either a 0 or 1 for a specific range of columns (columns 4-14). This dataset is from an even larger dataset, so there are some rows in my subset that are completely empty. Here's an example of what I was hoping to do:
- - - - - 0 0 1 - 0 - - - - - 1 - - - -
0 0 1 - 0 1 - - - -
I've started writing the code, but I'm stuck–I'm not sure what to do after the For Each Row. I was hoping that this script would look at one row at a time, and if a particular cell did not have a 0 or a 1, it would look for a 0 or a 1 in the next column, and if it got to the end of the row with no 0s or 1s, then it would delete that entire row. Here is my code so far:
dt=Open("File Name") colnames = dt << get column names( numeric,string ); For( i = 4, i <= N Items( colnames ), i++, For Each Row(If(:colnames != 0 | :colnames != 1), Continue() ) );
I was thinking about using
<< Delete Rows
, which I've seen on some other questions on this forum. If this is too complicated with a script, is there a better way to do this?
Does this need to be a script?
If you find just one row that is missing those specific cells, select the missing cells.
Now go to the Rows menu and select Row Selection > Select Matching Cells.
Now every row that is missing data in those columns will be selected. Choose Rows > Delete Rows.
I think you need to look at the Missing Data Pattern platform:
Tables=>Missing Data Pattern
From your descripting, it sounds like it will meet your needs. It will analyze the columns you specify and then from the data table it produces, you can select the columns will all missing values, and then delete them.
or if you want to script it, this is my suggestion:
dt << New Column("Selected", Character, Nominal, Formula( If(Is Missing(:Column4) | Is Missing(:Column5)|...until column14, "Yes", "No" ) )); dt << Select Where(:Selected == "Yes"); dt << Delete Rows;
This script will take care of your request. It is not entirely unique of the other replies.
Names Default to Here( 1 ); // get reference to target data table dt = Current Data Table(); // get all the columns for missing value check cols = dt << Get Column Names; // obtain missing value patterns missing = Current Data Table() << Missing Data Pattern( Columns( Eval( cols ) ), Output Table( "Missing Data Pattern" ) ); // make character string representing missing values in all columns n missing = N Items( cols ); all missing = Repeat( "1", n missing ); // find the row of patterns where all columns are missing. missing << Select Where( :Patterns == all missing ); // delete the linked rows in the original table dt << Delete Rows; // clean up Close( missing, No Save );
As shown by earlier replies, there are often many ways to perform a task in JMP.
If all columns 4-14 are numeric, this script should also work to delete rows with missing values only.
dt = Current Data Table(); dt << Delete Rows(Loc(Is Missing(V Mean(dt[0, 4 :: 14]`))));