Discussions

steven_w_daniel · Jun 9, 2023 4:23 PM

I have a table that looks like the table below:

operator	M type	description	description 2	M1	M2	M3	M4	M5	M6
415	Nxyz	A hot	Hot clean 1	415	415	415	415	415	415
615	Nxyz	B cold	Cold clean 1	615	615	615		615	615
815	Nxyz	A hot	Hot clean 2	815	815	815	815	815	815
1015	Nxyz	B cold	Cold clean 2	1015		1015	1015	1015	1015
1215	Nxyz	A hot	Hot clean 1	1215	1215	1215	1215	1215	1215
1415	Nxyz	B cold	Cold clean 1	1415	1415	1415	1415	1415
1615	Nxyz	A hot	Hot clean 2	1615	1615	1615	1615	1615	1615
1815	Nxyz	B cold	Cold clean 2	1815	1815	1815			1815

I want to select only the rows that is empty. Since the columns M1-M6 can vary each time I pull the data I can't use the column names in the JSL script. I'm trying to use a FOR loop below to select the rows of empty data, but this only selects the rows where the last cell is empty.

dt = Current Data Table();
nc = dt << get column names( Numeric );
For( i = 1, i <= N Items( nc ), i++,
dt <<select where( Is Missing( nc[] )));

Eventually I want to get the output table to look like the following

operator	M type	description	description 2	Machine
615	Nxyz	B cold	Cold clean 1	M4
1015	Nxyz	B cold	Cold clean 2	M2
1415	Nxyz	B cold	Cold clean 1	M6
1815	Nxyz	B cold	Cold clean 2	M4
1815	Nxyz	B cold	Cold clean 2	M5

The big trick to this is to do this without coding in the column names for (M1,M2 etc...) as they are going to change.

Any Ideas

Jeff_Perkinson · Jul 2, 2021 2:01 PM

If I'm reading your message correctly, I think you've actually got two questions.

How to find the rows that are all missing for a given set of columns.
How to make your code not dependent on the names of those columns and how many there are.

Not surprisingly there are lots of different ways to do 1. PMroz , MS and Craige@JMP have given you some options there.

I'll throw in one more. You could use the Missing Data Pattern utility to find them.

dt = Open( "$SAMPLE_DATA\Cities.jmp" );

//delete some columns from this table to make it fit your example

dt << delete columns( "PM10", "X", "Y" );

//change name of data table to ensure this modified version doesn't get saved 
//over the sample data

dt << set name( "CitiesModified" );

//I want to use the columns starting at column 9, 
//so I build a list of references to them

mycols = {};

For( i = 9, i <= N Cols( dt ), i++,
	Insert Into( mycols, Column( dt, i ) )
);

mdp = Dt << Missing Data Pattern( columns( Eval( mycols ) ), Output Table( "Missing Data Pattern" ) );

//The patterns columns shows the pattern of missing values
//the Missing Data Pattern data table is sorted, 
//so if there are rows with all missing data
//the last row of the Missing Data Pattern data table will be all 1s.
//The Repeat() function just builds a string of 1s for as many columns as I'm working with

If( Column( mdp, "Patterns" )[N Rows( mdp )] == Repeat( "1", N Items( mycols ) ), 

//The Missing Data Pattern data table is linked to the source table so selecting a row
	//in it, selects all the rows that it represents in the source table

	mdp << Select Rows( N Rows( mdp ) );

//now I can delete the rows in the source table

	dt << delete rows;
,
	Show( "No all missing rows" )
);

I've tried to show how to make this code work for any number of columns. In this case, because I needed a list of columns to pass to the Missing Data Pattern utility, I built a list based on the column numbers.

Even having written all of this and having shown how to use the Missing Data Pattern Utility, I think my favorite solution here would be PMroz's to use Stack Columns. You really want to end up with the data stacked anyway, so doing that first makes it easy to simply select the missing rows in the stacked column and delete them.

dt = Open( "$SAMPLE_DATA\Cities.jmp" );

//delete some columns from this table to make it fit your example

dt << delete columns( "PM10", "X", "Y" );

//change name of data table to ensure this modified version doesn't get saved 
//over the sample data

dt << set name( "CitiesModified" );

//again I need a list of columns to use in the Stack()

mycols = {};

For( i = 9, i <= N Cols( dt ), i++,
	Insert Into( mycols, Column( dt, i ) )
);

//stack those columns creating a new data table

stacked_dt = dt << Stack(
	columns( Eval( mycols ) ),
	Source Label Column( "Gas" ),
	Stacked Data Column( "PPM" )
);

missing_rows = stacked_dt << get rows where( Is Missing( :PPM ) );

stacked_dt << delete rows( missing_rows );

-Jeff

View solution in original post

pmroz · Jun 25, 2015 08:10 AM

You can use the STACK command to stack your data, including the values of M1-Mn . Then select rows that are empty and delete them.

Or try this code:

dt = Current Data Table();
col_list = dt << get column names( Numeric, string );
 
where_clause = "";
 
For( i = 1, i <= N Items( nc ), i++,
     one_col = col_list[i];
     if (i == 1,
     // then
           where_clause =evalinsert("\[dt << select where(is missing(as column(dt, "^one_col^"))]\");
     // else
           ,
           where_clause = evalinsert("\[^where_clause^ | is missing(as column(dt, "^one_col^"))]\");
     );
);
 
where_clause = where_clause || ")";
eval(parse(where_clause));

Craige_Hales · Jul 2, 2021 2:02 PM

Same answer as PMroz using expression manipulation:

dt = Current Data Table();
e = Expr( Or() ); // the OR of the ismissing(columns) will be built here
nc = dt << get column names( Numeric );
For( i = 1, i <= N Items( nc ), i++,
    Insert Into( e, Eval Expr( Is Missing( Expr( nc ) ) ) )
);
dt << selectwhere( e );

the expression "e" looks like

nameexpr(e)
/*:
Is Missing( :m1 ) | Is Missing( :m2 ) | Is Missing( :m3 ) | Is Missing( :m4 ) | Is Missing( :m5 ) | Is Missing( :m6 )

Head Arg Recurse Formula

Introspection

Craige

pmroz · Jun 25, 2015 09:54 AM

Very slick. Didn't know you could insert into an expression (e). I thought insert into needed a list.

ms · Jul 2, 2021 2:03 PM

Yet another way to select rows with empty cells:

dt << select where(!Contains(Loc Nonmissing(dt << get as matrix), Row()));

pmroz · Jun 25, 2015 03:39 PM

MS that's even slicker - now let's see you do it in 0 lines of code! Ha ha just goes to show there are several ways to do the same thing in JSL.

Jeff_Perkinson · Jul 2, 2021 2:01 PM

If I'm reading your message correctly, I think you've actually got two questions.

How to find the rows that are all missing for a given set of columns.
How to make your code not dependent on the names of those columns and how many there are.

Not surprisingly there are lots of different ways to do 1. PMroz , MS and Craige@JMP have given you some options there.

I'll throw in one more. You could use the Missing Data Pattern utility to find them.

dt = Open( "$SAMPLE_DATA\Cities.jmp" );

//delete some columns from this table to make it fit your example

dt << delete columns( "PM10", "X", "Y" );

//change name of data table to ensure this modified version doesn't get saved 
//over the sample data

dt << set name( "CitiesModified" );

//I want to use the columns starting at column 9, 
//so I build a list of references to them

mycols = {};

For( i = 9, i <= N Cols( dt ), i++,
	Insert Into( mycols, Column( dt, i ) )
);

mdp = Dt << Missing Data Pattern( columns( Eval( mycols ) ), Output Table( "Missing Data Pattern" ) );

//The patterns columns shows the pattern of missing values
//the Missing Data Pattern data table is sorted, 
//so if there are rows with all missing data
//the last row of the Missing Data Pattern data table will be all 1s.
//The Repeat() function just builds a string of 1s for as many columns as I'm working with

If( Column( mdp, "Patterns" )[N Rows( mdp )] == Repeat( "1", N Items( mycols ) ), 

//The Missing Data Pattern data table is linked to the source table so selecting a row
	//in it, selects all the rows that it represents in the source table

	mdp << Select Rows( N Rows( mdp ) );

//now I can delete the rows in the source table

	dt << delete rows;
,
	Show( "No all missing rows" )
);

I've tried to show how to make this code work for any number of columns. In this case, because I needed a list of columns to pass to the Missing Data Pattern utility, I built a list based on the column numbers.

Even having written all of this and having shown how to use the Missing Data Pattern Utility, I think my favorite solution here would be PMroz's to use Stack Columns. You really want to end up with the data stacked anyway, so doing that first makes it easy to simply select the missing rows in the stacked column and delete them.

dt = Open( "$SAMPLE_DATA\Cities.jmp" );

//delete some columns from this table to make it fit your example

dt << delete columns( "PM10", "X", "Y" );

//change name of data table to ensure this modified version doesn't get saved 
//over the sample data

dt << set name( "CitiesModified" );

//again I need a list of columns to use in the Stack()

mycols = {};

For( i = 9, i <= N Cols( dt ), i++,
	Insert Into( mycols, Column( dt, i ) )
);

//stack those columns creating a new data table

stacked_dt = dt << Stack(
	columns( Eval( mycols ) ),
	Source Label Column( "Gas" ),
	Stacked Data Column( "PPM" )
);

missing_rows = stacked_dt << get rows where( Is Missing( :PPM ) );

stacked_dt << delete rows( missing_rows );

-Jeff

steven_w_daniel · Jun 26, 2015 02:54 AM

Thanks everyone.....Everything worked except the Loc Nonmissing.....I only have JMP9 at the moment (too lazy to upgrade) and that feature doesn't work in JMP9......I'm still a JSL newbie.....Thanks again

Discussions

Selecting cells with empty data

Re: Selecting cells with empty data

Re: Selecting cells with empty data

Re: Selecting cells with empty data

Re: Selecting cells with empty data

Re: Selecting cells with empty data

Re: Selecting cells with empty data

Re: Selecting cells with empty data

Re: Selecting cells with empty data

Recommended Articles