cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

Discussions

Solve problems, and share tips and tricks with other JMP users.
Choose Language Hide Translation Bar
SpannerHead
Level VI

Delete Rows Which Are Empty For All Columns in a List

Solved: Re: Remove rows with one or more empty values using JSL - Page 2 - JMP User Community

Based on the above discussion, I have a script to delete empty rows but it appears to delete those where they are empty in ANY of the listed columns, whereas I want to delete those where ALL are empty for the listed columns.

 

Thanks

 

// Get the current data table
dt = Current Data Table();

// Get all column names as strings
cnmes = dt << Get Column Names("string");

// Initialize Start and End indices
Start = .;
End = .;

// Find the index of the "Start" column
For(i = 1, i <= N Items(cnmes), i++,
    If(Contains(cnmes[i], "Start"),
        Start = i + 1;
        Break(); // Exit loop once found
    );
);

// Find the index of the "End" column
For(j = 1, j <= N Items(cnmes), j++,
    If(Contains(cnmes[j], "End"),
        End = j-1;
        Break(); // Exit loop once found
    );
);

datacols = {};

For(k = start, k <= End, k++,
    Insert into (datacols, cnmes[k]) // Determine Data Columns
);

keeprows = Loc Nonmissing( datacols );
dt << select rows( keeprows ) << invert Row Selection << delete rows;

 

 


Slán



SpannerHead
3 ACCEPTED SOLUTIONS

Accepted Solutions
txnelson
Super User

Re: Delete Rows Which Are Empty For All Columns in a List

Have you studied the JSL enough so you understand why it does not do what you need it to do?

 

Take a look in the Scripting Guide on Select Where

You will need to find all of the rows that have missing values for the first column in datacols then from that list of rows, select the rows that have missing values for the second column, then the thrid, etc. until you reach the last column in the datacols list.

The rows that remain after that, will  only those rows with missing values for all of the columns in datacols.

Then all you have to do is delete those remaining rows.

Look into the Current Selection() element of the Select Where() function.  It will restrict the each subsequent Select Where to only look at the previously selected rows when applying the selection.

 

Jim

View solution in original post

jthi
Super User

Re: Delete Rows Which Are Empty For All Columns in a List

I'm not exactly sure what is going on on your first part of script or how it is related to this question (to create list of the columns?) but you can this: using JMP platform (missing data pattern), select where (like Jim suggested) or for example with data table subscripting + loop / data table subscripting + some matrix operations.

 

First I would try with Missing Data platform JMP offers. See if it can be used for this (it can), after that look into the Select Where option Jim suggested. After you can script AND understand how Jim's suggestion works, then you can start looking into the data table subscription options I suggest below if you still want to.

 

 Missing data pattern 

Using JMP > Enter and Edit Your Data > Organize Data in Data Tables > View Patterns of Missing Data... 

Scripting Guide > Data Tables > Advanced Data Table Scripting > Find Missing Data Patterns

jthi_0-1745642082516.png

jthi_1-1745642092366.png

 

This uses matrix operations and I think it should work

 

Names Default To Here(1);

dt = New Table("Untitled",
	Add Rows(11),
	Compress File When Saved(1),
	New Column("Column 1", Character, "Nominal", Set Values({"", "", "aa", "", "a", "a", "", "a", "", "a", ""})),
	New Column("Column 2", Character, "Nominal", Set Values({"", "a", "a", "a", "a", "", "", "a", "a", "", ""})),
	New Column("Column 3", Character, "Nominal", Set Values({"a", "", "a", "", "a", "", "a", "", "", "", "a"}))
);

cols_to_check = {"Column 2", "Column 3"};

m = Matrix(IsMissing(dt[0, cols_to_check]));
rows_to_delete = Loc(V Sum(m`) == 2);
dt << select rows(rows_to_delete); // for demo purposes
dt << Delete Rows(rows_to_delete); // use this in real use case

 

This would loop over the rows while checking for missing values

rows_to_delete = {};
For Each Row(dt,
	m = Loc Nonmissing(dt[Row(), cols_to_check]);
	If(N Items(m) == 0,
		Insert Into(rows_to_delete, Row());
	);
);
dt << Select Rows(rows_to_delete);

 

 

-Jarmo

View solution in original post

jthi
Super User

Re: Delete Rows Which Are Empty For All Columns in a List

Does your data cols include what you are looking for? Also, you have to adjust the 2 to fit your data, dynamic option such as

rows_to_delete = Loc(V Sum(m`) == N Items(cols_to_check));

might be a good idea

-Jarmo

View solution in original post

8 REPLIES 8
txnelson
Super User

Re: Delete Rows Which Are Empty For All Columns in a List

Have you studied the JSL enough so you understand why it does not do what you need it to do?

 

Take a look in the Scripting Guide on Select Where

You will need to find all of the rows that have missing values for the first column in datacols then from that list of rows, select the rows that have missing values for the second column, then the thrid, etc. until you reach the last column in the datacols list.

The rows that remain after that, will  only those rows with missing values for all of the columns in datacols.

Then all you have to do is delete those remaining rows.

Look into the Current Selection() element of the Select Where() function.  It will restrict the each subsequent Select Where to only look at the previously selected rows when applying the selection.

 

Jim
SpannerHead
Level VI

Re: Delete Rows Which Are Empty For All Columns in a List

Jim

 

I believe I understand it, born to fail it was.  Your idea however is Nice!  Here's how that looks.

 


dt = Current Data Table();

// Get all column names as strings
cnmes = dt << Get Column Names("string");

// Initialize Start and End indices
Start = .;
End = .;

// Find the index of the "Start" column
For(i = 1, i <= N Items(cnmes), i++,
    If(Contains(cnmes[i], "Start"),
        Start = i + 1;
        Break(); // Exit loop once found
    );
);

// Find the index of the "End" column
For(j = 1, j <= N Items(cnmes), j++,
    If(Contains(cnmes[j], "End"),
        End = j-1;
        Break(); // Exit loop once found
    );
);

For(l = Start, l <= End, l++,
full_rows = dt << get rows where( !Is Missing( As Column(cnmes[l]) ) );
For Each Row(If(is missing(cnmes[i]), Remove From(full_rows))));

dt << Select Rows(full_rows) << Invert Row Selection << Delete Rows;

Slán



SpannerHead
txnelson
Super User

Re: Delete Rows Which Are Empty For All Columns in a List

Does this work?  I don't see how it would.  You are treating each column independently from each other.  You are resetting full_rows for each column.  

 

Here is an example of using multiple select where()_to narrow down the rows you want to select

names default to here(1);
dt = 
// Open Data Table: Big Class.jmp
// → Data Table( "Big Class" )
Open( "$SAMPLE_DATA/Big Class.jmp" );

// Select all males
dt << select where( :Sex == "M");

// From all of the selected males, select all of the 12 year olds
dt << select where( :age == 12, Current Selection("Restrict"));
Jim
jthi
Super User

Re: Delete Rows Which Are Empty For All Columns in a List

I'm not exactly sure what is going on on your first part of script or how it is related to this question (to create list of the columns?) but you can this: using JMP platform (missing data pattern), select where (like Jim suggested) or for example with data table subscripting + loop / data table subscripting + some matrix operations.

 

First I would try with Missing Data platform JMP offers. See if it can be used for this (it can), after that look into the Select Where option Jim suggested. After you can script AND understand how Jim's suggestion works, then you can start looking into the data table subscription options I suggest below if you still want to.

 

 Missing data pattern 

Using JMP > Enter and Edit Your Data > Organize Data in Data Tables > View Patterns of Missing Data... 

Scripting Guide > Data Tables > Advanced Data Table Scripting > Find Missing Data Patterns

jthi_0-1745642082516.png

jthi_1-1745642092366.png

 

This uses matrix operations and I think it should work

 

Names Default To Here(1);

dt = New Table("Untitled",
	Add Rows(11),
	Compress File When Saved(1),
	New Column("Column 1", Character, "Nominal", Set Values({"", "", "aa", "", "a", "a", "", "a", "", "a", ""})),
	New Column("Column 2", Character, "Nominal", Set Values({"", "a", "a", "a", "a", "", "", "a", "a", "", ""})),
	New Column("Column 3", Character, "Nominal", Set Values({"a", "", "a", "", "a", "", "a", "", "", "", "a"}))
);

cols_to_check = {"Column 2", "Column 3"};

m = Matrix(IsMissing(dt[0, cols_to_check]));
rows_to_delete = Loc(V Sum(m`) == 2);
dt << select rows(rows_to_delete); // for demo purposes
dt << Delete Rows(rows_to_delete); // use this in real use case

 

This would loop over the rows while checking for missing values

rows_to_delete = {};
For Each Row(dt,
	m = Loc Nonmissing(dt[Row(), cols_to_check]);
	If(N Items(m) == 0,
		Insert Into(rows_to_delete, Row());
	);
);
dt << Select Rows(rows_to_delete);

 

 

-Jarmo
hogi
Level XII

Re: Delete Rows Which Are Empty For All Columns in a List

is missing (list) and is missing (matrix) 
nice! - and unexpected.

hogi_0-1745653505288.png

hogi_1-1745653519976.png


is there a general overview which JSL functions can be "vectorized"?

SpannerHead
Level VI

Re: Delete Rows Which Are Empty For All Columns in a List

I thought the script worked but it has a flaw as you point out.  I tried @Jarmo's idea and it works but it doesn't populate the matrix for me.

// Get the current data table
dt = Current Data Table();

// Get all column names as strings
cnmes = dt << Get Column Names("string");

// Initialize Start and End indices
Start = .;
End = .;

// Find the index of the "Device 2" column
For(i = 1, i <= N Items(cnmes), i++,
    If(Contains(cnmes[i], "Device 2"),
        Start = i + 1;
        Break(); // Exit loop once found
    );
);

// Find the index of the "Process Script" column
For(j = 1, j <= N Items(cnmes), j++,
    If(Contains(cnmes[j], "start_date"),
        End = j-1;
        Break(); // Exit loop once found
    );
);

datacols = {};

For(k = start, k <= End, k++,
    Insert into (datacols, cnmes[k]) // Determine Data Columns
);

m = Matrix(IsMissing(dt[0, datacols]));
rows_to_delete = Loc(V Sum(m`) == 2);
dt << select rows(rows_to_delete); // for demo purposes
dt << Delete Rows(rows_to_delete); // use this in real use case

 

 


Slán



SpannerHead
jthi
Super User

Re: Delete Rows Which Are Empty For All Columns in a List

Does your data cols include what you are looking for? Also, you have to adjust the 2 to fit your data, dynamic option such as

rows_to_delete = Loc(V Sum(m`) == N Items(cols_to_check));

might be a good idea

-Jarmo
SpannerHead
Level VI

Re: Delete Rows Which Are Empty For All Columns in a List

Jarmo

 

That did the trick.  Was actually wondering the significance of the 2.

 

Thanks


Slán



SpannerHead

Recommended Articles