cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
Jemster
Level III

Deleting Columns with N Null Row Values

I have a table of values with each column being an instrument and each row being a time. Some instruments have not had data recorded at enough time values in order for me to fit a meaningful model.

I am attempting to use a loop to allow me to cycle through each column in turn and check the number of values.

I have some code which find the number of values for column 80.

col_a = Column(dt_2, 80);
val_mat = col_a << Get Values;
y = sum(!is missing(val_mat));

I am attempting to adapt this code into a loop for each column in turn and delete columns with too few values.

N_loop = N Cols(dt_2);
i_loop = 2;
While (i_loop <= N_loop,
	col_loop = Column(dt_2, i_loop),
	val_loop = col_loop << Get Values,
	y_loop = sum(!is missing(val_loop)),
	If(y_loop <= 8,
		dt_2 << Delete Columns( i_loop ),
	);
	i_temp+=1,
);

The error message says too many arguments.

2 ACCEPTED SOLUTIONS

Accepted Solutions
jthi
Super User

Re: Deleting Columns with N Null Row Values

Many ways to do this. One option is to use Col Number or Col N Missing and For Each (requires JMP16)

Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");
//delete some values
dt[1::20, "name"] = "";
dt[1::30, "sex"] = "";
dt[1::40, "height"] = .;

wait(2);

row_count = N Rows(dt);
For Each({col_name}, dt << Get Column Names("String"),
//  missing_vals = Col N Missing(Column(dt, col_name));
//	found_values = row_count - missing_vals;
	found_values = Col Number(Column(dt, col_name));
	If(found_values <= 8,
		dt << Delete Column(col_name);
	)
);
-Jarmo

View solution in original post

txnelson
Super User

Re: Deleting Columns with N Null Row Values

The reason there are too many arguments, is that, there is a second comma in the While() function.

The structure of a While() function is

While( testExpr, bodyExpr )

Note: there are only 2 arguments to the function.  The test

i_loop <= N_loop

and then the statements that are executed if the textExpr is true.  Your bodyExpr has commas after each statement, rather than semicolons.  Thus, making each statement a separate argument.

 

Please do not just change the commas to semicolons and run this code.  There is a logic error, which will not allow the code to ever end.  i_loop and N_loop never change in value.

 

There is also a faster way to check for the number of valid values in a column.  Additionally, if you are deleting columns, and your N_loop value remains fixed, N_loop is no longer the correct number of columns in the data table, once one column is deleted.  It is best to work from the last column to the first column in such cases.

Below is a rework of your script which will delete the columns where the number of valid data values is less than or equal to 8.

Names Default To Here( 1 );
dt_2 = Current Data Table();

N_loop = N Cols( dt_2 );

i_loop = 2;
While( N_loop > i_loop,
	If( Col Number( As Column( dt_2, N_loop ) ) <= 8,
		dt_2 << Delete Columns( N_loop )
	);
	N_loop -= 1;
);

P.S. @jthi solution is a better solution.   Your approach, with some corrections, does work.....so good job!

Jim

View solution in original post

2 REPLIES 2
jthi
Super User

Re: Deleting Columns with N Null Row Values

Many ways to do this. One option is to use Col Number or Col N Missing and For Each (requires JMP16)

Names Default To Here(1);
dt = Open("$SAMPLE_DATA/Big Class.jmp");
//delete some values
dt[1::20, "name"] = "";
dt[1::30, "sex"] = "";
dt[1::40, "height"] = .;

wait(2);

row_count = N Rows(dt);
For Each({col_name}, dt << Get Column Names("String"),
//  missing_vals = Col N Missing(Column(dt, col_name));
//	found_values = row_count - missing_vals;
	found_values = Col Number(Column(dt, col_name));
	If(found_values <= 8,
		dt << Delete Column(col_name);
	)
);
-Jarmo
txnelson
Super User

Re: Deleting Columns with N Null Row Values

The reason there are too many arguments, is that, there is a second comma in the While() function.

The structure of a While() function is

While( testExpr, bodyExpr )

Note: there are only 2 arguments to the function.  The test

i_loop <= N_loop

and then the statements that are executed if the textExpr is true.  Your bodyExpr has commas after each statement, rather than semicolons.  Thus, making each statement a separate argument.

 

Please do not just change the commas to semicolons and run this code.  There is a logic error, which will not allow the code to ever end.  i_loop and N_loop never change in value.

 

There is also a faster way to check for the number of valid values in a column.  Additionally, if you are deleting columns, and your N_loop value remains fixed, N_loop is no longer the correct number of columns in the data table, once one column is deleted.  It is best to work from the last column to the first column in such cases.

Below is a rework of your script which will delete the columns where the number of valid data values is less than or equal to 8.

Names Default To Here( 1 );
dt_2 = Current Data Table();

N_loop = N Cols( dt_2 );

i_loop = 2;
While( N_loop > i_loop,
	If( Col Number( As Column( dt_2, N_loop ) ) <= 8,
		dt_2 << Delete Columns( N_loop )
	);
	N_loop -= 1;
);

P.S. @jthi solution is a better solution.   Your approach, with some corrections, does work.....so good job!

Jim