cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

Discussions

Solve problems, and share tips and tricks with other JMP users.
Choose Language Hide Translation Bar
BabyDoragon
Level II

JSL Implementation for Deleting Columns with All Missing Values by Same Category

In the attached file, the field "Store" is divided into two stores, A and B. If all cells in the entire column corresponding to the same Store are missing values, then delete the entire column, as shown in the attached "Expected result". How can this be implemented using JSL?

 

I am able to count the missing values in each column, but I cannot find a way to verify if all the cells corresponding to the same Store are missing values. Do you have a good method? Since there are actually many columns—tens of thousands—using a time-consuming calculation method would take too long.

3 REPLIES 3

Re: JSL Implementation for Deleting Columns with All Missing Values by Same Category

Why are :Banana Price and :Durian Price kept?  All "A" :Banana Price is missing and all "B" :Durian Price is missing.

Thierry_S
Super User

Re: JSL Implementation for Deleting Columns with All Missing Values by Same Category

Hi,

Following your rule to the letter, I devised a non-scripting option. However, as @mmarchandFSLR pointed out, the outcome would not match your "Results" because the Banana and Durian would be excluded/deleted.

1) Stack all your columns (see example attached)

2) Use the formula:

Col Sum( Is Missing( :Data ), :Store, :Label ) / Col Sum( 1, :Store, :Label )

3) Add a master check column "ROWS TO DELETE" with the formula:

Col Max( :MISSING by STORE PCT, :Label )

4) Delete the marked rows

5) Split the resulting table by "Label."

Best,

TS

 

Thierry R. Sornasse
txnelson
Super User

Re: JSL Implementation for Deleting Columns with All Missing Values by Same Category

I suggest you create a table that looks like this.

txnelson_0-1765547956735.png

You can then read through the table and build a list of the columns that meet whatever N vs. N Missing for each store you need to delete.

Here is a sample script to create the table

names default to here(1);
dt=data table("Fruite Price");
colNames = dt << get column names(continuous);

dtMissing = dt << Summary( invisible,
	Group( :Store ),
	N Missing( colNames ),
	Freq( "None" ),
	Weight( "None" ),
	statistics column name format( "column" ),
	output table name( "N Missing" )
);

dtStack = dtMissing << Stack( invisible,
	columns(
		colNames
	),
	Source Label Column( "Product" ),
	Stacked Data Column( "N Missing" ),
	Output Table( "Stack of N Missing" )
);

dtEval = dtStack <<
Split(
	Split By( :Store ),
	Split( :N Missing, :N Rows ),
	Group( :Product ),
	Output Table( "Eval Table" ),
	Sort by Column Property
);

You could use the Tabulate Platform to create the table, but you indicated you have thousands of columns.  I don't believe Tabulate would be the answer in that case, however the above script should handle it.

Jim

Recommended Articles