cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Lharbicht
Level III

Perform Formula on Subset Columns

I have created a script to make a subset table of a larger table, but it isn't working correctly. My goal is to have the user select which columns they would like in the subset table, and then have the script perform a formula on the columns before it places the data into the new table. So far my script will pull up the dialogue box for the user interaction, but when I select the columns in the dialogue box, they aren't the ones that show up in the subset table. I'm not sure where my problem lays with that. (Note: the user will be selecting columns with data in them, rather than sample information, so in the example, columns with column names starting with "Column 10...".)

 

Additionally, I'm not sure where I would insert the formula. Basically the columns the user selects from the dialogue box will have some cells with < symbols, for the new subset table, I would like any cell without a less than symbol to turn to zero and any cell with a < symbol to have the < symbol removed placing only the number beside it in the subset table. So the new subset table with have zeros for detected values and the detection limit for the non detected compounds.

 

Here's my script so far.

Names Default To Here( 1 );

dt = Data Table("Example Data");

col_dlg = new window("Column Select",
	panelbox("Select columns to create NonDetect table with",
		col_clist = collistbox(all, width(200)),
		buttonbox("Ok",
			selected_column_list = col_clist << getselected;
			dt2 = dt << Subset(Output Table("NonDetect"),columns(col_clist));
			col_dlg << close window;
	)
	)
);

Thank you for any help!

 

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Perform Formula on Subset Columns

It turns out that the Substr() function does not work across the selected rows(not sure why), so I changed up the code a bit and tested it out(I obviously had not tested my last version)

Names Default To Here( 1 );

dt = Data Table( "Example Data" );

col_dlg = New Window( "Column Select",
	Panel Box( "Select columns to create NonDetect table with",
		col_clist = Col List Box( all, width( 200 ) ),
		Button Box( "Ok",
			selected_column_list = col_clist << getselected;
			dt2 = dt << Subset( Output Table( "NonDetect" ), columns( selected_column_list ) );
			// Loop across the columns in the new data table and convert them to numeric
			// Cells with nonnumeric values will be set to Missing Values, which then can
			// be identified and set to 0
			For( i = 1, i <= N Items( selected_column_list ), i++, 
				// Select all of the currently non selected columns and unselect the current ones
				dt2 << select where( Contains( As Column( dt2, selected_column_list[i] ), "<" ) == 0 );
				// Set for the target column all of the currently selected rows to "0"	
				Column( dt2, selected_column_list[i] )[dt2 << get selected rows] = "0";
				// Find the rows with the "<" in them
				dt2 << invert row selection;
				selRows = dt2 << get selected rows;
				//selRows= dt2 << Get Rows Where(Contains(As Column( dt2, selected_column_list[i] ),"<" )==1);
				// Get rid of the "<" making the value a valid numeric value
				For( k = 1, k <= N Rows( selRows ), k++,
					Column( dt2, selected_column_list[i] )[selRows[k]] = Substr( As Column( dt2, selected_column_list[i] )[selRows[k]], 2 )
				);
				// Convert the column to a numeric/continuous column
				Column( dt2, selected_column_list[i] ) << data type( numeric ) << modeling type( continuous );
				
			);
			col_dlg << close window;
		)
	)
);
Jim

View solution in original post

14 REPLIES 14
txnelson
Super User

Re: Perform Formula on Subset Columns

You were referencing the Display Object, col_clist, rather than the contents of it, which you had placed into Selected_column_list.

Names Default To Here( 1 );

dt = Data Table( "Example Data" );

col_dlg = New Window( "Column Select",
	Panel Box( "Select columns to create NonDetect table with",
		col_clist = Col List Box( all, width( 200 ) ),
		Button Box( "Ok",
			selected_column_list = col_clist << getselected;
			dt2 = dt << Subset( Output Table( "NonDetect" ), columns( selected_column_list ) );
			col_dlg << close window;
		)
	)
);
Jim
Lharbicht
Level III

Re: Perform Formula on Subset Columns

Ah, that works. 

 

Is there a way to then complete a formula on the data before its placed in the new table? If not, my thought was then to create new columns in the subset table, complete the calculation in the new columns and then delete the original columns (using script). Is there a more efficient way to do this?

 

Thanks!

txnelson
Super User

Re: Perform Formula on Subset Columns

What is the formula you want to calculate?  Does it really require the creation of a new column?  Remember, this isn't Excel.  To generate values or results, they do not have to be placed into columns....there are other structures.  Can you describe in more detail what you are looking to do?

Jim
Lharbicht
Level III

Re: Perform Formula on Subset Columns

I think that is where I have my trouble, I'm very proficent in excel and have been doing all of my data analysis in it for 9 years. I'm just trying to transfer over and I have a hard time with my if/then statements in jmp (among other things). I'll write it in terms of excel and maybe you could help?

 

I will be making two of these subset tables. One for detected values and one for non-detected values. For the detected values I would do an if/then statement such as =if(isnumber(J2),J2,0). For the nondetected values I would do something like =if(isnumber(J2),0,J2).

 

I was trying to switch it in JMP to be 

If(Contains(:Column 10 1,2,3,4,6,7,8-HpCDD,"<"),0,:Column 10 1,2,3,4,6,7,8-HpCDD)

but it doesn't seem to work even if I'm creating a column myself and inserting the formula.

 

So basically, in my script I will include two parts, one for the NonDetect table and one for the Detect table and I'd like the columns to be transformed as such, using the if then statement? I will in the end, send the two data tables to R to impute the non detect values using the DLs remaining in the NonDetect subset. I hope that explains it?

 

txnelson
Super User

Re: Perform Formula on Subset Columns

Here is a change in the script that I think you might find as a good approach.  What I interpret as what you want to do, is to create a numeric column that contains the values from the character columns you select, but if the values contain a "<" symbol, make those entries a zero(0).  The script below does that, without creating new columns.  It converts the columns to numeric, and sets the "<" cells to zero.

Names Default To Here( 1 );

dt = Data Table( "Example Data" );

col_dlg = New Window( "Column Select",
	Panel Box( "Select columns to create NonDetect table with",
		col_clist = Col List Box( all, width( 200 ) ),
		Button Box( "Ok",
			selected_column_list = col_clist << getselected;
			dt2 = dt << Subset( Output Table( "NonDetect" ), columns( selected_column_list ) );
			// Loop across the columns in the new data table and convert them to numeric
			// Cells with nonnumeric values will be set to Missing Values, which then can
			// be identified and set to 0
			For( i = 1, i <= N Items( selected_column_list ), i++,
				// Change the column to numeric/continuous
				// The names in the data table are very complex, containing symbols, etc.
				// that JMP will confuse as operators.  Therefore, using the Column()
				// function, or a :Name(" ") function to wrap around the column name
				// needs to be used
				Column( dt2, selected_column_list[i] ) << data type( numeric ) <<
				modeling type( continuous );
				// Now find all of the rows in the table that have Missing Values
				// in the target column, and then set them all to "0" in one 
				// statement
				Column( dt2, selected_column_list[i] )[dt << get rows where(
					Is Missing( As Column( dt2, selected_column_list[i] ) )
				)] = 0;
			);
			col_dlg << close window;
		)
	)
);

You could create new columns and set a formula as you had indicated in your last response.  The issue you are having with what you provided, is dealing with the complex column names. Here is how that would work

If( Contains( :Name( "Column 10 1,2,3,4,6,7,8-HpCDD" ), "<" ),
	0,
	Num( :Name( "Column 10 1,2,3,4,6,7,8-HpCDD" ) )
)
Jim
Lharbicht
Level III

Re: Perform Formula on Subset Columns

That works great for the Detect table, so for my second table I would like to switch the values that turn to zero so I don't think this approach will work. I would like to keep the cells that have < symbol and turn the rest of the cells to zero. Then I would like to remove the < symbol leaving only numeric values. In my head I imagine I would do the loop as you have it, then add in an if/else statement?

 

And thank you for posting the if/else statement. My problem was not having the num() before my column name for the else! 

txnelson
Super User

Re: Perform Formula on Subset Columns

Here is the code to handle your second case

Names Default To Here( 1 );

dt = Data Table( "Example Data" );

col_dlg = New Window( "Column Select",
	Panel Box( "Select columns to create NonDetect table with",
		col_clist = Col List Box( all, width( 200 ) ),
		Button Box( "Ok",
			selected_column_list = col_clist << getselected;
			dt2 = dt << Subset( Output Table( "NonDetect" ), columns( selected_column_list ) );
			// Loop across the columns in the new data table and convert them to numeric
			// Cells with nonnumeric values will be set to Missing Values, which then can
			// be identified and set to 0
			For( i = 1, i <= N Items( selected_column_list ), i++,
				// Select the rows where there is a "<" for the target column
				dt2 << Select where(Contains(As Column( dt2, selected_column_list[i] ),"<" )==1);
				// Get rid of the "<" making the value a valid numeric value
				Column( dt2, selected_column_list[i] )[dt2 << get selected rows )] = 
					substr(As Column( dt2, selected_column_list[i] ),1);
				// Select all of the currently non selected columns and unselect the current ones
				dt2 << invert row selection;
				// Set for the target column all of the currently selected rows to "0"	
				Column( dt2, selected_column_list[i] )[dt2 << get selected rows )] = "0";
				// Convert the column to a numeric/continuous column
				Column( dt2, selected_column_list[i] ) << data type( numeric ) <<
					modeling type( continuous );
			);
			col_dlg << close window;
		)
	)
);
Jim
Lharbicht
Level III

Re: Perform Formula on Subset Columns

The second code that you sent for the second table doesn't seem to be working. I've spent a few hours trying different things. I noticed that two ('s were missing. That helped the code to run without error, however the portion of code that removes the "<" symbol doesn't seem to operate properly. If I remove that bit of code the table ends up right with the detected values as 0 and the ND values still intact, with the < symbol. I've tried fiddling with it but haven't figured it out yet. Any clues?  :)


Thanks!

txnelson
Super User

Re: Perform Formula on Subset Columns

oops.....my error, change the "1" in the substr() function to a "2" and that should fix it

Jim