cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
KRT
KRT
Level II

Script to find and replace

I use a GCMS program which adds "saturated (####)" when a peak area count is saturated.  After I export these results from excel into a JMP data table, I would like to run a script which finds these occurrences and removes the "saturated ()" but leaves the number in numeric, continuous 12,0 format.

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Script to find and replace

Try this on your Area column

Names Default To Here( 1 );

dt = Current Data Table();

// Loop across all character columns and find those who have the pattern (###)
// and if found, then convert all columns

For( k = 1, k <= N Rows( dt ), k++,
	If( Contains( :Area[k], "(" ),
		:Area[k] = Word( 2, :Area[k], "()" )
	)
);
:Area << data type( numeric ) << modeling type( continuous ) << Format( "Fixed Dec", 12, 0 );
Jim

View solution in original post

7 REPLIES 7
txnelson
Super User

Re: Script to find and replace

Interactively, it is quite easy to use the Find capability to do a global replace of "(" and ")".  Then, using Standardize Attributes you can select all of the columns in question, and change them to numeric, continuous, format(Fixed Dec, 12,0)

Find is available at

     Edit=>Search=>Find

Standardize Attributes is available under the red triangle in the Columns Panel on the left side if the data table.

 

Here is a little script that will also do the trick, but with a large data table, it might be slow

Names Default To Here( 1 );

dt = Current Data Table();

// Get all character columns
colList = dt << get column names( character, string );

// Loop across all character columns and find those who have the pattern (###)
// and if found, then convert all columns
For( I = 3, i <= N Items( colList ), i++,
	If( Left( char(Column( dt, colList[i] )[1]), 1 ) == "(" & Right( char(Column( dt, colList[i] )[1]), 1 ) == ")",
		For( k = 1, k <= N Rows( dt ), k++,
			If( Left( (Column( dt, colLIst[i] )[k]), 1 ) == "(",
				Column( dt, colList[i] )[k] = Substr( Column( dt, colList[i] )[k] ,
				2)
			);
			If( Right( (Column( dt, colList[i] )[k]), 1 ) == ")",
				Column( dt, colList[i] )[k] = Word( 1, Column( dt, colList[i] )[k], ")" )
			);
		);
		
		// Convert to numeric, etc.
		Column( dt, colList[i] ) << data type( numeric ) << modeling type( continuous ) <<
		Format( "Fixed Dec", 12, 0 );
	)
);
Jim
KRT
KRT
Level II

Re: Script to find and replace

Thank you for the script, but it appears I was not specific enough.  Writing script is very new for me and I assumed the length of the number did not matter.  I am guessing that was wrong.  I played around with the script but could not get it to turn "saturated( 635403558 )" into 635403558.  I appreciate your help.

txnelson
Super User

Re: Script to find and replace

The script expects the "(" to be ithe first character, and the ")" to be the last character.  Is it possible that blank characters are at the beginning or end?

try this version of the script and see if it helps

Names Default To Here( 1 );

dt = Current Data Table();

// Get all character columns
colList = dt << get column names( character, string );

// Loop across all character columns and find those who have the pattern (###)
// and if found, then convert all columns
For( I = 1, i <= N Items( colList ), i++,
	If( Left( Trim( Char( Column( dt, colList[i] )[1] ) ), 1 ) == "(" & Right( Trim( Char( Column( dt, colList[i] )[1] ) ), 1 ) == ")",
		For( k = 1, k <= N Rows( dt ), k++,
			If( Left( (Trim( Column( dt, colLIst[i] ) )[k]), 1 ) == "(",
				Column( dt, colList[i] )[k] = Substr( Trim( Column( dt, colList[i] )[k] ), 2 )
			);
			If( Right( Trim( Column( dt, colList[i] )[k] ), 1 ) == ")",
				Column( dt, colList[i] )[k] = Word( 1, Column( dt, colList[i] )[k], ")" )
			);
		);
		
		// Convert to numeric, etc.
		Column( dt, colList[i] ) << data type( numeric ) << modeling type( continuous ) << Format( "Fixed Dec", 12, 0 );
	)
);
Jim
ian_jmp
Level X

Re: Script to find and replace

If I understand correctly, this might be a chance to use JMP's pattern matching (thanks to @Craige_Hales

NamesDefaultToHere(1);

// Define a function
removeSaturated =
Function({inStr}, {Default Local},
	// A pattern that will match a span of digits, with or without a leading '-'
	digits = (""|"-") + patSpan("0123456789");
	// Put the match into 'found' (might be identical to 'inStr')
	patMatch(inStr, digits >? found);
	// Return 'found' as a number
	Num(found);
	);

// Try it out
Print(removeSaturated("saturated( 635403558 )"));
Print(removeSaturated("saturated(635403558)"));
Print(removeSaturated("635403558"));
Print(removeSaturated("635403"));
Print(removeSaturated("-635403"));
Print(removeSaturated("saturated(  -635403  )"));

 

KRT
KRT
Level II

Re: Script to find and replace

Sadly nothing happened when I ran the script.  I included a screen shot of my data along with the column attributes, if that helps.  Thanks again!

KRT_0-1608765066194.png

 

txnelson
Super User

Re: Script to find and replace

Try this on your Area column

Names Default To Here( 1 );

dt = Current Data Table();

// Loop across all character columns and find those who have the pattern (###)
// and if found, then convert all columns

For( k = 1, k <= N Rows( dt ), k++,
	If( Contains( :Area[k], "(" ),
		:Area[k] = Word( 2, :Area[k], "()" )
	)
);
:Area << data type( numeric ) << modeling type( continuous ) << Format( "Fixed Dec", 12, 0 );
Jim
ian_jmp
Level X

Re: Script to find and replace

The code I gave above just writes to the JMP Log. Looks like @txnelson has already told you how to handle your 'Area' column. If you wanted to, you could use a combination of the pattern matching code and his JSL.

 

With all these things it comes down to what you can safely assume about the regularity of the input. Whilst the pattern matching approach is (slightly) more general (and could easily be made more so), in your case it may not matter.