cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
RA899
Level III

Searching every cell in one column to find a specific content

Hi all, 

I have a jmp data table with a column that contains different strings at each row. I would like to know if I can loop that column to find a specific part of the string in that cell. 

 

This an example of the strings: 

 

Class: 10 Table:15 Stat:NA loc:NA teacher:Sam --> content of the cell

Class: 11 Table:11 Stat:NA loc:NA teacher:Sam

 

 

The challenge is that I have different class # and different table #. However, I only need to find the cells the starts with "Class: #" and grab the rest of the cells into a list. 

 

So my question is: can I do a partial cell search at that specifc column? 

 

 

6 REPLIES 6
txnelson
Super User

Re: Searching every cell in one column to find a specific content

This is one possible way to handle this

Names Default To Here( 1 );

dt = Current Data Table();

foundList = {};

For Each Row(
	If(
		Word( 1, thestringcolumn, " " ) & Is Missing(
			Num( word * 2, thestringcolumn, " " )
		) == 0,
		Insert Into(
			foundList,
			Substr(
				thestringcolumn,
				Length( Word( 1, thestringcolumn, " " ) )
				+Length( Word( 2, thestringcolumn, " " ) ) + 2
			)
		)
	)
);
Jim
jthi
Super User

Re: Searching every cell in one column to find a specific content

Depending on your string format, you could maybe use Recode (note that Keep text if no match isn't checked)

jthi_0-1710788001887.png

or script using For Each Row. If you use For Each Row, you have many different options for parsing the string and how to store your results, below is another example

Names Default To Here(1);

dt = New Table("Untitled",
	Add Rows(2),
	Compress File When Saved(1),
	New Column("Column 1",
		Character,
		"Nominal",
		Set Values(
			{"Class: 10 Table:15 Stat:NA loc:NA teacher:Sam",
			"Class: 11 Table:11 Stat:NA loc:NA teacher:Sam"}
		)
	)
);

result = {};

For Each Row(dt,
	res = Regex(:Column 1, "^(Class: \d+) (.*)?", "\2");
	If(!IsMissing(res),
		Insert Into(result, res);
	);
);

show(result); 
// result = {"Table:15 Stat:NA loc:NA teacher:Sam", "Table:11 Stat:NA loc:NA teacher:Sam"};
-Jarmo
RA899
Level III

Re: Searching every cell in one column to find a specific content

Thanks @jthi. This works perfectly for the example I provided above. 

 

I do have a follow up question, let's say we have the following String list, 

 

10::Class: A  

11::Class: A1 

12::Class: A2

..

20::Class B

21::Class B1

22::Class B1

 

30::Class C

31::Class C1

32::Class C2

 

Say I want to make a list for each class. I want the first to be ClassAList = (11::Class: A  ,11::Class: A  ,11::Class: A ); 

 

I tried modifying regex() to look for the leading digit in a string and then move it into the corresponding list. I used something like this:  Regex("10:::Class" , "[1]")

 

The issue here is that it is looking for [1] across all the string. Is it possible to select the index where the regex should look for a match?

 

The other issue here is that returns 1 only, not the whole string. 

 

Any guidance is appreciated.  

 

 

jthi
Super User

Re: Searching every cell in one column to find a specific content

This might give some idea what you could do

Names Default To Here(1);

mylist = {"10::Class: A", "11::Class: A1", "12::Class: A2", "20::Class B", "21::Class B1",
"22::Class B1", "30::Class C", "31::Class C1", "32::Class C2"};

class_pattern = "\d+::(Class:? [A-Z])";

aa = Associative Array();
For Each({myitem}, mylist,
	res = Regex(myitem, class_pattern, "\1");
	If(!Contains(aa, res),
		aa[res] = {};
	);
	Insert Into(aa[res], myitem);
);

show(aa);

https://regex101.com/r/itqqww/1 for explanation of the regex pattern

-Jarmo
RA899
Level III

Re: Searching every cell in one column to find a specific content

Thanks @jthi. This works perfectly. 

 

Follow up question:

I would like to group the list into subgroups based on only and only the first digit. I tried to use regex Anchors to do so. However, instead of matching only the first digit, it matches the first and whatever digits comes after.  Any guidance is appreciated. 

jthi
Super User

Re: Searching every cell in one column to find a specific content

Do you mean the Class B1 part? You can modify the regex by adding \d? inside brackets (https://regex101.com/r/Yfh6pZ/1). \d will match a digit and ? will match the token before it ZERO or ONE times. If you don't want to capture groups which have no digits, leave out ? or replace it with {1}

jthi_0-1711602075375.png

[
	"Class B" => {"20::Class B"}, 
	"Class B1" => {"21::Class B1", "22::Class B1"}, 
	"Class C" => {"30::Class C"}, 
	"Class C1" => {"31::Class C1"}, 
	"Class C2" => {"32::Class C2"}, 
	"Class: A" => {"10::Class: A"}, 
	"Class: A1" => {"11::Class: A1"}, 
	"Class: A2" => {"12::Class: A2"}
]
-Jarmo