ealindahl
Level II

Editing strings in a list

I am making a listbox that I use in a script to select the batch numbers I want to analyze. I would like to make the list of batch numbers based on file names in a folder. I cannot figure out how to edit the name of the file to be only the batch number.

 

This is what the list of file names looks like 

"data_20229.csv", "data_20230.csv", "data_20231.csv", "data_20232.csv",
"data_20236.csv", "data_20237.csv", "data_20238.csv", "data_20239.csv",
"data_20240.csv", "data_20241.csv", "data_20242.csv", "data_20243.csv",
"data_20244.csv", "data_20245.csv", "data_20247.csv", "data_20248.csv",
"data_20251.csv", "data_20252.csv", "data_20253.csv", "data_20254.csv",
"data_20255.csv", "data_20256.csv", "data_20257.csv", "data_20258.csv",
"data_20259.csv", "data_20260.csv", "data_20261.csv", "data_20262.csv",
"data_20263.csv", "data_20264.csv", "data_20265.csv", "data_20266.csv",
"data_20267.csv", "data_20268.csv", "data_20269.csv", "data_20270.csv"

I want the list to end up like this

"20229", "20230", "20231", "20232", "20236", "20237", "20238", "20239",
"20240", "20241", "20242", "20243", "20244", "20245", "20247", "20248",
"20251", "20252", "20253", "20254", "20255", "20256", "20257", "20258",
"20259", "20260", "20261", "20262", "20263", "20264", "20265", "20266",
"20267", "20268", "20269", "20270"

 

5 REPLIES
txnelson
Super User

Re: Editing strings in a list

The easiest way to do this is to use the Word() function.

 

Names Default To Here( 1 );
exampleList = {"data_20229.csv", "data_20230.csv", "data_20231.csv",
"data_20232.csv", "data_20236.csv", "data_20237.csv", "data_20238.csv",
"data_20239.csv", "data_20240.csv", "data_20241.csv", "data_20242.csv",
"data_20243.csv", "data_20244.csv", "data_20245.csv", "data_20247.csv",
"data_20248.csv", "data_20251.csv", "data_20252.csv", "data_20253.csv",
"data_20254.csv", "data_20255.csv", "data_20256.csv", "data_20257.csv",
"data_20258.csv", "data_20259.csv", "data_20260.csv", "data_20261.csv",
"data_20262.csv", "data_20263.csv", "data_20264.csv", "data_20265.csv",
"data_20266.csv", "data_20267.csv", "data_20268.csv", "data_20269.csv",
"data_20270.csv"};

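// Treat "_" and "." as delimiters and keep the second word,
// which is the batch number between "data_" and ".csv".
For( i = 1, i <= N Items( exampleList ), i++,
	exampleList[i] = Word( 2, exampleList[i], "_." )
);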

show( exampleList );

Documentation on the Word() function can be found in the Scripting Index.
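
To make the loop above concrete, here is a minimal sketch of what Word() does with a single file name. The variable name fileName is only for illustration, and Words() is used just to display the intermediate split.

```jsl
Names Default To Here( 1 );
fileName = "data_20229.csv";

// Both "_" and "." act as delimiters, so the name splits into
// three words: "data", "20229", and "csv".
Show( Words( fileName, "_." ) );

// Word( 2, ... ) picks the second of those words, the batch number.
Show( Word( 2, fileName, "_." ) );
```

This relies on every file name having the same prefix_number.extension shape. A name with an extra underscore or period before the number would shift the word position.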

Jim
Georg
Level VII

Re: Editing strings in a list

Thanks, Jim.

The Word() function is a simple, easy-to-understand choice for this kind of task.

A more general approach is to use Regex(). Replace one line of Jim's script with

	exampleList[i] = Regex( exampleList[i], "([0-9]+)", "\1" )

which returns the first run of digits found in each file name.

This can be useful when the data is not as clean as in this example. Regex is very powerful.
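
As a rough sketch of how the regex behaves on less tidy names, the following loop runs it over a few invented file names (they are assumptions for illustration, not files from the original folder).

```jsl
Names Default To Here( 1 );
// Hypothetical file names, not from the original folder.
messyList = {"data_20229.csv", "batch-20230_final.csv", "readme.txt"};

For( i = 1, i <= N Items( messyList ), i++,
	// "\1" returns only the text captured by the parentheses,
	// i.e. the first run of digits in the name.
	result = Regex( messyList[i], "([0-9]+)", "\1" );
	Show( messyList[i], result );
);
```

The last name contains no digits, so Regex() has nothing to match and returns a missing value instead of a string.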

Georg
pmroz
Super User

Re: Editing strings in a list

This approach doesn't use a for loop.

Names Default To Here( 1 );
exampleList = {"data_20229.csv", "data_20230.csv", "data_20231.csv",
"data_20232.csv", "data_20236.csv", "data_20237.csv", "data_20238.csv",
"data_20239.csv", "data_20240.csv", "data_20241.csv", "data_20242.csv",
"data_20243.csv", "data_20244.csv", "data_20245.csv", "data_20247.csv",
"data_20248.csv", "data_20251.csv", "data_20252.csv", "data_20253.csv",
"data_20254.csv", "data_20255.csv", "data_20256.csv", "data_20257.csv",
"data_20258.csv", "data_20259.csv", "data_20260.csv", "data_20261.csv",
"data_20262.csv", "data_20263.csv", "data_20264.csv", "data_20265.csv",
"data_20266.csv", "data_20267.csv", "data_20268.csv", "data_20269.csv",
"data_20270.csv"};

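str      = Char( exampleList );                       // turn the whole list into one string
newstr   = Substitute( str, "data_", "", ".csv", "" ); // strip the prefix and the extension
new_list = Parse( newstr );                            // turn the string back into a list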

show( new_list );
ealindahl
Level II

Re: Editing strings in a list

Thanks Georg! That works really well. I've never used regex before, and I tried some googling to solve this one issue. I do have some junk files in the same folder whose names do not contain any numbers. The names of those files do not end up in the list, but a period does. How do I get rid of the periods?

The list looks like this: {., ., ., ., ., ., ., "20131", "20132", "20133"}

Georg
Level VII

Re: Editing strings in a list

Dear @ealindahl,

 

The periods are missing values, i.e. the regex did not find a match for those file names.

You can remove them afterwards, or avoid them while the list is being generated by changing the loop slightly.

An easy way to remove them afterwards is shown below, first via string manipulation and then via a loop.

BR

 

names default to here(1);

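list = {., ., ., ., ., ., ., "20131", "20132", "20133"};
// First way: string manipulation. This only removes a missing value
// that is followed by a comma, so a missing value in the last position stays.
new_list = parse(substitute(char(list),".,", ""));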

show (list);
show (new_list);

// Second way: loop through the list and keep only the non-missing items.
new_list={};
for (i=1, i<= n items(list), i++,
	if (!is missing(list[i]), new_list = insert(new_list, list[i]));
);
show(new_list);
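
The other option mentioned above, avoiding the missing values while the list is being built, could look roughly like the sketch below. The names fileList, batchList, and batch and the junk file names are assumptions for illustration. In a real script, fileList would presumably come from something like Files In Directory() on your folder.

```jsl
Names Default To Here( 1 );
// Hypothetical folder contents: two junk files plus three batch files.
fileList = {"notes.txt", "readme.md", "data_20131.csv", "data_20132.csv", "data_20133.csv"};

batchList = {};
For( i = 1, i <= N Items( fileList ), i++,
	batch = Regex( fileList[i], "([0-9]+)", "\1" );
	// Regex returns missing when the name has no digits; skip those.
	If( !Is Missing( batch ),
		Insert Into( batchList, batch )
	);
);
Show( batchList );
```

With this approach the junk files never enter the list, so no cleanup pass is needed afterwards.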

 

Georg