cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Stepper
Level I

Converting A Range of Alphanumeric Characters Into A List (X1-3 to X1,X2,X3)

Hello,

I'm trying to convert a range of alphanumeric characters into a list. I'm pretty new to JSL and I'm not sure exactly how to go about doing this. This is what I would like to achieve:

 

Orginal: CH1-3

Return: CH1,CH2,CH3

 

The character prefix does not always contain two characters (Could be CH1-3 or X5-8 for example).

 

I know that if I had a range of just two numbers, like CH1-2,  I could use the following script to get what I want:

Regex("CH1-2", "([A-Z]+)([0-9]+)(-)([0-9]+)", "\1\2,\1\4", GLOBALREPLACE);

Which returns CH1,CH2

 

 

However, I have no idea how to do this for ranges with more than two numbers.

 

Any assistance would be greatly appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Converting A Range of Alphanumeric Characters Into A List (X1-3 to X1,X2,X3)

This script illustrates a way to do it:

Names Default to Here( 1 );

dt = New Table( "Col Range Test Cases",
	New Column( "A",
		"Character",
		"Nominal",
		Values( { "CH1-2,CH17,CH6-8", "T7", "MR101-103", "P4-5", "U2", "AR6-7" } )
	),
	New Column( "B",
		"Character",
		"Nominal"
	)
);

For Each Row(
	// tokenize current string with comma delimiter
	// Row() = 1;
	range = Words( Column( dt, 1 )[], "," );
	For( i = 1, i <= N Items( range ), i++,
		// look for col range
		result = Regex Match(
			range[i],
			"^(\w+?)(\d+)-(\d+)\b"
		);
		If( N Items( result ) == 0,
			// regex failed, not a col range
			If( i == 1,
				// start string result
				answer = range[i],
				// add to string result
				answer = Concat( answer, ", ", range[i] );
			),
			// regex succeeded, a col range
			letters = result[2];
			first   = Num( result[3] );
			last    = Num( result[4] );
			// append columns in the range
			For( n = first, n <= last, n++,
				col = Char( letters ) || Char( n );
				If( i == 1 & n == first,
					answer = col,
					answer = Concat( answer, ", ", col );
				);
			);
		);
	);
	// Row() = 1;
	Column( dt, 2 )[] = answer;
);

View solution in original post

5 REPLIES 5
david707
Level III

Re: Converting A Range of Alphanumeric Characters Into A List (X1-3 to X1,X2,X3)

This almost certainly isn't the most efficient or cleanest solution, but should do the trick:

 

list = {}; // new blank list
letters = Regex("CH13-18", "([A-Z]+)([0-9]+)(-)([0-9]+)", "\1");
From = Num(Regex("CH13-18", "([A-Z]+)([0-9]+)(-)([0-9]+)", "\2"););
To = Num(Regex("CH13-18", "([A-Z]+)([0-9]+)(-)([0-9]+)", "\4"););
For( x = From, x <= To, x++,
	string = Char(letters) || Char(x);
	Insert Into (list, string);
);
print(list);
Stepper
Level I

Re: Converting A Range of Alphanumeric Characters Into A List (X1-3 to X1,X2,X3)

Thanks for your response! It's very helpful.

 

I've been trying to incoperate it with what I'm trying to do, but I'm having some difficulties. Perhaps you could lend me your assistance again?

 

I'm dealing with a data table like the one below, where I have multiple columns filled with data. I'm trying to replace the ranged values wherever they occur

Orginal:

ABC
CH1-2,CH17,CH6-8MR101-103U2
T7P4-5AR6-7

Desired:

ABC
CH1,CH2,CH17,CH6,CH7,CH8MR101,MR102,MR103U2
T7P4,P5AR6,AR7

 

This is what I've come up with so far to try to replace the ranged values that appear in column A, but it isn't working.

for (i=1, i<=nrows(dt), i++,

	A_List=words(column(dt, "A")[i], ",");
	
	letters = Regex(A_List, "([A-Z]+)([0-9]+)(-)([0-9]+)", "\1");
	From = Num(Regex(A_List, "([A-Z]+)([0-9]+)(-)([0-9]+)", "\2"););
	To = Num(Regex(A_List, "([A-Z]+)([0-9]+)(-)([0-9]+)", "\4"););
	For( x = From, x <= To, x++,
		string = Char(letters) || Char(x);
		Insert Into (A_List, string);
		);
	);

I appreciate your help!

 

Re: Converting A Range of Alphanumeric Characters Into A List (X1-3 to X1,X2,X3)

This script illustrates a way to do it:

Names Default to Here( 1 );

dt = New Table( "Col Range Test Cases",
	New Column( "A",
		"Character",
		"Nominal",
		Values( { "CH1-2,CH17,CH6-8", "T7", "MR101-103", "P4-5", "U2", "AR6-7" } )
	),
	New Column( "B",
		"Character",
		"Nominal"
	)
);

For Each Row(
	// tokenize current string with comma delimiter
	// Row() = 1;
	range = Words( Column( dt, 1 )[], "," );
	For( i = 1, i <= N Items( range ), i++,
		// look for col range
		result = Regex Match(
			range[i],
			"^(\w+?)(\d+)-(\d+)\b"
		);
		If( N Items( result ) == 0,
			// regex failed, not a col range
			If( i == 1,
				// start string result
				answer = range[i],
				// add to string result
				answer = Concat( answer, ", ", range[i] );
			),
			// regex succeeded, a col range
			letters = result[2];
			first   = Num( result[3] );
			last    = Num( result[4] );
			// append columns in the range
			For( n = first, n <= last, n++,
				col = Char( letters ) || Char( n );
				If( i == 1 & n == first,
					answer = col,
					answer = Concat( answer, ", ", col );
				);
			);
		);
	);
	// Row() = 1;
	Column( dt, 2 )[] = answer;
);

Re: Converting A Range of Alphanumeric Characters Into A List (X1-3 to X1,X2,X3)

A more efficient function for multiple results is Regex Match():

Names Default to Here( 1 );

// test case
original = "CH002-033";

result = Regex Match(
	original,
	"^(\w+?)(\d+)-(\d+)\b"
);

 

(Note a slight change to first capturing group because "+" is greedy.)

In this case, you get all the back references in a list form:

{"CH002-033", "CH", "002", "033"}

 

 

pmroz
Super User

Re: Converting A Range of Alphanumeric Characters Into A List (X1-3 to X1,X2,X3)

Crude version that handles numbers that are greater than 9

ch_split = function({ch_string}, {default local},
	first_num = regex(ch_string, "([0-9])");
	num_loc   = contains(ch_string, first_num);
	prefix    = substr(ch_string, 1, num_loc - 1);
	num_list  = words(substr(ch_string, num_loc), "-");
	start     = num(num_list[1]);
	end       = num(num_list[2]);

	ch_list = {};
	k = 0;
	for (i = start, i <= end, i++,
		k++;
		ch_list[k] = prefix || char(i);
	);
	ch_list
);

a1 = "ABC20-29";
ch_split(a1);

a2 = "CH1-2";
ch_split(a2);