BookmarkSubscribeRSS Feed

Re: Converting A Range of Alphanumeric Characters Into A List (X1-3 to X1,X2,X3)

Stepper

New Contributor

Joined:

Jun 4, 2018

Hello,

I'm trying to convert a range of alphanumeric characters into a list. I'm pretty new to JSL and I'm not sure exactly how to go about doing this. This is what I would like to achieve:

 

Orginal: CH1-3

Return: CH1,CH2,CH3

 

The character prefix does not always contain two characters (Could be CH1-3 or X5-8 for example).

 

I know that if I had a range of just two numbers, like CH1-2,  I could use the following script to get what I want:

Regex("CH1-2", "([A-Z]+)([0-9]+)(-)([0-9]+)", "\1\2,\1\4", GLOBALREPLACE);

Which returns CH1,CH2

 

 

However, I have no idea how to do this for ranges with more than two numbers.

 

Any assistance would be greatly appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
markbailey

Staff

Joined:

Jun 23, 2011

Solution

This script illustrates a way to do it:

Names Default to Here( 1 );

dt = New Table( "Col Range Test Cases",
	New Column( "A",
		"Character",
		"Nominal",
		Values( { "CH1-2,CH17,CH6-8", "T7", "MR101-103", "P4-5", "U2", "AR6-7" } )
	),
	New Column( "B",
		"Character",
		"Nominal"
	)
);

For Each Row(
	// tokenize current string with comma delimiter
	// Row() = 1;
	range = Words( Column( dt, 1 )[], "," );
	For( i = 1, i <= N Items( range ), i++,
		// look for col range
		result = Regex Match(
			range[i],
			"^(\w+?)(\d+)-(\d+)\b"
		);
		If( N Items( result ) == 0,
			// regex failed, not a col range
			If( i == 1,
				// start string result
				answer = range[i],
				// add to string result
				answer = Concat( answer, ", ", range[i] );
			),
			// regex succeeded, a col range
			letters = result[2];
			first   = Num( result[3] );
			last    = Num( result[4] );
			// append columns in the range
			For( n = first, n <= last, n++,
				col = Char( letters ) || Char( n );
				If( i == 1 & n == first,
					answer = col,
					answer = Concat( answer, ", ", col );
				);
			);
		);
	);
	// Row() = 1;
	Column( dt, 2 )[] = answer;
);
Learn it once, use it forever!
5 REPLIES
david707

New Contributor

Joined:

Feb 20, 2018

This almost certainly isn't the most efficient or cleanest solution, but should do the trick:

 

list = {}; // new blank list
letters = Regex("CH13-18", "([A-Z]+)([0-9]+)(-)([0-9]+)", "\1");
From = Num(Regex("CH13-18", "([A-Z]+)([0-9]+)(-)([0-9]+)", "\2"););
To = Num(Regex("CH13-18", "([A-Z]+)([0-9]+)(-)([0-9]+)", "\4"););
For( x = From, x <= To, x++,
	string = Char(letters) || Char(x);
	Insert Into (list, string);
);
print(list);
Stepper

New Contributor

Joined:

Jun 4, 2018

Thanks for your response! It's very helpful.

 

I've been trying to incoperate it with what I'm trying to do, but I'm having some difficulties. Perhaps you could lend me your assistance again?

 

I'm dealing with a data table like the one below, where I have multiple columns filled with data. I'm trying to replace the ranged values wherever they occur

Orginal:

ABC
CH1-2,CH17,CH6-8MR101-103U2
T7P4-5AR6-7

Desired:

ABC
CH1,CH2,CH17,CH6,CH7,CH8MR101,MR102,MR103U2
T7P4,P5AR6,AR7

 

This is what I've come up with so far to try to replace the ranged values that appear in column A, but it isn't working.

for (i=1, i<=nrows(dt), i++,

	A_List=words(column(dt, "A")[i], ",");
	
	letters = Regex(A_List, "([A-Z]+)([0-9]+)(-)([0-9]+)", "\1");
	From = Num(Regex(A_List, "([A-Z]+)([0-9]+)(-)([0-9]+)", "\2"););
	To = Num(Regex(A_List, "([A-Z]+)([0-9]+)(-)([0-9]+)", "\4"););
	For( x = From, x <= To, x++,
		string = Char(letters) || Char(x);
		Insert Into (A_List, string);
		);
	);

I appreciate your help!

 

Highlighted
markbailey

Staff

Joined:

Jun 23, 2011

Solution

This script illustrates a way to do it:

Names Default to Here( 1 );

dt = New Table( "Col Range Test Cases",
	New Column( "A",
		"Character",
		"Nominal",
		Values( { "CH1-2,CH17,CH6-8", "T7", "MR101-103", "P4-5", "U2", "AR6-7" } )
	),
	New Column( "B",
		"Character",
		"Nominal"
	)
);

For Each Row(
	// tokenize current string with comma delimiter
	// Row() = 1;
	range = Words( Column( dt, 1 )[], "," );
	For( i = 1, i <= N Items( range ), i++,
		// look for col range
		result = Regex Match(
			range[i],
			"^(\w+?)(\d+)-(\d+)\b"
		);
		If( N Items( result ) == 0,
			// regex failed, not a col range
			If( i == 1,
				// start string result
				answer = range[i],
				// add to string result
				answer = Concat( answer, ", ", range[i] );
			),
			// regex succeeded, a col range
			letters = result[2];
			first   = Num( result[3] );
			last    = Num( result[4] );
			// append columns in the range
			For( n = first, n <= last, n++,
				col = Char( letters ) || Char( n );
				If( i == 1 & n == first,
					answer = col,
					answer = Concat( answer, ", ", col );
				);
			);
		);
	);
	// Row() = 1;
	Column( dt, 2 )[] = answer;
);
Learn it once, use it forever!
markbailey

Staff

Joined:

Jun 23, 2011

A more efficient function for multiple results is Regex Match():

Names Default to Here( 1 );

// test case
original = "CH002-033";

result = Regex Match(
	original,
	"^(\w+?)(\d+)-(\d+)\b"
);

 

(Note a slight change to first capturing group because "+" is greedy.)

In this case, you get all the back references in a list form:

{"CH002-033", "CH", "002", "033"}

 

 

Learn it once, use it forever!
pmroz

Super User

Joined:

Jun 23, 2011

Crude version that handles numbers that are greater than 9

ch_split = function({ch_string}, {default local},
	first_num = regex(ch_string, "([0-9])");
	num_loc   = contains(ch_string, first_num);
	prefix    = substr(ch_string, 1, num_loc - 1);
	num_list  = words(substr(ch_string, num_loc), "-");
	start     = num(num_list[1]);
	end       = num(num_list[2]);

	ch_list = {};
	k = 0;
	for (i = start, i <= end, i++,
		k++;
		ch_list[k] = prefix || char(i);
	);
	ch_list
);

a1 = "ABC20-29";
ch_split(a1);

a2 = "CH1-2";
ch_split(a2);