cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Check out the JMP® Marketplace featured Capability Explorer add-in
Choose Language Hide Translation Bar
HSS
HSS
Level IV

Need to create a column based on sequence of variable appear in another column

Hi All,
I want to create a column based on :Client_ID and how many times ( a number of sequence is counted as 1 count only, see the the :Comment column) it is appearing in the data set. My requirement is :Final Requirement  and would be happy to have :Required Column 1 as well. I can not re-arrange/sort the data.

Any help please ?

Many thanks, -HSS

HSS_1-1664455758344.png

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
jthi
Super User

Re: Need to create a column based on sequence of variable appear in another column

Using associative array to store the value and then increase as needed should work here.

 

Formula for Required Column 1:

As Constant(
	aa = Associative Array(:Client_ID);
	aa = Associative Array(aa << get keys, Repeat(0, N Items(aa)));
	retval = 1;
);
If(:Client_ID != Lag(:Client_ID),
	aa[:Client_ID]++;
);
aa[:Client_ID];

And Final Requirement:

Eval Insert("^:Client_ID^_^:Required Column 2^");

Example script (uses different column names):

Names Default To Here(1);

dt = Open("$DOWNLOADS/data.jmp");

dt << New Column("Required Column 2", Numeric, Ordinal, Formula(
	As Constant(
		aa = Associative Array(:Client_ID);
		aa = Associative Array(aa << get keys, Repeat(0, N Items(aa)));
		retval = 1;
	);
	If(:Client_ID != Lag(:Client_ID),
		aa[:Client_ID]++;
	);
	aa[:Client_ID];
));

dt << New Column("Final Requirement 2", Character, Ordinal, Formula(
	Eval Insert("^:Client_ID^_^:Required Column 2^");
));

jthi_0-1664457060139.png

 

-Jarmo

View solution in original post

8 REPLIES 8
jthi
Super User

Re: Need to create a column based on sequence of variable appear in another column

Using associative array to store the value and then increase as needed should work here.

 

Formula for Required Column 1:

As Constant(
	aa = Associative Array(:Client_ID);
	aa = Associative Array(aa << get keys, Repeat(0, N Items(aa)));
	retval = 1;
);
If(:Client_ID != Lag(:Client_ID),
	aa[:Client_ID]++;
);
aa[:Client_ID];

And Final Requirement:

Eval Insert("^:Client_ID^_^:Required Column 2^");

Example script (uses different column names):

Names Default To Here(1);

dt = Open("$DOWNLOADS/data.jmp");

dt << New Column("Required Column 2", Numeric, Ordinal, Formula(
	As Constant(
		aa = Associative Array(:Client_ID);
		aa = Associative Array(aa << get keys, Repeat(0, N Items(aa)));
		retval = 1;
	);
	If(:Client_ID != Lag(:Client_ID),
		aa[:Client_ID]++;
	);
	aa[:Client_ID];
));

dt << New Column("Final Requirement 2", Character, Ordinal, Formula(
	Eval Insert("^:Client_ID^_^:Required Column 2^");
));

jthi_0-1664457060139.png

 

-Jarmo
HSS
HSS
Level IV

Re: Need to create a column based on sequence of variable appear in another column

Hi @jthi, Can I do the grouping also on "Associated Array" using another columns ? Something like  this ? "Associated Array(:Client_ID, :Region)" and my data set has more than 700000 rows, will the associate command still work ! If I can group by :Region, it may still have more than 10000 rows. In any case, I will certainly try this.
Thank you for your help. Regards, - HSS

jthi
Super User

Re: Need to create a column based on sequence of variable appear in another column

You would have to first create new "grouping" column if you wanted to use Associative Array like that. With multiple grouping columns, it might be easier to use Col Cumulative Sum (might be always easier, just came up with this). You will have to figure out correct comparison and the groupings

Col Cumulative Sum(:Client_ID != Lag(:Client_ID), :Client_ID)
-Jarmo
HSS
HSS
Level IV

Re: Need to create a column based on sequence of variable appear in another column

@jthi Col Cumulative Sum will work but I am not able to find a correct initialization of the loop/starting point with grouping variable :Region.
Any other suggestion?
Thanks

Col Cumulative Sum( :Client_ID != Lag( :Client_ID ) & :Region == Lag( :Region ), :Client_ID )

jthi
Super User

Re: Need to create a column based on sequence of variable appear in another column

Easiest thing is most likely to create new column and then use that as grouping instead of :Client_ID (original formula)

jthi_0-1664460322457.png

Col Cumulative Sum(:"Concatenate[Region,Client_ID]"n != Lag(:"Concatenate[Region,Client_ID]"n), :"Concatenate[Region,Client_ID]"n)

 

Otherwise the part which is summed in cumulative sum get more complicated quite fast (there might be easier ones than this, but this should handle wrong values on :Region change):

Col Cumulative Sum(
	If(Row() == Col Minimum(Row(), :Client_ID, :Region),
		1
	,
		:Client_ID != Lag(:Client_ID)
	)
	, :Client_ID, :Region
)
-Jarmo
HSS
HSS
Level IV

Re: Need to create a column based on sequence of variable appear in another column

@jthi Thank you for your help.
I will work along this direction.


HSS
HSS
Level IV

Re: Need to create a column based on sequence of variable appear in another column

Hi All,

There is a further requirement - to add one more column :Req_3, where all the texts from :Group would be combined and stored. :Group text can be as long as 2 lines. Separator can be any thing.

The problem is - number of cells need to be combined is not fix, it depends upon the value in column :Req_1.

"Req_3" is required.
Any help, Thanks.

HSS_0-1664543873481.png

 

jthi
Super User

Re: Need to create a column based on sequence of variable appear in another column

Using SQL query (especially GROUP_CONCAT) or JMP's different table operations (split, transpose, text to columns and combine columns, join back) might be easier than formula (and faster), but you could do it with a formula for example by using Loc() and indexing of different columns.

 

Edit: Added interactive method

First split data:

jthi_0-1664550332275.png

Next transpose the split data:

jthi_1-1664550375947.png

From transposed data, first use text to columns to separate Req2 and Region from each other

jthi_2-1664550425860.png

Then use combine columns to combine Row columns

jthi_3-1664550461910.png

Rename and remove unnecessary columns and you should end up with something like this

jthi_5-1664550570000.png

Join this with the original data2

jthi_6-1664550598695.png

jthi_7-1664550609644.png

Remove extra columns and you should have what you need.

 

Enhanced Log might have script ready for you, but you might have to clean it up a bit

jthi_8-1664550635747.png

 

-Jarmo