- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Need to create a column based on sequence of variable appear in another column
Hi All,
I want to create a column based on :Client_ID and how many times ( a number of sequence is counted as 1 count only, see the the :Comment column) it is appearing in the data set. My requirement is :Final Requirement and would be happy to have :Required Column 1 as well. I can not re-arrange/sort the data.
Any help please ?
Many thanks, -HSS
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
Using associative array to store the value and then increase as needed should work here.
Formula for Required Column 1:
As Constant(
aa = Associative Array(:Client_ID);
aa = Associative Array(aa << get keys, Repeat(0, N Items(aa)));
retval = 1;
);
If(:Client_ID != Lag(:Client_ID),
aa[:Client_ID]++;
);
aa[:Client_ID];
And Final Requirement:
Eval Insert("^:Client_ID^_^:Required Column 2^");
Example script (uses different column names):
Names Default To Here(1);
dt = Open("$DOWNLOADS/data.jmp");
dt << New Column("Required Column 2", Numeric, Ordinal, Formula(
As Constant(
aa = Associative Array(:Client_ID);
aa = Associative Array(aa << get keys, Repeat(0, N Items(aa)));
retval = 1;
);
If(:Client_ID != Lag(:Client_ID),
aa[:Client_ID]++;
);
aa[:Client_ID];
));
dt << New Column("Final Requirement 2", Character, Ordinal, Formula(
Eval Insert("^:Client_ID^_^:Required Column 2^");
));
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
Using associative array to store the value and then increase as needed should work here.
Formula for Required Column 1:
As Constant(
aa = Associative Array(:Client_ID);
aa = Associative Array(aa << get keys, Repeat(0, N Items(aa)));
retval = 1;
);
If(:Client_ID != Lag(:Client_ID),
aa[:Client_ID]++;
);
aa[:Client_ID];
And Final Requirement:
Eval Insert("^:Client_ID^_^:Required Column 2^");
Example script (uses different column names):
Names Default To Here(1);
dt = Open("$DOWNLOADS/data.jmp");
dt << New Column("Required Column 2", Numeric, Ordinal, Formula(
As Constant(
aa = Associative Array(:Client_ID);
aa = Associative Array(aa << get keys, Repeat(0, N Items(aa)));
retval = 1;
);
If(:Client_ID != Lag(:Client_ID),
aa[:Client_ID]++;
);
aa[:Client_ID];
));
dt << New Column("Final Requirement 2", Character, Ordinal, Formula(
Eval Insert("^:Client_ID^_^:Required Column 2^");
));
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
Hi @jthi, Can I do the grouping also on "Associated Array" using another columns ? Something like this ? "Associated Array(:Client_ID, :Region)" and my data set has more than 700000 rows, will the associate command still work ! If I can group by :Region, it may still have more than 10000 rows. In any case, I will certainly try this.
Thank you for your help. Regards, - HSS
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
You would have to first create new "grouping" column if you wanted to use Associative Array like that. With multiple grouping columns, it might be easier to use Col Cumulative Sum (might be always easier, just came up with this). You will have to figure out correct comparison and the groupings
Col Cumulative Sum(:Client_ID != Lag(:Client_ID), :Client_ID)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
@jthi Col Cumulative Sum will work but I am not able to find a correct initialization of the loop/starting point with grouping variable :Region.
Any other suggestion?
Thanks
Col Cumulative Sum( :Client_ID != Lag( :Client_ID ) & :Region == Lag( :Region ), :Client_ID )
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
Easiest thing is most likely to create new column and then use that as grouping instead of :Client_ID (original formula)
Col Cumulative Sum(:"Concatenate[Region,Client_ID]"n != Lag(:"Concatenate[Region,Client_ID]"n), :"Concatenate[Region,Client_ID]"n)
Otherwise the part which is summed in cumulative sum get more complicated quite fast (there might be easier ones than this, but this should handle wrong values on :Region change):
Col Cumulative Sum(
If(Row() == Col Minimum(Row(), :Client_ID, :Region),
1
,
:Client_ID != Lag(:Client_ID)
)
, :Client_ID, :Region
)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
@jthi Thank you for your help.
I will work along this direction.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
Hi All,
There is a further requirement - to add one more column :Req_3, where all the texts from :Group would be combined and stored. :Group text can be as long as 2 lines. Separator can be any thing.
The problem is - number of cells need to be combined is not fix, it depends upon the value in column :Req_1.
"Req_3" is required.
Any help, Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Need to create a column based on sequence of variable appear in another column
Using SQL query (especially GROUP_CONCAT) or JMP's different table operations (split, transpose, text to columns and combine columns, join back) might be easier than formula (and faster), but you could do it with a formula for example by using Loc() and indexing of different columns.
Edit: Added interactive method
First split data:
Next transpose the split data:
From transposed data, first use text to columns to separate Req2 and Region from each other
Then use combine columns to combine Row columns
Rename and remove unnecessary columns and you should end up with something like this
Join this with the original data2
Remove extra columns and you should have what you need.
Enhanced Log might have script ready for you, but you might have to clean it up a bit