Community Trekker

Joined:

Feb 13, 2015

Combining entries of N columns into single column based on unique entries

Hi,

I have N columns, which have character data that I want to consolidate into a single column.  For a given row many of the columns can be blank or contain repeated/same values ( as compared to the entry in another column ).

I know there is the function under Cols --> Utilities --> Combine Columns & this will generate a new column with all the fields side by side & delimited.  I need advise on how to do this process while maintaining uniqueness ( i.e. only append the values for that column if there are no other columns w/in that row which have the same value )

Ideally I'd like to get guidance on how to script this in JSL. However, clever ( yet more tedious ) approaches w/in JMP data tables/functions are welcome also!

Thanks.

3 ACCEPTED SOLUTIONS

Accepted Solutions

Super User

Joined:

Jun 22, 2012

Solution

Re: Combining entries of N columns into single column based on unique entries

Here is one way to solve this question.  I have created a formula that combines the unique values from a list of columns.  I have attached a sample data table with the formula applied to the last column

Below is the formula used

``````VarList = {"One", "Two", "Three", "Four"};
DataList = {};
CountList = {};
DataValue = "";
For( i = 1, i <= N Items( VarList ), i++,
Insert Into( DataList, As Column( VarList[i] ) );
Insert Into( CountList, 0 );
);
For( i = 1, i <= N Items( VarList ) - 1, i++,
For( k = i + 1, k <= N Items( VarList ), k++,
If( DataList[i] == DataList[k],
CountList[k] = CountList[k] + 1
)
)
);
For( i = 1, i <= N Items( VarList ), i++,
If( CountList[i] == 0 & DataList[i] != "",
DataValue = DataValue || "," || DataList[i]
)
);
DataValue = Substr( DataValue, 2 );``````
Jim

Super User

Joined:

Jun 23, 2011

Solution

Re: Combining entries of N columns into single column based on unique entries

Here's an alternative to Jim's formula. The difference is that the unique values of the combined string are sorted alphabetically rather than in order of occurence.

The number of columns can generalized by setting the local parameter N.

``````dt = Current Data Table();
dt << New Column("Combined",
Character,
Formula(
Local({dt = Current Data Table(), N = 4},
Concat Items(Associative Array(dt[Row(), 1 :: N]) << get keys, ",")
)
)
);
``````

(It uses data table matrix notation, and will thus only in JMP 13 or later)

Staff

Joined:

Jun 9, 2012

Solution

Re: Combining entries of N columns into single column based on unique entries

I prefer the previous answers, but you can also do this via point-and-click, which is attractive to those who don't script (Apparently, there exist people who don't script!!)

As with the associative array approach, the resulting column has the values sorted least to greatest.

1) Select the original columns, and using the combine columns utility, create a multiple response column.

2) Select the new column and apply the text to columns utility, being sure to check the check the "make indicators" box.

3) After selecting the 10 new indicator columns, apply the combine columns utility, being sure to check the "columns are indicator columns" box.

6 REPLIES

Super User

Joined:

Jun 22, 2012

Solution

Re: Combining entries of N columns into single column based on unique entries

Here is one way to solve this question.  I have created a formula that combines the unique values from a list of columns.  I have attached a sample data table with the formula applied to the last column

Below is the formula used

``````VarList = {"One", "Two", "Three", "Four"};
DataList = {};
CountList = {};
DataValue = "";
For( i = 1, i <= N Items( VarList ), i++,
Insert Into( DataList, As Column( VarList[i] ) );
Insert Into( CountList, 0 );
);
For( i = 1, i <= N Items( VarList ) - 1, i++,
For( k = i + 1, k <= N Items( VarList ), k++,
If( DataList[i] == DataList[k],
CountList[k] = CountList[k] + 1
)
)
);
For( i = 1, i <= N Items( VarList ), i++,
If( CountList[i] == 0 & DataList[i] != "",
DataValue = DataValue || "," || DataList[i]
)
);
DataValue = Substr( DataValue, 2 );``````
Jim

Community Trekker

Joined:

Feb 13, 2015

Re: Combining entries of N columns into single column based on unique entries

Thank you Txnelson for the quick reply & solution!

This works fine, however, my column names are somewhat long & I have many so prefer not to have to manually specify the list.

I could get the column names using a simple dt << Get Column Names and store it.

I also really liked how super user "ms" arranged the entries in alphabetical order which reduces the # of permutations ( though I didn't ask for it )

Sorry for the delayed response.

Super User

Joined:

Jun 23, 2011

Solution

Re: Combining entries of N columns into single column based on unique entries

Here's an alternative to Jim's formula. The difference is that the unique values of the combined string are sorted alphabetically rather than in order of occurence.

The number of columns can generalized by setting the local parameter N.

``````dt = Current Data Table();
dt << New Column("Combined",
Character,
Formula(
Local({dt = Current Data Table(), N = 4},
Concat Items(Associative Array(dt[Row(), 1 :: N]) << get keys, ",")
)
)
);
``````

(It uses data table matrix notation, and will thus only in JMP 13 or later)

Community Trekker

Joined:

Feb 13, 2015

Re: Combining entries of N columns into single column based on unique entries

Thanks a lot Ms!!

I like the flexibility of this solution including being able to specify the start and end column # to be used.  I really like that you also made the sorting in alphabetical order to reduce the # of permutations!

Sorry for the delayed response. I initially thought there was a corner case problem where it was not working but after revisiting it I don't see an issue.

Staff

Joined:

Jun 9, 2012

Solution

Re: Combining entries of N columns into single column based on unique entries

I prefer the previous answers, but you can also do this via point-and-click, which is attractive to those who don't script (Apparently, there exist people who don't script!!)

As with the associative array approach, the resulting column has the values sorted least to greatest.

1) Select the original columns, and using the combine columns utility, create a multiple response column.

2) Select the new column and apply the text to columns utility, being sure to check the check the "make indicators" box.

3) After selecting the 10 new indicator columns, apply the combine columns utility, being sure to check the "columns are indicator columns" box.

Community Trekker

Joined:

Feb 13, 2015