cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
mpanggabean
Level II

Combining entries of N columns into single column based on unique entries

Hi,

I have N columns, which have character data that I want to consolidate into a single column.  For a given row many of the columns can be blank or contain repeated/same values ( as compared to the entry in another column ). 

I know there is the function under Cols --> Utilities --> Combine Columns & this will generate a new column with all the fields side by side & delimited.  I need advise on how to do this process while maintaining uniqueness ( i.e. only append the values for that column if there are no other columns w/in that row which have the same value )

 

Ideally I'd like to get guidance on how to script this in JSL. However, clever ( yet more tedious ) approaches w/in JMP data tables/functions are welcome also!

 

Thanks.

3 ACCEPTED SOLUTIONS

Accepted Solutions
txnelson
Super User

Re: Combining entries of N columns into single column based on unique entries

Here is one way to solve this question.  I have created a formula that combines the unique values from a list of columns.  I have attached a sample data table with the formula applied to the last column

Combined Unique Values.PNG

Below is the formula used

VarList = {"One", "Two", "Three", "Four"};
DataList = {};
CountList = {};
DataValue = "";
For( i = 1, i <= N Items( VarList ), i++,
	Insert Into( DataList, As Column( VarList[i] ) );
	Insert Into( CountList, 0 );
);
For( i = 1, i <= N Items( VarList ) - 1, i++,
	For( k = i + 1, k <= N Items( VarList ), k++,
		If( DataList[i] == DataList[k],
			CountList[k] = CountList[k] + 1
		)
	)
);
For( i = 1, i <= N Items( VarList ), i++,
	If( CountList[i] == 0 & DataList[i] != "",
		DataValue = DataValue || "," || DataList[i]
	)
);
DataValue = Substr( DataValue, 2 );
Jim

View solution in original post

ms
Super User (Alumni) ms
Super User (Alumni)

Re: Combining entries of N columns into single column based on unique entries

Here's an alternative to Jim's formula. The difference is that the unique values of the combined string are sorted alphabetically rather than in order of occurence.

The number of columns can generalized by setting the local parameter N.

 

dt = Current Data Table();
dt << New Column("Combined",
    Character,
    Formula(
        Local({dt = Current Data Table(), N = 4},
            Concat Items(Associative Array(dt[Row(), 1 :: N]) << get keys, ",")
        )
    )
);

(It uses data table matrix notation, and will thus only in JMP 13 or later) 

View solution in original post

Re: Combining entries of N columns into single column based on unique entries

I prefer the previous answers, but you can also do this via point-and-click, which is attractive to those who don't script (Apparently, there exist people who don't script!!)

 

As with the associative array approach, the resulting column has the values sorted least to greatest.

 

1) Select the original columns, and using the combine columns utility, create a multiple response column.

2) Select the new column and apply the text to columns utility, being sure to check the check the "make indicators" box.

3) After selecting the 10 new indicator columns, apply the combine columns utility, being sure to check the "columns are indicator columns" box.

 

 

View solution in original post

6 REPLIES 6
txnelson
Super User

Re: Combining entries of N columns into single column based on unique entries

Here is one way to solve this question.  I have created a formula that combines the unique values from a list of columns.  I have attached a sample data table with the formula applied to the last column

Combined Unique Values.PNG

Below is the formula used

VarList = {"One", "Two", "Three", "Four"};
DataList = {};
CountList = {};
DataValue = "";
For( i = 1, i <= N Items( VarList ), i++,
	Insert Into( DataList, As Column( VarList[i] ) );
	Insert Into( CountList, 0 );
);
For( i = 1, i <= N Items( VarList ) - 1, i++,
	For( k = i + 1, k <= N Items( VarList ), k++,
		If( DataList[i] == DataList[k],
			CountList[k] = CountList[k] + 1
		)
	)
);
For( i = 1, i <= N Items( VarList ), i++,
	If( CountList[i] == 0 & DataList[i] != "",
		DataValue = DataValue || "," || DataList[i]
	)
);
DataValue = Substr( DataValue, 2 );
Jim
mpanggabean
Level II

Re: Combining entries of N columns into single column based on unique entries

Thank you Txnelson for the quick reply & solution!

This works fine, however, my column names are somewhat long & I have many so prefer not to have to manually specify the list.

 

I could get the column names using a simple dt << Get Column Names and store it.

 

I also really liked how super user "ms" arranged the entries in alphabetical order which reduces the # of permutations ( though I didn't ask for it )

 

Sorry for the delayed response.

ms
Super User (Alumni) ms
Super User (Alumni)

Re: Combining entries of N columns into single column based on unique entries

Here's an alternative to Jim's formula. The difference is that the unique values of the combined string are sorted alphabetically rather than in order of occurence.

The number of columns can generalized by setting the local parameter N.

 

dt = Current Data Table();
dt << New Column("Combined",
    Character,
    Formula(
        Local({dt = Current Data Table(), N = 4},
            Concat Items(Associative Array(dt[Row(), 1 :: N]) << get keys, ",")
        )
    )
);

(It uses data table matrix notation, and will thus only in JMP 13 or later) 

mpanggabean
Level II

Re: Combining entries of N columns into single column based on unique entries

Thanks a lot Ms!!

I like the flexibility of this solution including being able to specify the start and end column # to be used.  I really like that you also made the sorting in alphabetical order to reduce the # of permutations!

 

Sorry for the delayed response. I initially thought there was a corner case problem where it was not working but after revisiting it I don't see an issue.

Re: Combining entries of N columns into single column based on unique entries

I prefer the previous answers, but you can also do this via point-and-click, which is attractive to those who don't script (Apparently, there exist people who don't script!!)

 

As with the associative array approach, the resulting column has the values sorted least to greatest.

 

1) Select the original columns, and using the combine columns utility, create a multiple response column.

2) Select the new column and apply the text to columns utility, being sure to check the check the "make indicators" box.

3) After selecting the 10 new indicator columns, apply the combine columns utility, being sure to check the "columns are indicator columns" box.

 

 

mpanggabean
Level II

Re: Combining entries of N columns into single column based on unique entries

Thanks Brady_Brady!!
Works for me too!!