cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
CharlesIII
Level II

Creating a subset of data based on a list of columns

Hello,

I have a large data file with many columns of data. I would like to create an automated way to subset the raw data based on a list of columns. 

 

If this table is my raw data 

12345678910
##########
##########

and my list of columns that I would like to subset may be [1,4,6,9]

I want to generate a new table that looks like this

1469
####
####

 

The reason I can't just hardcode what columns to take is the list will is updated every so often so I would like a more hands-free way of dealing with this. 

 

If I need to provide more information let me know. 

 

Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
cwillden
Super User (Alumni)

Re: Creating a subset of data based on a list of columns

This is probably really easy, but I'm confused about your list.  You say you can't hardcode in which columns to use because it will change, but doesn't someone need to ultimately enumerate the columns to use when the list gets updated?  Otherwise, where does that list come from? If you have a variable right at the top of the script that enumerates all the columns to use, is that sufficiently easy to maintain?  If so, here's a simple script that does that:

//list of columns
subset_cols = {:Name("1"), :Name("4"), :Name("6"), :Name("9")};

dt = Current Data Table();
dt << Subset(	All rows, columns( subset_cols ) )
-- Cameron Willden

View solution in original post

5 REPLIES 5
cwillden
Super User (Alumni)

Re: Creating a subset of data based on a list of columns

This is probably really easy, but I'm confused about your list.  You say you can't hardcode in which columns to use because it will change, but doesn't someone need to ultimately enumerate the columns to use when the list gets updated?  Otherwise, where does that list come from? If you have a variable right at the top of the script that enumerates all the columns to use, is that sufficiently easy to maintain?  If so, here's a simple script that does that:

//list of columns
subset_cols = {:Name("1"), :Name("4"), :Name("6"), :Name("9")};

dt = Current Data Table();
dt << Subset(	All rows, columns( subset_cols ) )
-- Cameron Willden
Niros
Level II

Re: Creating a subset of data based on a list of columns

Hello,

This doesn't work for me when I use:

SubTable3[i] = Tables[i] << subset( selected rows(1), columns(Column(MT_Col_Names[1]), Column(MT_Col_Names[2]), Column(MT_Col_Names[3]), Column(MT_Col_Names[4]), Column(MT_Col_Names[5]), Column(MT_Col_Names[6])), Invisible);

The error is:

 

Niros_0-1640978562769.png

And if I change 'columns' to 'As columns', It works (no error), but all columns show up...

 

Thanks for any help,

Nir

cwillden
Super User (Alumni)

Re: Creating a subset of data based on a list of columns

I don't think its necessary for you to enumerate all the items you have in MT_Col_Names or to wrap those in Column().  What type of data is inside "MT_Col_Names"?

 

Here's a working example with a list of strings representing the names of the columns I want to keep:

dt = Open("$SAMPLE_DATA/Big Class.jmp");
dt << Select Rows({1,2,3});

sub_cols = {"name", "age", "sex"}; //same role as MT_Col_Names

dt2 = dt << Subset(selected rows(1), columns(sub_cols));
-- Cameron Willden
jj_jmp
Level II

Re: Creating a subset of data based on a list of columns

Maybe try 

 

eval(evalexpr((

SubTable3[i] = Tables[i] << subset( selected rows(1), columns(
As Column(expr(MT_Col_Names[1])), As Column(expr(MT_Col_Names[2])) ), Invisible);

));

))

jj_jmp
Level II

Re: Creating a subset of data based on a list of columns

eval(evalexpr(

SubTable3[i] = Tables[i] << subset( selected rows(1), columns(
As Column(expr(MT_Col_Names[1])), As Column(expr(MT_Col_Names[2])) ), Invisible);

));