cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Browse apps to extend the software in the new JMP Marketplace
Choose Language Hide Translation Bar
joshua
Level III

Deleting multiple columns based on strings in their name

Hi,

I need to delete multiple columns based the strings they contain but sometimes I want to keep some others even if they don't contain those strings.

So by following this post's solution Deleting multiple columns based on column name 

 

dt = Open( "$sample_data/Hollywood Movies.jmp" );
col_names =  dt << Get Column Names( String );

For( i = N Items( col_names ), i > 0, i--,
	If(
		Not(
			Contains( col_names[i], "Gross")  | contains(col_names[i], "Genre"   ) | col_names[i] == "theme"),

	Remove From( col_names, i)
	)
);

So, what I want to have is remove anything not contains with "Gross" or "Genre" but if the column name is == "theme" don't remove it.

 

It seems it does not work for now ! Why?

image.png

 

The second bullet to this question

How can I see col_names as a data table so I can know what is the output without hovering on the col_names. Because sometimes I have a long list of col names and hovering is not efficient to see what is in that column list.

 

Thanks!

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Deleting multiple columns based on strings in their name

dt = Open( "$sample_data/Hollywood Movies.jmp" );
col_names =  dt << Get Column Names( String );


dt:Profitability << delete formula;

For( i = N Items( col_names ), i > 0, i--,
	If(
		Not(
			Contains( col_names[i], "Gross")  | contains(col_names[i], "Genre"   ) | col_names[i] == "theme"),

	Remove From( col_names, i)
	)
);
// At this point in the script, you have a data table 
// with all of the original columns(Profitability is now a static column)
// And you have a JSL List called "col_names", which is stored in memory
// so you can do anythin you want given this
// You could just write the contents of the list to the log.  Remember, the 
// list "col_names", only contains the names of the columns, it does not 
// contain the whole column(name, properties and data)
Print( col_names );
// or
Write( col_names );
// or
Show( col_names );
// That will give you the list of columns that are going to be deleted
// or before you want to see the columns in their entirety you can create
// a new data table with just those columns

newdt = dt << subset( all rows, columns( col_names ) );

// The statement below, is a perfectly valid statement, however, it will
// not work the way you think it will.  The end result of this statement
// is to create a new list of column names. 
// {"Genre", "Domestic Gross", "Foreign Gross", "World Gross", "Opening Wknd Gross"}
deleted_columns = delete columns( col_names );

 
Jim

View solution in original post

7 REPLIES 7
txnelson
Super User

Re: Deleting multiple columns based on strings in their name

From the JSL you provided, you have left out the most critical statement

dt << delete columns( col_names );

This is the line that actually does the deletions.  However, in this sample case, it has an additional issue.  The final list of columns 

{"Genre", "Domestic Gross", "Foreign Gross", "World Gross", "Opening Wknd Gross"}

has the column "World Gross" in it, which is used in the calculation of column "Profitability".  When the "delete columns" message is processed it detects the issue and displays in the log

Cannot delete the selected columns while some are referenced by the formulas of remaining columns.
Removing a formula leaves the data unchanged. Removing references replaces each reference with an empty value. These effects are permanent and cannot be undone.

If you delete the formula from the "Profitability" column, it will convert the values to static values, 

dt:Profitability << delete formula;

then the "delete columns" will work properly.

Concerning your issue about hovering over a variable to see it's value, and wanting to create a data table to be able to see the values, there is a much simpler way to do that.  Rather than hovering over the "col_names" variable, what you need to do, is to highlight it, and then right click and select "Run Script".  By rule, if anything in a JMP script is highlighted, the "Run Script" will only run the highlighted lines/statements/variables.  The results will be displayed in the log.

Jim
joshua
Level III

Re: Deleting multiple columns based on strings in their name

Hi Jim,

 

Thanks for your answer. Ok I see about what you saying.

 

Lets say we don't care about the formulas. Let's delete all of the formula connections and remove the columns that I wanted.

 

and I want to create (and see ) those deleted columns as a datatable

 

by following your suggestion

dt = Open( "$sample_data/Hollywood Movies.jmp" );
col_names =  dt << Get Column Names( String );


dt:Profitability << delete formula;

For( i = N Items( col_names ), i > 0, i--,
	If(
		Not(
			Contains( col_names[i], "Gross")  | contains(col_names[i], "Genre"   ) | col_names[i] == "theme"),

	Remove From( col_names, i)
	)
);

deleted_columns = delete columns( col_names );

this does not create still what I need

 

txnelson
Super User

Re: Deleting multiple columns based on strings in their name

dt = Open( "$sample_data/Hollywood Movies.jmp" );
col_names =  dt << Get Column Names( String );


dt:Profitability << delete formula;

For( i = N Items( col_names ), i > 0, i--,
	If(
		Not(
			Contains( col_names[i], "Gross")  | contains(col_names[i], "Genre"   ) | col_names[i] == "theme"),

	Remove From( col_names, i)
	)
);
// At this point in the script, you have a data table 
// with all of the original columns(Profitability is now a static column)
// And you have a JSL List called "col_names", which is stored in memory
// so you can do anythin you want given this
// You could just write the contents of the list to the log.  Remember, the 
// list "col_names", only contains the names of the columns, it does not 
// contain the whole column(name, properties and data)
Print( col_names );
// or
Write( col_names );
// or
Show( col_names );
// That will give you the list of columns that are going to be deleted
// or before you want to see the columns in their entirety you can create
// a new data table with just those columns

newdt = dt << subset( all rows, columns( col_names ) );

// The statement below, is a perfectly valid statement, however, it will
// not work the way you think it will.  The end result of this statement
// is to create a new list of column names. 
// {"Genre", "Domestic Gross", "Foreign Gross", "World Gross", "Opening Wknd Gross"}
deleted_columns = delete columns( col_names );

 
Jim
joshua
Level III

Re: Deleting multiple columns based on strings in their name

why 

col_names[i] == "theme"

 is not still working ?

 

image.png

 

txnelson
Super User

Re: Deleting multiple columns based on strings in their name

If you print out the value of "col_names" right after it is created, it shows all of the columns in the data table

{"Movie Name", "Lead Studio Name", "Rotten Tomatoes Score", "Audience Score",
"Theme", "Genre", "Theaters Opening Wknd", "BOA Opening Wknd", "Domestic Gross",
"Foreign Gross", "World Gross", "Production Budget", "Profitability",
"Opening Wknd Gross"}

If you look closely, you will see there is not an entry with the value "theme".   There is an entry with the value "Theme".  You are comparing values in character strings, and you have to properly match uppercase and lowercase.

I strongly suggest that you read the Scripting Guide!

     Help==>JMP Documentation Library         Scripting Guide

Jim
joshua
Level III

Re: Deleting multiple columns based on strings in their name

Jim,
Sorry, for misunderstanding. But most of the things that I search is not well very described or just briefly explained in the jsl pdf. Especially recursive scripting (if-else, for loop) when you want to do complex things they are not included in the manual. It's very hard to track the errors as well. I did not until you say to check the 'log'.


For example, there is no explanation about creating a new data table or how to assign a new data table.

What I was thinking as straight forward like

 

dt_new << Data Table( col_names );

 

should create a new data table but I learned that it is not straight forward in jmp. We need to ask or practice about the syntax that's I'm doing.
Thanks a lot!

txnelson
Super User

Re: Deleting multiple columns based on strings in their name

Your confusion about the data table is something I have seen before.  What is different about JMP and it's data tables, is that a data table needs to be thought of as a base element.  Much of the time, one will see a line of JSL such as:

dt = Open("$SAMPLE_DATA/big class.jmp");

In some languages, dt would be the data table.  This is not the case.  The above statement only assigns a pointer to the data table "Big Class".  And if you hover over the variable dt you will see

data table( "Big Class" )

as it's value.

Therefore, if I run the statement

xyz = dt;

I do not get a new instance of the data table Big Class, as I would in some languages, instead, all I get is variable called xyz which is a pointer to the same data table called Big Class.

To create a new copy (it is not a new instance, because once created it typically is a totally independent new data table)  one uses one of the Table platforms to complete the task

dt2 = dt << subset( all rows, all columns);

The above command creates a new data table and will set variable dt2 as a pointer to the new table.

Jim