- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
JSL delete the entire column based on header and all rows with specific value
Hi,
I want to delete all columns that contain "apple" and delete all rows (regardless of columns) with specific value -9999. I had start the code to delete column but I'm not getting it.
Thanks.
dt = Open("file.csv");
//dt = Open( "$SAMPLE_DATA/Cities.jmp" );
col_list = dt << get column names( string );
For( i = N Items( col_list ), i > 0, i--,
If(Contains( col_list[i], "apple" ),
Remove From( col_list, i ),
)
);
//dt << Select Where( );
//dt << Delete Rows;
dt << save( "testOUT.txt" );
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: JSL delete the entire column based on header and all rows with specific value
Here is a script that will do what you want. A couple of points about the script:
1. The Open() function has the "|" added as a delimiter in addition to the ",". This was easily created by using the "Data Preview" when opening the .csv file. It allowed for the selection of additional characters to be used as delimiters. Then all that had to be done, is to open the file, and then go to the Source entry in the Tables Panel, and Edit it and copy out the Source Code that JMP created to open the file.
2. The Deletion of the columns containing Apple had to be generalized, since in one column it was spelled Apple and in another column it was spelled apple.
3. I really don't think you should be deleting a complete row, based upon finding an occurrence of -9999. I really think you should be just setting the value -9999 to a missing value. I included a piece of code that can handle your data that way, if you want.
names default to Here(1);
dt=Open(
"file.csv",
columns(
New Column( "a", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "b", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "c", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "d", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "e", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "g", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "apple", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "orange", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "candy", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "pizza", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "Apple2_2", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "orange2", Numeric, "Continuous", Format( "Best", 12 ) )
),
Import Settings(
End Of Line( CRLF, CR, LF ),
End Of Field( Comma, Other( "|" ), CSV( 0 ) ),
Strip Quotes( 1 ),
Use Apostrophe as Quotation Mark( 0 ),
Use Regional Settings( 0 ),
Scan Whole File( 1 ),
Treat empty columns as numeric( 0 ),
CompressNumericColumns( 0 ),
CompressCharacterColumns( 0 ),
CompressAllowListCheck( 0 ),
Labels( 1 ),
Column Names Start( 1 ),
Data Starts( 2 ),
Lines To Read( "All" ),
Year Rule( "20xx" )
)
);
// Create a list to place columns to be deleted into
delList = {};
For( i = 1, i<= N Cols(dt), i++,
If( Contains( Uppercase( as Column( i ) << get name ), "APPLE" ) > 0,
insert into(delList, column(i)<<get name)
)
);
dt << delete columns( delList);
// Loop through the data and delete any rows that have
// the value -9999 in any of the columns
For( i = 1, i <= N Cols( dt ), i++,
If( Column( dt, i ) << get datatype == "Numeric",
dt << select where( As Column( dt, i ) == -9999 );
If( N Rows(dt << get selected rows) > 0,
dt << delete rows
);
)
);
/*// Substitute the code below for the last For Loop in the above
// script, if what you want to do is to just change the -9999
// values to missing values
For( i = 1, i <= N Cols( dt ), i++,
If( Column( dt, i ) << get datatype == "Numeric",
dt << select where( As Column( dt, i ) == -9999 );
If( N Rows(dt << get selected rows) > 0,
column(dt,i)[dt << get selected rows] = .;
);
)
);*/
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: JSL delete the entire column based on header and all rows with specific value
It is a very simple change to the script. All of the special options for the inputting of the data, can be eliminated. The Open() function will just use the defaults.
Secondly, all that has to be done to the row deletions, is to search for a string value of "-9999" when the input column is determined to be character in nature.
dt=Open(
"file (3).csv");
delList = {};
If( Contains( Uppercase( as Column( i ) << get name ), "APPLE" ) > 0,
insert into(delList, column(i)<<get name)
)
);
// the value -9999 in any of the columns
For( i = 1, i <= N Cols( dt ), i++,
If( Column( dt, i ) << get datatype == "Numeric",
dt << select where( As Column( dt, i ) == -9999 ),
dt << select where( Contains( As Column( dt, i ), "-9999" ) == 1 |
Contains( As Column( dt, i ), "99" ) == 1
)
);
If( N Rows( dt << get selected rows ) > 0,
dt << delete rows
);
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: JSL delete the entire column based on header and all rows with specific value
-9999|123,1,1,1,1,1,2,30,40,1,10,99
1,1,1,1,1,1,2,30,40,1,10,99
1,1,1,1,1,1,2,30,40,1,10,99
1,1,1,1|87,1,-9999,2,30,40,1,10,99
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: JSL delete the entire column based on header and all rows with specific value
Here is a script that will do what you want. A couple of points about the script:
1. The Open() function has the "|" added as a delimiter in addition to the ",". This was easily created by using the "Data Preview" when opening the .csv file. It allowed for the selection of additional characters to be used as delimiters. Then all that had to be done, is to open the file, and then go to the Source entry in the Tables Panel, and Edit it and copy out the Source Code that JMP created to open the file.
2. The Deletion of the columns containing Apple had to be generalized, since in one column it was spelled Apple and in another column it was spelled apple.
3. I really don't think you should be deleting a complete row, based upon finding an occurrence of -9999. I really think you should be just setting the value -9999 to a missing value. I included a piece of code that can handle your data that way, if you want.
names default to Here(1);
dt=Open(
"file.csv",
columns(
New Column( "a", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "b", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "c", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "d", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "e", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "g", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "apple", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "orange", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "candy", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "pizza", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "Apple2_2", Numeric, "Continuous", Format( "Best", 12 ) ),
New Column( "orange2", Numeric, "Continuous", Format( "Best", 12 ) )
),
Import Settings(
End Of Line( CRLF, CR, LF ),
End Of Field( Comma, Other( "|" ), CSV( 0 ) ),
Strip Quotes( 1 ),
Use Apostrophe as Quotation Mark( 0 ),
Use Regional Settings( 0 ),
Scan Whole File( 1 ),
Treat empty columns as numeric( 0 ),
CompressNumericColumns( 0 ),
CompressCharacterColumns( 0 ),
CompressAllowListCheck( 0 ),
Labels( 1 ),
Column Names Start( 1 ),
Data Starts( 2 ),
Lines To Read( "All" ),
Year Rule( "20xx" )
)
);
// Create a list to place columns to be deleted into
delList = {};
For( i = 1, i<= N Cols(dt), i++,
If( Contains( Uppercase( as Column( i ) << get name ), "APPLE" ) > 0,
insert into(delList, column(i)<<get name)
)
);
dt << delete columns( delList);
// Loop through the data and delete any rows that have
// the value -9999 in any of the columns
For( i = 1, i <= N Cols( dt ), i++,
If( Column( dt, i ) << get datatype == "Numeric",
dt << select where( As Column( dt, i ) == -9999 );
If( N Rows(dt << get selected rows) > 0,
dt << delete rows
);
)
);
/*// Substitute the code below for the last For Loop in the above
// script, if what you want to do is to just change the -9999
// values to missing values
For( i = 1, i <= N Cols( dt ), i++,
If( Column( dt, i ) << get datatype == "Numeric",
dt << select where( As Column( dt, i ) == -9999 );
If( N Rows(dt << get selected rows) > 0,
column(dt,i)[dt << get selected rows] = .;
);
)
);*/
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: JSL delete the entire column based on header and all rows with specific value
Jim,
The code works.
When I use the "|" pipe. It will create extra column(s).
I do not want any extra my column header in this case, it will be "12" columns total.
But in each column, I can have several pipe data. I only care the first value and ignore the rest.
Example:
"99|23|abc|x", I only care for "99" and ignore the rest.
"-9999|123", I only care for "-9999" and that row will get delete it later.
1 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 -9999|999|231 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 1|xyz|abc 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 -9999 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 99|23|abc|x 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 -9999|123 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1 1 1 2 30 40 1 10 99 1 1 1 1|87 1 -9999 2 30 40 1 10 99
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: JSL delete the entire column based on header and all rows with specific value
It is a very simple change to the script. All of the special options for the inputting of the data, can be eliminated. The Open() function will just use the defaults.
Secondly, all that has to be done to the row deletions, is to search for a string value of "-9999" when the input column is determined to be character in nature.
dt=Open(
"file (3).csv");
delList = {};
If( Contains( Uppercase( as Column( i ) << get name ), "APPLE" ) > 0,
insert into(delList, column(i)<<get name)
)
);
// the value -9999 in any of the columns
For( i = 1, i <= N Cols( dt ), i++,
If( Column( dt, i ) << get datatype == "Numeric",
dt << select where( As Column( dt, i ) == -9999 ),
dt << select where( Contains( As Column( dt, i ), "-9999" ) == 1 |
Contains( As Column( dt, i ), "99" ) == 1
)
);
If( N Rows( dt << get selected rows ) > 0,
dt << delete rows
);
);