cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Check out the JMP® Marketplace featured Capability Explorer add-in
Choose Language Hide Translation Bar
KKlear
Level I

How to subset a data table based on column value

Hello!  I have a large JMP file which contains the stability testing results for many (>75 product lots).  I want to create a script to subset the rows of testing data based on the column value (i.e. the lot number) while only including certain test result columns from the file. I am fairly new to JMP and JMP scripting, but I have managed to put together the following script:

 

Data Table( "Stability Trending" ) <<
Subset(
	Output Table( "Data Subset - Lot 12345" ),
	Copy formula( 0 ),
	Selected Rows( 0 ),
	Rows(
		[1314, 1315, 1316, 1317, 1318, 1319, 1320, 1321, 1322, 1323, 1324, 1325,
		1326, 1327, 1328, 1329]
	),
	columns(
		:Protocol No.,
		:Lot No.,
		:Final Fill Configuration,
		:Name( "Storage Temperature (°C)" ),
		:Time Point,
		:Time Interval,
		:Sterility Result,
		:Name( "Test 1 Result" ),
		:Name( "Test 2 Result" ),
		:pH Result,
		:Name( "Test 3 Result" ),
		:Name( "Test 4 Result" )
	)
)

Right now, the script is selecting the rows based on the row numbers where the data for a particular lot is found.  This is not optimal as rows are sometimes inserted within the table as stability studies are modified.  How would I modify the script above to subset rows of "Lot 12345" based on the "Lot No." column rather than the row number?  I am using JMP 15 (in case that is important).

 

1 ACCEPTED SOLUTION

Accepted Solutions
KKlear
Level I

Re: How to subset a data table based on column value

Thanks for the suggestions everyone.  Here is what I ended up doing:

dt = Data Table( "Stability Trending" );

LN={"Lot 12345"};

dt<<select where(contains(LN, Lot No.));

dt<<Subset(
	Output Table( "Data Subset - Lot 12345" ),
	Copy formula( 0 ),
	Selected Rows( 1 ),
	Columns(
		:Protocol No.,
		:Lot No.,
		:Final Fill Configuration,
		:Name( "Storage Temperature (°C)" ),
		:Time Point,
		:Time Interval,
		:Sterility Result,
		:Name( "Test 1 Result)" ),
		:Name( "Test 2 Result" ),
		:pH Result,
		:Name( "Test 3 Result" ),
		:Name( "Test 4 Result)" )
	)
)

The resulting data table is giving me the output I was looking for - only the rows containing the lot number of interest and the test results from the columns I specified.

 

 

View solution in original post

4 REPLIES 4
Georg
Level VII

Re: How to subset a data table based on column value

Perhaps that example will work for you, BR

 

Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Semiconductor Capability.jmp" );

dt_lst = dt << Subset(
	By( :lot_id ),
	All rows,
	columns( :wafer, :Wafer ID in lot ID, :SITE, :NPN1, :PNP1, :PNP2, :NPN2, :PNP3 )
);

Wait( 1 );

For( i = 1, i <= N Items( dt_lst ), i++,
	Close( dt_lst[i], nosave );
	wait(0.2);
);
Georg
cweisbart
Staff

Re: How to subset a data table based on column value

I would recommend using Select Where function (found under Rows->Rows Selection). Something similar to this should work:

Data Table( "Stability Trending" ) << Row Selection(
	Select where( :Lot No. == "Lot 12345" ),
	Dialog( Edit( Equal( Source Column( :Lot No. ) ) ), Keep dialog open( 0 ) )
);
Subset( (Selected Rows), Output Table Name( "Subset-Lot 12345" ) );
Clovis Weisbart
cweisbart
Staff

Re: How to subset a data table based on column value

Made a typo (forgot to replace one of my column names with the column name "Lot No." in your example):

Data Table( "Stability Trending" ) << Row Selection(
	Select where( :Lot No. == "Lot 12345" ),
	Dialog( Edit( Equal( Source Column( :Lot No. ) ) ), Keep dialog open( 0 ) )
);
Subset( (Selected Rows), Output Table Name( "Subset-Lot 12345" ) );
Clovis Weisbart
KKlear
Level I

Re: How to subset a data table based on column value

Thanks for the suggestions everyone.  Here is what I ended up doing:

dt = Data Table( "Stability Trending" );

LN={"Lot 12345"};

dt<<select where(contains(LN, Lot No.));

dt<<Subset(
	Output Table( "Data Subset - Lot 12345" ),
	Copy formula( 0 ),
	Selected Rows( 1 ),
	Columns(
		:Protocol No.,
		:Lot No.,
		:Final Fill Configuration,
		:Name( "Storage Temperature (°C)" ),
		:Time Point,
		:Time Interval,
		:Sterility Result,
		:Name( "Test 1 Result)" ),
		:Name( "Test 2 Result" ),
		:pH Result,
		:Name( "Test 3 Result" ),
		:Name( "Test 4 Result)" )
	)
)

The resulting data table is giving me the output I was looking for - only the rows containing the lot number of interest and the test results from the columns I specified.