cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
txwrangler
Level I

how to assign the same number of rows to each case based on years

I have a vertical data set with uneven number of years' data (body weight, for example) for each case. Some have years 1,2,3,4,6 and some have 3,4,5. I want each case to have 6 rows with the year column containing 1,2,3,4,5,6 even though some of those years will not have a body weight. How can I assign each case 6 rows that recognize the appropriate missing year?

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: how to assign the same number of rows to each case based on years

Assuming you have a good deal of data, a simple script can be used to add in the missing years.  Below is an example that creates an original table with some missing years, and then from that point performs a couple of steps that ends up with the missing years being added in

names default to here(1);

dt = New Table( "Base",
	Add Rows( 10 ),
	New Column( "Case",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [1, 1, 1, 1, 1, 2, 2, 2, 2, 2] ),
		Set Display Width( 43 )
	),
	New Column( "Year",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [1, 2, 3, 5, 6, 1, 2, 4, 5, 6] )
	),
	New Column( "OtherStuff", formula(randomuniform()))
);

// Summarize original data table to get a data table of the cases
dtCase = dt << Summary(
	Group( :Case ),
	Freq( "None" ),
	Weight( "None" ),
	output table name( "Cases" )
);
dtCase << delete columns(:N Rows);

// Create a new table with just the years
dtYears = new table("years", new column("Year", values([1,2,3,4,5,6])));

// Perform a cartesian join to create a table with all combinations of cases and years
dtCY = Data Table( "Cases" ) << Join(
	With( Data Table( "years" ) ),
	Cartesian Join,
	Output Table( "Case and Years" )
);

// Use a join on the original table to add in the misssing years
dt << Join(
	With( dtCY ),
	Merge Same Name Columns,
	Match Flag( 0 ),
	By Matching Columns( :Case = :Case, :Year = :Year ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 1 ),
	Preserve main table order( 1 ),
	Output Table( "Final" )
);
Jim

View solution in original post

2 REPLIES 2
txnelson
Super User

Re: how to assign the same number of rows to each case based on years

Assuming you have a good deal of data, a simple script can be used to add in the missing years.  Below is an example that creates an original table with some missing years, and then from that point performs a couple of steps that ends up with the missing years being added in

names default to here(1);

dt = New Table( "Base",
	Add Rows( 10 ),
	New Column( "Case",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [1, 1, 1, 1, 1, 2, 2, 2, 2, 2] ),
		Set Display Width( 43 )
	),
	New Column( "Year",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [1, 2, 3, 5, 6, 1, 2, 4, 5, 6] )
	),
	New Column( "OtherStuff", formula(randomuniform()))
);

// Summarize original data table to get a data table of the cases
dtCase = dt << Summary(
	Group( :Case ),
	Freq( "None" ),
	Weight( "None" ),
	output table name( "Cases" )
);
dtCase << delete columns(:N Rows);

// Create a new table with just the years
dtYears = new table("years", new column("Year", values([1,2,3,4,5,6])));

// Perform a cartesian join to create a table with all combinations of cases and years
dtCY = Data Table( "Cases" ) << Join(
	With( Data Table( "years" ) ),
	Cartesian Join,
	Output Table( "Case and Years" )
);

// Use a join on the original table to add in the misssing years
dt << Join(
	With( dtCY ),
	Merge Same Name Columns,
	Match Flag( 0 ),
	By Matching Columns( :Case = :Case, :Year = :Year ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 1 ),
	Preserve main table order( 1 ),
	Output Table( "Final" )
);
Jim
txwrangler
Level I

Re: how to assign the same number of rows to each case based on years

This works! I had to assign to my original table to get the cases but the overall method works fine. Thanks much!