cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
JMP is taking Discovery online, April 16 and 18. Register today and join us for interactive sessions featuring popular presentation topics, networking, and discussions with the experts.
Choose Language Hide Translation Bar
matt7109
Level III

Sum specific values in certain rows/columns

Hi there,

I'm working with an oddly layed out table of data. If you see the table below (not actual data, but a resemblence), I need to find the sum of all the red numbers, and then the sum of all of the green numbers. Preferably, an additional column woul be created and each row for Bob would read the red sum, and each row for Bill would read the green sum.   Could anyone give me an overview of how I could go about doing this? 

NameAgeTest 1Test 2Test 3
Bob1102234
Bob1102234
Bob2122436
Bob2122436
Bob3142638
Bob3142638
Bill1162840
Bill1162840
Bill2183042
Bill2183042
Bill3203244
Bill3203244

Thanks in advance! 

1 ACCEPTED SOLUTION

Accepted Solutions
ian_jmp
Staff

Re: Sum specific values in certain rows/columns

I'm sorry, but in my haste I didn't spot the fact that the rows are genuine duplicates (as you might have seen from my example). I also didn't appreciate that you want to sum over columns.

If you can rely on this strict duplication, then you could do:

NamesDefaultToHere(1);
// Make some test data: Note that pairs of rows are duplicates!
dt = New Table( "Test Data",
				New Column( "Group", Numeric, "Nominal", Values([1,1,1,1,1,1,2,2,2,2,2,2])),
				New Column( "Test 1", Numeric, "Ordinal", Values([2,2,3,3,4,4,5,5,6,6,7,7])),
				New Column( "Test 2", Numeric, "Ordinal", Values([1,1,2,2,3,3,4,4,5,5,6,6]))
			);

// List of columns for Summary
colList = {:Test 1, :Test 2};
// Summary
dt2 = dt << Summary(Group( :Group ), Sum( Eval(colList) ), StatisticsColumnNameFormat("column"));
// Sum over the columns in the summary, but also divide by 2 to account for the duplicate rows in dt
// (1) Template exprtession
CMD = Expr(dt2 << New Column( "Required Sum", Numeric, "Continuous", Formula( Sum(colsTBD) / 2 )));
// (2) Substitute in the columns we have
SubstituteInto(CMD, Expr(colsTBD), Eval(colList));
// (3) Use the expression we have built
CMD;

 

View solution in original post

7 REPLIES 7
ian_jmp
Staff

Re: Sum specific values in certain rows/columns

Do you need the sums for every second row (starting with the first), within each group defined by the values in 'Name'?

matt7109
Level III

Re: Sum specific values in certain rows/columns

Yes! That's exaclty what I need.

ian_jmp
Staff

Re: Sum specific values in certain rows/columns

In the simple case of an even number of rows in each group, you could do something like this:

NamesDefaultToHere(1);
// Make some test data
dt = New Table( "Test Data",
				Add Rows( 20 ),
				New Column( "Group", Numeric, "Nominal", Values(VConcat(J(10, 1, 1), J(10, 1, 2)))),
				New Column( "Test 1", Numeric, "Ordinal", Values(J(20, 1, RandomInteger(1,30)))),
				New Column( "Test 2", Numeric, "Ordinal", Values(J(20, 1, RandomInteger(1,100))))
			);
// Add a 'Subgroup' column
dt << NewColumn("Subgroup", Numeric, Nominal, Values(Repeat((1::2)`, 10)));
// Make a summary table
dt2 = dt  << Summary(Group( :Group, :Subgroup ), Sum( :Test 1 ), Sum( :Test 2 ));
dt2 << selectWhere(:Subgroup == 2);
dt2 << hideAndExclude;
dt2 << clearSelect;
dale_lehman
Level VII

Re: Sum specific values in certain rows/columns

Couldn't yoou use Analyze - Tabulate?  Put Name follwed by Age in the rows and then use the Average of Test 1 (and columns for Test 2 and Test 3 as well) - the average is to avoid double counting the duplicate values.  Then make that into a data table.  From there, you are almost where you want to be.

matt7109
Level III

Re: Sum specific values in certain rows/columns

Thanks,

However, say I wanted to simply add all the values in a specific row for a range of columns.I found this snip of code below that I would think adds all cells in a row between columns number TR and TD. However, it simply returns the value of cell TD. Why is this and how can I make it work?

 

TotRow = dt << Get Column Names(numeric);
TR = N Items(TotRow);
TD = N Items(colnamesSix);
dt << New Column("Total", Numeric);
For Each Row( :Total[] = Sum( Column(TotRow[TD :: TR])[]) );

ian_jmp
Staff

Re: Sum specific values in certain rows/columns

I'm sorry, but in my haste I didn't spot the fact that the rows are genuine duplicates (as you might have seen from my example). I also didn't appreciate that you want to sum over columns.

If you can rely on this strict duplication, then you could do:

NamesDefaultToHere(1);
// Make some test data: Note that pairs of rows are duplicates!
dt = New Table( "Test Data",
				New Column( "Group", Numeric, "Nominal", Values([1,1,1,1,1,1,2,2,2,2,2,2])),
				New Column( "Test 1", Numeric, "Ordinal", Values([2,2,3,3,4,4,5,5,6,6,7,7])),
				New Column( "Test 2", Numeric, "Ordinal", Values([1,1,2,2,3,3,4,4,5,5,6,6]))
			);

// List of columns for Summary
colList = {:Test 1, :Test 2};
// Summary
dt2 = dt << Summary(Group( :Group ), Sum( Eval(colList) ), StatisticsColumnNameFormat("column"));
// Sum over the columns in the summary, but also divide by 2 to account for the duplicate rows in dt
// (1) Template exprtession
CMD = Expr(dt2 << New Column( "Required Sum", Numeric, "Continuous", Formula( Sum(colsTBD) / 2 )));
// (2) Substitute in the columns we have
SubstituteInto(CMD, Expr(colsTBD), Eval(colList));
// (3) Use the expression we have built
CMD;

 

matt7109
Level III

Re: Sum specific values in certain rows/columns

Very helpful! Thank you!!