Subscribe Bookmark RSS Feed

Sum specific values in certain rows/columns

matt7109

Contributor

Joined:

Jun 9, 2017

Hi there,

I'm working with an oddly layed out table of data. If you see the table below (not actual data, but a resemblence), I need to find the sum of all the red numbers, and then the sum of all of the green numbers. Preferably, an additional column woul be created and each row for Bob would read the red sum, and each row for Bill would read the green sum.   Could anyone give me an overview of how I could go about doing this? 

NameAgeTest 1Test 2Test 3
Bob1102234
Bob1102234
Bob2122436
Bob2122436
Bob3142638
Bob3142638
Bill1162840
Bill1162840
Bill2183042
Bill2183042
Bill3203244
Bill3203244

Thanks in advance! 

1 ACCEPTED SOLUTION

Accepted Solutions
ian_jmp

Staff

Joined:

Jun 23, 2011

Solution

I'm sorry, but in my haste I didn't spot the fact that the rows are genuine duplicates (as you might have seen from my example). I also didn't appreciate that you want to sum over columns.

If you can rely on this strict duplication, then you could do:

NamesDefaultToHere(1);
// Make some test data: Note that pairs of rows are duplicates!
dt = New Table( "Test Data",
				New Column( "Group", Numeric, "Nominal", Values([1,1,1,1,1,1,2,2,2,2,2,2])),
				New Column( "Test 1", Numeric, "Ordinal", Values([2,2,3,3,4,4,5,5,6,6,7,7])),
				New Column( "Test 2", Numeric, "Ordinal", Values([1,1,2,2,3,3,4,4,5,5,6,6]))
			);

// List of columns for Summary
colList = {:Test 1, :Test 2};
// Summary
dt2 = dt << Summary(Group( :Group ), Sum( Eval(colList) ), StatisticsColumnNameFormat("column"));
// Sum over the columns in the summary, but also divide by 2 to account for the duplicate rows in dt
// (1) Template exprtession
CMD = Expr(dt2 << New Column( "Required Sum", Numeric, "Continuous", Formula( Sum(colsTBD) / 2 )));
// (2) Substitute in the columns we have
SubstituteInto(CMD, Expr(colsTBD), Eval(colList));
// (3) Use the expression we have built
CMD;

 

7 REPLIES
ian_jmp

Staff

Joined:

Jun 23, 2011

Do you need the sums for every second row (starting with the first), within each group defined by the values in 'Name'?

matt7109

Contributor

Joined:

Jun 9, 2017

Yes! That's exaclty what I need.

ian_jmp

Staff

Joined:

Jun 23, 2011

In the simple case of an even number of rows in each group, you could do something like this:

NamesDefaultToHere(1);
// Make some test data
dt = New Table( "Test Data",
				Add Rows( 20 ),
				New Column( "Group", Numeric, "Nominal", Values(VConcat(J(10, 1, 1), J(10, 1, 2)))),
				New Column( "Test 1", Numeric, "Ordinal", Values(J(20, 1, RandomInteger(1,30)))),
				New Column( "Test 2", Numeric, "Ordinal", Values(J(20, 1, RandomInteger(1,100))))
			);
// Add a 'Subgroup' column
dt << NewColumn("Subgroup", Numeric, Nominal, Values(Repeat((1::2)`, 10)));
// Make a summary table
dt2 = dt  << Summary(Group( :Group, :Subgroup ), Sum( :Test 1 ), Sum( :Test 2 ));
dt2 << selectWhere(:Subgroup == 2);
dt2 << hideAndExclude;
dt2 << clearSelect;
dale_lehman

Community Trekker

Joined:

Jan 29, 2015

Couldn't yoou use Analyze - Tabulate?  Put Name follwed by Age in the rows and then use the Average of Test 1 (and columns for Test 2 and Test 3 as well) - the average is to avoid double counting the duplicate values.  Then make that into a data table.  From there, you are almost where you want to be.

matt7109

Contributor

Joined:

Jun 9, 2017

Thanks,

However, say I wanted to simply add all the values in a specific row for a range of columns.I found this snip of code below that I would think adds all cells in a row between columns number TR and TD. However, it simply returns the value of cell TD. Why is this and how can I make it work?

 

TotRow = dt << Get Column Names(numeric);
TR = N Items(TotRow);
TD = N Items(colnamesSix);
dt << New Column("Total", Numeric);
For Each Row( :Total[] = Sum( Column(TotRow[TD :: TR])[]) );

ian_jmp

Staff

Joined:

Jun 23, 2011

Solution

I'm sorry, but in my haste I didn't spot the fact that the rows are genuine duplicates (as you might have seen from my example). I also didn't appreciate that you want to sum over columns.

If you can rely on this strict duplication, then you could do:

NamesDefaultToHere(1);
// Make some test data: Note that pairs of rows are duplicates!
dt = New Table( "Test Data",
				New Column( "Group", Numeric, "Nominal", Values([1,1,1,1,1,1,2,2,2,2,2,2])),
				New Column( "Test 1", Numeric, "Ordinal", Values([2,2,3,3,4,4,5,5,6,6,7,7])),
				New Column( "Test 2", Numeric, "Ordinal", Values([1,1,2,2,3,3,4,4,5,5,6,6]))
			);

// List of columns for Summary
colList = {:Test 1, :Test 2};
// Summary
dt2 = dt << Summary(Group( :Group ), Sum( Eval(colList) ), StatisticsColumnNameFormat("column"));
// Sum over the columns in the summary, but also divide by 2 to account for the duplicate rows in dt
// (1) Template exprtession
CMD = Expr(dt2 << New Column( "Required Sum", Numeric, "Continuous", Formula( Sum(colsTBD) / 2 )));
// (2) Substitute in the columns we have
SubstituteInto(CMD, Expr(colsTBD), Eval(colList));
// (3) Use the expression we have built
CMD;

 

matt7109

Contributor

Joined:

Jun 9, 2017

Very helpful! Thank you!!