- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
summary analysis for dynamic data
I need to summarize a complex data set.Data set contains a bunch of items that processed in last 2 weeks and how they performed.In order to gauge performance we have rating for different categories.Categories are from XZ001 to XZ100.I also have sub categories for each main category.Eg - XZ008 has YZ801 to YZ809 sub categories.
Data set is dynamic.meaning 1 week's data set might have YZ 899 but other week might not.
I have certain categories and subcategories classified into bucket1 and bucket2.All other categories would be part of bucket3.
Below is the script I attempted to write but doesnt seem to work.
Any help appreciated.
cdt = open("C:\Documents\Data\out.csv");
cdt=current data table();
sum_lst2 = {};
sum_lst3 = {};
sum_lst4 = {};
col_lst2 = cdt << get column names( Numeric );
For( i = 1, i <= N Items( col_lst2 ), i++,
colname = Column( col_lst2[i] ) << get name;
If( Contains( {"XZ008","XZ010","XZ015","XZ053","YZ1374","YZ1375","YZ9429","YZ9792","YZ9908", "YZ1930","YZ1933","YZ2744","YZ4621","YZ4625","YZ4630","YZ4640","YZ4681","YZ4830"}, colname ) |(colname >= "YZ901" & colname <= "YZ916") |(colname >= "YZ954" & colname <= "YZ965") |(colname >= "YZ1360" & colname <= "YZ1371") |(colname >= "YZ2581" & colname <= "YZ2586") | (colname >= "YZ2900" & colname <= "YZ2950") | (colname >= "YZ3400" & colname <= "YZ3450"),
Insert Into( sum_lst2, col_lst2[i] )
); //Bucket1
If( Contains( {"XZ098","XZ090"}, colname ) | (colname >= "YZ9700" & colname <= "YZ9791") | (colname >= "YZ9793" & colname <= "YZ9799"),
Insert Into( sum_lst4, col_lst2[i] )
);
);//Bucket2
If(!Contains( {"XZ008","XZ010","XZ015","XZ053","XZ098","XZ090","YZ1374","YZ1375","YZ9429","YZ9792","YZ9908", "YZ1930","YZ1933","YZ2744","YZ4621","YZ4625","YZ4630","YZ4640","YZ4681","YZ4830"}, colname ) |(colname >= "YZ901" & colname <= "YZ916") |(colname >= "YZ954" & colname <= "YZ965") |(colname >= "YZ1360" & colname <= "YZ1371") |(colname >= "YZ2581" & colname <= "YZ2586") | (colname >= "YZ2900" & colname <= "YZ2950") | (colname >= "YZ3400" & colname <= "YZ3450")| (colname >= "YZ9700" & colname <= "YZ9791") | (colname >= "YZ9793" & colname <= "YZ9799"),
Insert Into( sum_lst3, col_lst2[i] )
); //Bucket3
Eval( Eval Expr( cdt << New Column( "Bucket1", formula( Sum( Expr( sum_lst2 ) ) ) ) ) );
Eval( Eval Expr( cdt << New Column( "Bucket3", formula( Sum( Expr( sum_lst3 ) ) ) ) ) );
Eval( Eval Expr( cdt << New Column( "Bucket2", formula( Sum( Expr( sum_lst4 ) ) ) ) ) );
Sample data set:
Item | YZ0100 | YZ0801 | YZ0805 | YZ0806 | YZ0807 | YZ0813 | YZ0814 | YZ0815 | YZ0817 | YZ0871 | YZ0874 | YZ0881 | YZ1367 | YZ1374 | YZ1541 | YZ1571 | YZ1573 | YZ1575 | YZ1577 | YZ1583 | YZ1585 | YZ1590 | YZ1591 | YZ1594 | YZ1933 | YZ2701 | YZ2708 | YZ2737 | YZ2740 | YZ2744 | YZ2745 | YZ2749 | YZ2750 | YZ3404 | YZ3407 | YZ3415 | YZ3416 | YZ3417 | YZ3418 | YZ3422 | YZ3423 | YZ3492 | YZ3493 | YZ3502 | YZ3534 | YZ4621 | YZ4625 | YZ4821 | YZ9031 | YZ9429 | YZ9901 | YZ9908 | XZ001 | XZ008 | XZ013 | XZ015 | XZ019 | XZ027 | XZ034 | XZ035 | XZ046 | XZ048 | XZ090 | XZ094 | XZ099 |
AAA666 | 455 | 2 | 1 | 1 | 5 | 1 | 1 | 1 | 5 | 3 | 1 | 22 | 455 | 2 | 1 | 7 | 2 | 5 | 4 | 22 | |||||||||||||||||||||||||||||||||||||||||||||
AAA755 | 118 | 6 | 118 | 6 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
AAA757 | 324 | 1 | 1 | 7 | 1 | 33 | 3 | 1 | 3 | 1 | 1 | 3 | 1 | 1 | 1 | 8 | 324 | 2 | 7 | 37 | 6 | 3 | 1 | 1 | 1 | 8 | |||||||||||||||||||||||||||||||||||||||
BBA696 | 480 | 1 | 3 | 1 | 1 | 1 | 5 | 3 | 1 | 100 | 480 | 4 | 1 | 1 | 1 | 5 | 3 | 1 | 100 | ||||||||||||||||||||||||||||||||||||||||||||||
BBA786 | 155 | 1 | 1 | 1 | 1 | 155 | 1 | 1 | 1 | 1 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||
BBA787 | 197 | 1 | 1 | 1 | 1 | 2 | 7 | 1 | 1 | 59 | 197 | 1 | 2 | 1 | 2 | 7 | 1 | 1 | 59 | ||||||||||||||||||||||||||||||||||||||||||||||
CCA613 | 233 | 1 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 12 | 233 | 1 | 1 | 4 | 2 | 1 | 12 | ||||||||||||||||||||||||||||||||||||||||||||||||
CCA614 | 551 | 1 | 1 | 2 | 1 | 7 | 2 | 1 | 4 | 4 | 1 | 69 | 551 | 1 | 1 | 10 | 3 | 4 | 4 | 1 | 69 | ||||||||||||||||||||||||||||||||||||||||||||
CCA642 | 382 | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 4 | 1 | 4 | 1 | 4 | 91 | 382 | 4 | 2 | 2 | 7 | 5 | 5 | 91 | |||||||||||||||||||||||||||||||||||||||||||
CCA642R | 57 | 1 | 3 | 57 | 1 | 3 | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
CCB613 | 165 | 1 | 1 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 52 | 165 | 1 | 1 | 4 | 6 | 1 | 1 | 52 | |||||||||||||||||||||||||||||||||||||||||||
DDA624 | 113 | 6 | 4 | 7 | 2 | 1 | 1 | 16 | 113 | 19 | 1 | 1 | 16 | ||||||||||||||||||||||||||||||||||||||||||||||||||||
DDA631 | 561 | 1 | 1 | 5 | 6 | 3 | 3 | 2 | 1 | 1 | 1 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 30 | 561 | 2 | 17 | 2 | 6 | 6 | 1 | 3 | 1 | 30 | |||||||||||||||||||||||||||||||||
DDA634 | 569 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 3 | 1 | 7 | 1 | 1 | 2 | 3 | 5 | 1 | 1 | 8 | 569 | 8 | 1 | 3 | 16 | 3 | 5 | 1 | 9 | |||||||||||||||||||||||||||||||||||
EEA624 | 228 | 1 | 1 | 1 | 3 | 1 | 2 | 2 | 1 | 15 | 2 | 54 | 228 | 3 | 3 | 6 | 15 | 56 | |||||||||||||||||||||||||||||||||||||||||||||||
EEA627 | 495 | 12 | 15 | 7 | 3 | 1 | 1 | 2 | 2 | 1 | 1 | 1 | 3 | 5 | 2 | 1 | 7 | 3 | 1 | 2 | 28 | 495 | 37 | 4 | 16 | 7 | 3 | 1 | 2 | 28 |
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
The only error I found in your code, is that your closing ")" for your
For( i = 1, i <= N Items( col_lst2 ), i++,
was placed after Bucket 2, instead of after Bucket 3.
I made a couple of additional minor modifications, but the code works as far as I can tell
Names Default To Here( 1 );
cdt = Open( "C:\Documents\Data\out.csv" );
Current Data Table( cdt );
sum_lst2 = {};
sum_lst3 = {};
sum_lst4 = {};
col_lst2 = cdt << get column names( Numeric );
For( i = 1, i <= N Items( col_lst2 ), i++,
colname = Column( col_lst2[i] ) << get name;
If(
Contains(
{"XZ008", "XZ010", "XZ015", "XZ053", "YZ1374", "YZ1375", "YZ9429", "YZ9792", "YZ9908", "YZ1930", "YZ1933", "YZ2744",
"YZ4621", "YZ4625", "YZ4630", "YZ4640", "YZ4681", "YZ4830"},
colname
) | (colname >= "YZ901" & colname <= "YZ916") | (colname >= "YZ954" & colname <= "YZ965") | (colname >= "YZ1360" &
colname <= "YZ1371") | (colname >= "YZ2581" & colname <= "YZ2586") | (colname >= "YZ2900" & colname <= "YZ2950") | (
colname >= "YZ3400" & colname <= "YZ3450"),
Insert Into( sum_lst2, col_lst2[i] )
); //Bucket1
If(
Contains( {"XZ098", "XZ090"}, colname ) | (colname >= "YZ9700" & colname <= "YZ9791") | (colname >= "YZ9793" & colname
<= "YZ9799"),
Insert Into( sum_lst4, col_lst2[i] )
);//Bucket2
If(
!Contains(
{"XZ008", "XZ010", "XZ015", "XZ053", "XZ098", "XZ090", "YZ1374", "YZ1375", "YZ9429", "YZ9792", "YZ9908", "YZ1930",
"YZ1933", "YZ2744", "YZ4621", "YZ4625", "YZ4630", "YZ4640", "YZ4681", "YZ4830"},
colname
) | (colname >= "YZ901" & colname <= "YZ916") | (colname >= "YZ954" & colname <= "YZ965") | (colname >= "YZ1360" &
colname <= "YZ1371") | (colname >= "YZ2581" & colname <= "YZ2586") | (colname >= "YZ2900" & colname <= "YZ2950") | (
colname >= "YZ3400" & colname <= "YZ3450") | (colname >= "YZ9700" & colname <= "YZ9791") | (colname >= "YZ9793" &
colname <= "YZ9799"),
Insert Into( sum_lst3, col_lst2[i] )
); //Bucket3
);
Eval( Eval Expr( cdt << New Column( "Bucket1", formula( Sum( Expr( sum_lst2 ) ) ) ) ) );
Eval( Eval Expr( cdt << New Column( "Bucket3", formula( Sum( Expr( sum_lst3 ) ) ) ) ) );
Eval( Eval Expr( cdt << New Column( "Bucket2", formula( Sum( Expr( sum_lst4 ) ) ) ) ) );
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
The only error I found in your code, is that your closing ")" for your
For( i = 1, i <= N Items( col_lst2 ), i++,
was placed after Bucket 2, instead of after Bucket 3.
I made a couple of additional minor modifications, but the code works as far as I can tell
Names Default To Here( 1 );
cdt = Open( "C:\Documents\Data\out.csv" );
Current Data Table( cdt );
sum_lst2 = {};
sum_lst3 = {};
sum_lst4 = {};
col_lst2 = cdt << get column names( Numeric );
For( i = 1, i <= N Items( col_lst2 ), i++,
colname = Column( col_lst2[i] ) << get name;
If(
Contains(
{"XZ008", "XZ010", "XZ015", "XZ053", "YZ1374", "YZ1375", "YZ9429", "YZ9792", "YZ9908", "YZ1930", "YZ1933", "YZ2744",
"YZ4621", "YZ4625", "YZ4630", "YZ4640", "YZ4681", "YZ4830"},
colname
) | (colname >= "YZ901" & colname <= "YZ916") | (colname >= "YZ954" & colname <= "YZ965") | (colname >= "YZ1360" &
colname <= "YZ1371") | (colname >= "YZ2581" & colname <= "YZ2586") | (colname >= "YZ2900" & colname <= "YZ2950") | (
colname >= "YZ3400" & colname <= "YZ3450"),
Insert Into( sum_lst2, col_lst2[i] )
); //Bucket1
If(
Contains( {"XZ098", "XZ090"}, colname ) | (colname >= "YZ9700" & colname <= "YZ9791") | (colname >= "YZ9793" & colname
<= "YZ9799"),
Insert Into( sum_lst4, col_lst2[i] )
);//Bucket2
If(
!Contains(
{"XZ008", "XZ010", "XZ015", "XZ053", "XZ098", "XZ090", "YZ1374", "YZ1375", "YZ9429", "YZ9792", "YZ9908", "YZ1930",
"YZ1933", "YZ2744", "YZ4621", "YZ4625", "YZ4630", "YZ4640", "YZ4681", "YZ4830"},
colname
) | (colname >= "YZ901" & colname <= "YZ916") | (colname >= "YZ954" & colname <= "YZ965") | (colname >= "YZ1360" &
colname <= "YZ1371") | (colname >= "YZ2581" & colname <= "YZ2586") | (colname >= "YZ2900" & colname <= "YZ2950") | (
colname >= "YZ3400" & colname <= "YZ3450") | (colname >= "YZ9700" & colname <= "YZ9791") | (colname >= "YZ9793" &
colname <= "YZ9799"),
Insert Into( sum_lst3, col_lst2[i] )
); //Bucket3
);
Eval( Eval Expr( cdt << New Column( "Bucket1", formula( Sum( Expr( sum_lst2 ) ) ) ) ) );
Eval( Eval Expr( cdt << New Column( "Bucket3", formula( Sum( Expr( sum_lst3 ) ) ) ) ) );
Eval( Eval Expr( cdt << New Column( "Bucket2", formula( Sum( Expr( sum_lst4 ) ) ) ) ) );
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
I don't understand the issue. The following columns are the columns that are being added up for Bucket 3.
{YZ0100, YZ0801, YZ0805, YZ0806, YZ0807, YZ0813, YZ0814, YZ0815, YZ0817, YZ0871,
YZ0874, YZ0881, YZ1367, YZ1541, YZ1571, YZ1573, YZ1575, YZ1577, YZ1583, YZ1585,
YZ1590, YZ1591, YZ1594, YZ2701, YZ2708, YZ2737, YZ2740, YZ2745, YZ2749, YZ2750,
YZ3404, YZ3407, YZ3415, YZ3416, YZ3417, YZ3418, YZ3422, YZ3423, YZ3492, YZ3493,
YZ3502, YZ3534, YZ4821, YZ9031, YZ9901, XZ001, XZ013, XZ019, XZ027, XZ034, XZ035,
XZ046, XZ048, XZ094, XZ099}
Which of those column should not be there?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
Ah, given that logic, which I missed, I think how it needs to be stated, is:
If(
Contains(
{"XZ008", "XZ010", "XZ015", "XZ053", "XZ098", "XZ090", "YZ1374", "YZ1375", "YZ9429", "YZ9792", "YZ9908", "YZ1930", "YZ1933", "YZ2744",
"YZ4621", "YZ4625", "YZ4630", "YZ4640", "YZ4681", "YZ4830"},
colname
) == 0,
If(
(colname >= "YZ901" & colname <= "YZ916") | (colname >= "YZ954" & colname <= "YZ965") | (colname >= "YZ1360" & colname <= "YZ1371") | (colname
>= "YZ2581" & colname <= "YZ2586") | (colname >= "YZ2900" & colname <= "YZ2950") | (colname >= "YZ3400" & colname <= "YZ3450") | (colname
>= "YZ9700" & colname <= "YZ9791") | (colname >= "YZ9793" & colname <= "YZ9799"),
Insert Into( sum_lst3, col_lst2[i] )
)
); //Bucket3
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
HI,
Watching this thread, I'm thinking that this script could be challenging to maintain / proofread later. I'll post an alternate approach in case anyone finds it useful... hopefully someone will.
1) Stack the original table, so it looks like the one below. I call this "MainTable"
2) Store the conditions in a data table just for this purpose. In your case, anything that is not in buckets 1 or 2 is in bucket 3, so those conditions need not be listed. I've only done this for a couple of cases, just for illustration. I call this "ConditionTable". Note that the dt1 column name :Label is hardcoded here... you can work around this of course, if needed or desired.
3) Now, the script becomes a lot shorter and easier to follow, assuming familiarity with expression handling, which it appears from your code that you possess. The sums can be easily obtained by using a summary table, as we now have a "bucket" column for grouping. Overall, the project should be easier to maintain, as the condition table is much easier to read than code. (Be sure to use the == operator in the condition table.)
I've attached sample tables in case you want to try this out.
Cheers,
Brady
Names Default To Here( 1 );
dt1 = Data Table( "MainTable" );
dtC = Data Table( "ConditionTable" );
//set all buckets to 3, and change as needed
dt1 << New Column( "Bucket", <<set each value( 3 ) );
//this will assign buckets 1 and 2 to appropriate rows in dt1 based on conditions
//in the condition table
condExpr = Expr(
dt1:Bucket[dt1 << get rows where( _COND_ )] = _BUCKET_
);
//loop through to apply the conditions
For( i = 1, i <= N Row( dtC ), i++,
Eval(
Substitute( Name Expr( condExpr ),
Expr( _COND_ ), Parse( dtC:Condition[i] ),
Expr( _BUCKET_ ), dtc:Bucket[i]
)
)
);
//produce a table of results
dtSum = dt1 << summary ( sum(:Data), group(:Item, :Bucket));
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
Thanks this is a very simple and robust approach.Only caveat is stacking the main table.The column names beginning with YZ* is dynamic.How do I stack the table without knowing names of all the columns of interest and their data type?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: summary analysis for dynamic data
Okay so I was able to get one step further and select the columns starting with "YZ" using the script below.
My original table contains - item,total tested qty,week,FB*,IB* columns.
I would need total tested qty,week and FB* for my analysis.How do I use the stack function to prep the table so that Brady's approach can be used.
dt = Current Data Table();
Col_List = dt << Get Column Names( "String" );
For( i = 1, i <= N Items( Col_List ), i++,
If( Contains( Col_List[i], "YZ" ),
Column( Col_List[i] ) << Set Selected( 1 )
)
);