I am attempting to pull column values for selected rows into a concatenated label which is used in annotating an added reference line in a distribution.
The error message in concatenation is due to mismatch in the list dimensions. The Associative Array calls used to collect values from the selected rows is removing duplicate values, resulting in lists of unique values.
My goal is to get the name, age and weight for a given selected row into a string that can be used as the label in a later add ref call.
With the current approach I have error codes and in some cases a misalignment in data without an error code.
Example 1 - Error condition: Three unique weights are found where the lists for ages and names only found two items each from the selected rows. The third iteration of indexing fails due to the differences in list dimensions.
Example 2 - Misaligned data: There are three rows selected using the names Leslie and Patty. There are two entries with the name Patty, one age 14 and weight 85, the other age 16 with weight 134. One of the entries for Patty and the entry for Leslie are both 14 years old. The result of misaligned data can be see if the label only considers the lists for name {"LESLIE", "PATTY"} and age {14,16}. Indexing through would only make two entries, the first is "Leslie ...14" the second is "Patty ...16" which is a correct association but "Patty ...14" is missed and the list dimensions are data dependent.
Should I remove the Associative Array calls and instead move to a subset/table approach or should I revise the Associative Array calls to have list items which are lists with the name, age and weight captured in each iteration or for each selected row?
Thank you,
Bryan
Error code when dimensions of lists are exceeded:
argument should be character at row 41 in access or evaluation of 'Concat' , Bad Argument( theAges[i] ), theNames[i] || /*###*/" ..." || /*###*/theAges[i] /*###*/
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
col_name_list = dt << get column names( string );
newColName = "State";
// New column: State
If( !Contains( col_name_list, newColName ),
Local( {dt, tempStrings},
dt = Data Table( "Big Class" );
dt << New Column( "State", Character, "Nominal" ) << Begin Data Update;
tempStrings = {"CA", "WA", "OR"};
For Each Row( dt, :State = tempStrings[Sequence( 1, 3, 1, 1 )] );
dt << End Data Update;
);
//Make it a double Patty :), another person with the same name.
dt << Add Rows( 1, At End );
//Cheat and use a constant of 41 for the last row index
dt[[41], {name, age, sex, height, weight, State}] = {{"PATTY", 16, "F", 57, 134, "WA"}};
);
// For illustration, select some names
dt << select where( Contains Item( :name, {"LESLIE", "PATTY"} ) );
dist = dt << Distribution(
Stack( 1 ),
Continuous Distribution(
Column( :height ),
Quantiles( 0 ),
Summary Statistics( 0 ),
Horizontal Layout( 1 ),
Normal Quantile Plot( 0 ), // disable, if the user enabled it in the preferences :)
Vertical( 0 ),
Set Bin Width( 2 ),
Process Capability( 0 )
//);
),
By( :State )
);
// get all reports
// if there is a BY group, get the info - if there is none, put it into a list
If( Is List( dist ),
dists=dist;
myBys = Transform each ({myexpr} , dist << Get Where Expr, Arg(myExpr , 2 ))
, //no By
dists = Eval List( {dist} );
myBys= {};
);
//get the reports
distRPTs = Transform Each( {dist}, dists, Report( dist ) );
// Find the Midpoints of the bins the selected rows are in
dist << Save( "Level Midpoints" );
//loop though the By groups and add the reflines
For Each( {report, idx}, distRPTs,
//report = distRPTs[2]; idx=2;
selected rows = dt << get selected rows;
// there is a BY group -> restrict the selection
if(N Items(myBys),
Eval( Eval Expr( selected rows = dt << get rows where( Contains( Expr( selected rows ), Row() ) & :State == Expr( myBys[idx] ) ) ) );
);
theBins = (Associative Array( Column( dt, N Cols( dt ) )[selected rows] )) << get keys;
theNames = (Associative Array( Column(dt, "name") [dt << get selected rows] )) << get keys;
theAges = (Associative Array( Column(dt, "age") [dt << get selected rows] )) << get keys;
theWeights = (Associative Array( Column(dt, "weight") [dt << get selected rows] )) << get keys;
For( i = 1, i <= N Items(theWeights), i++,
theLabels[i] = theNames[i] || " ..." || theAges[i] ;
);
);
dt << delete columns( N Cols( dt ) );