Using column values from selected rows as labels for added references

bb101 · Oct 14, 2024 03:54 PM

I am attempting to pull column values for selected rows into a concatenated label which is used in annotating an added reference line in a distribution.

The error message in concatenation is due to mismatch in the list dimensions. The Associative Array calls used to collect values from the selected rows is removing duplicate values, resulting in lists of unique values.

My goal is to get the name, age and weight for a given selected row into a string that can be used as the label in a later add ref call.

With the current approach I have error codes and in some cases a misalignment in data without an error code.

Example 1 - Error condition: Three unique weights are found where the lists for ages and names only found two items each from the selected rows. The third iteration of indexing fails due to the differences in list dimensions.

Example 2 - Misaligned data: There are three rows selected using the names Leslie and Patty. There are two entries with the name Patty, one age 14 and weight 85, the other age 16 with weight 134. One of the entries for Patty and the entry for Leslie are both 14 years old. The result of misaligned data can be see if the label only considers the lists for name {"LESLIE", "PATTY"} and age {14,16}. Indexing through would only make two entries, the first is "Leslie ...14" the second is "Patty ...16" which is a correct association but "Patty ...14" is missed and the list dimensions are data dependent.

Should I remove the Associative Array calls and instead move to a subset/table approach or should I revise the Associative Array calls to have list items which are lists with the name, age and weight captured in each iteration or for each selected row?

Thank you,

Bryan

Error code when dimensions of lists are exceeded:

argument should be character at row 41 in access or evaluation of 'Concat' , Bad Argument( theAges[i] ), theNames[i] || /*###*/" ..." || /*###*/theAges[i] /*###*/

Names Default To Here( 1 ); 

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );


col_name_list = dt << get column names( string );
newColName = "State";

// New column: State
If( !Contains( col_name_list, newColName ),
	Local( {dt, tempStrings},
		dt = Data Table( "Big Class" );
		dt << New Column( "State", Character, "Nominal" ) << Begin Data Update;
		tempStrings = {"CA", "WA", "OR"};
		For Each Row( dt, :State = tempStrings[Sequence( 1, 3, 1, 1 )] );
		dt << End Data Update;
	);
	//Make it a double Patty :), another person with the same name.
	dt << Add Rows( 1, At End );
	//Cheat and use a constant of 41 for the last row index
	dt[[41], {name, age, sex, height, weight, State}] = {{"PATTY", 16, "F", 57, 134, "WA"}};
);


// For illustration, select some names
dt << select where( Contains Item( :name, {"LESLIE", "PATTY"} ) );

dist = dt << Distribution(
	Stack( 1 ),
	Continuous Distribution(
		Column( :height ),
		Quantiles( 0 ),
		Summary Statistics( 0 ),
		Horizontal Layout( 1 ),
		Normal Quantile Plot( 0 ), // disable, if the user enabled it in the preferences :)
		Vertical( 0 ),
		Set Bin Width( 2 ),
		Process Capability( 0 )
	
	//);	
	),
	By( :State )
);


// get all reports
// if there is a BY group, get the info - if there is none, put it into a list
 If( Is List( dist ),
	dists=dist;
	myBys = Transform each ({myexpr} , dist << Get Where Expr, Arg(myExpr , 2 ))
	
	, //no By
	
	dists = Eval List( {dist} );
	myBys= {};
	
);

//get the reports
distRPTs = Transform Each( {dist}, dists, Report( dist ) );


// Find the Midpoints of the bins the selected rows are in
dist << Save( "Level Midpoints" );

//loop though the By groups and add the reflines
For Each( {report, idx}, distRPTs,
//report = distRPTs[2]; idx=2;
	selected rows = dt << get selected rows;

	// there is a BY group -> restrict the selection
	if(N Items(myBys),
	Eval( Eval Expr( selected rows = dt << get rows where( Contains( Expr( selected rows ), Row() ) & :State == Expr( myBys[idx] ) ) ) );
	);


	theBins = (Associative Array( Column( dt, N Cols( dt ) )[selected rows] )) << get keys;
	theNames = (Associative Array( Column(dt, "name") [dt << get selected rows] )) << get keys;
	theAges = (Associative Array( Column(dt, "age") [dt << get selected rows] )) << get keys; 
	theWeights = (Associative Array( Column(dt, "weight") [dt << get selected rows] )) << get keys; 
	
	For( i = 1, i <= N Items(theWeights), i++,
		theLabels[i] =  theNames[i]  || " ..." || theAges[i] ;
	);
	
);
dt << delete columns( N Cols( dt ) );

hogi · Oct 14, 2024 04:37 PM

Hi Bryan, you could use a transform column to prepare the correct results - and then pick the entries you need:

Names Default To Here( 1 ); 

dt = Open( "$SAMPLE_DATA/Big Class.jmp" );


col_name_list = dt << get column names( string );
newColName = "State";

// New column: State
If( !Contains( col_name_list, newColName ),
	Local( {dt, tempStrings},
		dt = Data Table( "Big Class" );
		dt << New Column( "State", Character, "Nominal" ) << Begin Data Update;
		tempStrings = {"CA", "WA", "OR"};
		For Each Row( dt, :State = tempStrings[Sequence( 1, 3, 1, 1 )] );
		dt << End Data Update;
	);
	//Make it a double Patty :), another person with the same name.
	dt << Add Rows( 1, At End );
	//Cheat and use a constant of 41 for the last row index
	dt[[41], {name, age, sex, height, weight, State}] = {{"PATTY", 16, "F", 57, 134, "WA"}};
);


// For illustration, select some names
dt << select where( Contains Item( :name, {"LESLIE", "PATTY"} ) );

dist = dt << Distribution(
	Stack( 1 ),
	Continuous Distribution(
		Column( :height ),
		Quantiles( 0 ),
		Summary Statistics( 0 ),
		Horizontal Layout( 1 ),
		Normal Quantile Plot( 0 ), // disable, if the user enabled it in the preferences :)
		Vertical( 0 ),
		Set Bin Width( 2 ),
		Process Capability( 0 )
	
	//);	
	),
	By( :State )
);


// get all reports
// if there is a BY group, get the info - if there is none, put it into a list
 If( Is List( dist ),
	dists=dist;
	myBys = Transform each ({myexpr} , dist << Get Where Expr, Arg(myExpr , 2 ))
	
	, //no By
	
	dists = Eval List( {dist} );
	myBys= {};
	
);

//get the reports
distRPTs = Transform Each( {dist}, dists, Report( dist ) );


// Find the Midpoints of the bins the selected rows are in
dist << Save( "Level Midpoints" );


dt << transform column ("label",Character, Formula (:name  || " ..." || Char(age)));


//loop though the By groups and add the reflines
For Each( {report, idx}, distRPTs,
//report = distRPTs[2]; idx=2;
	selected rows = dt << get selected rows;

	// there is a BY group -> restrict the selection
	if(N Items(myBys),
	Eval( Eval Expr( selected rows = dt << get rows where( Contains( Expr( selected rows ), Row() ) & :State == Expr( myBys[idx] ) ) ) );
	);
	
	myLabels = Associative Array(:label[selected rows]) << get keys;
	print(myLabels)
	
);
dt << delete columns( N Cols( dt ) );

Names Default To Here( 1 ); dt = Open( "$SAMPLE_DATA/Big Class.jmp" ); col_name_list = dt << get column names( string ); newColName = "State"; // New column: State If( !Contains( col_name_list, newColName ), Local( {dt, tempStrings}, dt = Data Table( "Big Class" ); dt << New Column( "State", Character, "Nominal" ) << Begin Data Update; tempStrings = {"CA", "WA", "OR"}; For Each Row( dt, :State = tempStrings[Sequence( 1, 3, 1, 1 )] ); dt << End Data Update; ); //Make it a double Patty :), another person with the same name. dt << Add Rows( 1, At End ); //Cheat and use a constant of 41 for the last row index dt[[41], {name, age, sex, height, weight, State}] = {{"PATTY", 16, "F", 57, 134, "WA"}}; ); // For illustration, select some names dt << select where( Contains Item( :name, {"LESLIE", "PATTY"} ) ); dist = dt << Distribution( Stack( 1 ), Continuous Distribution( Column( :height ), Quantiles( 0 ), Summary Statistics( 0 ), Horizontal Layout( 1 ), Normal Quantile Plot( 0 ), // disable, if the user enabled it in the preferences :) Vertical( 0 ), Set Bin Width( 2 ), Process Capability( 0 ) //); ), By( :State ) ); // get all reports // if there is a BY group, get the info - if there is none, put it into a list If( Is List( dist ), dists=dist; myBys = Transform each ({myexpr} , dist << Get Where Expr, Arg(myExpr , 2 )) , //no By dists = Eval List( {dist} ); myBys= {}; ); //get the reports distRPTs = Transform Each( {dist}, dists, Report( dist ) ); // Find the Midpoints of the bins the selected rows are in dist << Save( "Level Midpoints" ); dt << transform column ("label",Character, Formula (:name || " ..." || Char(age))); //loop though the By groups and add the reflines For Each( {report, idx}, distRPTs, //report = distRPTs[2]; idx=2; selected rows = dt << get selected rows; // there is a BY group -> restrict the selection if(N Items(myBys), Eval( Eval Expr( selected rows = dt << get rows where( Contains( Expr( selected rows ), Row() ) & :State == Expr( myBys[idx] ) ) ) ); ); myLabels = Associative Array(:label[selected rows]) << get keys; print(myLabels) ); dt << delete columns( N Cols( dt ) );

bb101 · Oct 28, 2024 09:28 PM

Hi @hogi

I have combined your ideas in this post and the related post on adding a ref line.

The solution that is working to align the right information without loss of repeated values while correctly adding bins to the plots in the stack was to prepare lists from the fields in the table by indexing through selected rows within a nested for loop.

Plot distributions
Collect By criteria into an array for each of the plots in an instance in the report (3 plots for each data item, 3 arrays)
Collect the reports into an object used in iteration
Append the midpoints for the plotted data (3 plots, 3 columns appended to the data table)
Loop through object with a For Each to access and augment each plot
1. Initialize list variables for bins and labels
2. Select the rows in data based on baseline criteria (ex Serial Number)
3. Refine selected rows based on criteria matching the By criteria in the plots
4. Use a for loop for as many selected rows
  1. Collect the bins from the appended midpoint columns (Each plot has a respective column)
  2. Collect the information into the label from interesting columns (eg Date of Measurment, Serial Number)
  3. Add ref line with collected information.
Delete the midpoint columns to avoid stale data in case the plot settings in step 1 changes.

Key changes which helped:

1. Change from Associative Array to lists populated through iteration to avoid elimination of duplicate values, maintaining alignment.

2. Create separate variables for each of the by conditions.

Hope this helps close the loop. Your advice is appreciated. They enabled my learning and progress.

Using column values from selected rows as labels for added references

Re: Using column values from selected rows as labels for added references

Re: Using column values from selected rows as labels for added references