Solved: Cell comparison with conditional output

jmper99 · Jun 10, 2023 4:57 PM

Hi,

I'm trying to create a count of occurrences based on conditions, using the table below as an example. The conditions for a count of 1 are that columns "Parameter" AND "Location" both repeat together in the next row OR anywhere else in the 2 columns. The output table would have the same column data present in the original table with 3 added columns of "Count" , "Tool 1", "Tool 2" (number of tools can increase to 5).

Output for this table would be as follows. Since row 3 and 4 repeat per the 2 column condition (parameter and location) that would be a count of 1, then it repeats again with the same conditions that would be the 2nd count for a total of 2. The row to be displayed would depend on the first one of 3 rows by order of date. Finally the Tool column would populate by order of date as well (ie. Tool 1 (or E) came in first at 1:10PM , then Tool 2 (or F) came in at 2:10PM) etc..

brady_brady · Feb 6, 2023 02:30 PM

Hello,

I *think* I see what you are looking for, but as Jim mentions, more context is better. The following produces the output you describe, for the data you have provided, but since there is only 1 line of output to go on... it is just a guess. I assume, based on the input and output you provided, that in addition to parameter and location, process and part ALSO are used as grouping variables (?), and that Count will always be 1 less than n Rows in a summary of the data. These are easy enough to change in the code below if not the case, but again... just a guess.

At any rate, try this out to see if it is what you need and if it is not, hopefully you're a quick code mod away from what you want.

Cheers,

Brady

Names Default To Here( 1 );

dt = New Table( "Sample",
	Add Rows( 6 ),
	Set Header Height( 46 ),
	New Column( "Date/Time",
		Numeric,
		"Continuous",
		Format( "y/m/d h:m", 19 ),
		Input Format( "y/m/d h:m" ),
		Set Values( [3273127800, 3273124200, 3273142200, 3273138600, 3273152400, 3273149400] )
	),
	New Column( "Parameter", Character, "Nominal", Set Values( {"A.1", "A.2", "A.3", "A.3", "A.2", "A.3"} ) ),
	New Column( "Location", Character, "Nominal", Set Values( {"B.1", "B.2", "B.3", "B.3", "B.1", "B.3"} ) ),
	New Column( "Process", Character, "Nominal", Set Values( {"C.1", "C.2", "C.3", "C.3", "C.1", "C.3"} ) ),
	New Column( "Tool", Character, "Nominal", Set Values( {"E.", "E", "F", "E", "E", "G"} ), Set Display Width( 43 ) ),
	New Column( "Part", Character, "Nominal", Set Values( {"F-1.1", "F-1.2", "F-1.1", "F-1.1", "F-1.1", "F-1.1"} ) )
);

/////////////////BEGIN PROCESSING

//create a summary table
dtSum = dt << summary( group(:parameter, :location, :"Date/Time"n, :process, :tool, :part), Link to original data table( 0 ) );

//create a column for easy group navigation
dtSum << new column ("plpp", character, formula(:parameter || :location || :process || :part));

//process table, from bottom up
for each ( { r }, nrow(dtSum) :: 2,
	if ( dtSum:plpp[r] == dtSum:plpp[r-1],
		dtSum:tool[r-1] ||= evalinsert(", ^dtSum:tool[ r ]");
		dtSum:n Rows[r - 1] += dtSum:n Rows[ r ];
		dtSum << delete rows ( r );
	);
);

// delete rows with no duplicates
dtSum << deleteRows (dtSum << get rows where (:n Rows == 1) ) 
			<< Move Selected Columns( {:"Date/Time"n}, To First )
			<< Move Selected Columns( {:Tool}, To Last );
			
// rename nRows to count, after subtracting 1
dtSum:Nrows[1::nrow(dtSum)]--;
dtSum:NRows << set name ("Count");

// split out tool column
dtSum << Text to Columns( columns( :Tool ), Delimiters( "," ) );
dtSum << delete columns ({"Tool", "plpp"});

View solution in original post

txnelson · Feb 9, 2023 08:27 PM

Both Brady's and my version of the code work on the simple table you provided. I concur with hime that if his code does not work please provide a more populated table that we can test against.

Jim

View solution in original post

txnelson · Nov 17, 2022 07:07 AM

Here is a script that should give you a good start in your data conversion

Names Default To Here( 1 );

// Create a sample table
dt = New Table( "Sample",
	Add Rows( 6 ),
	Set Header Height( 46 ),
	New Column( "Date/Time",
		Numeric,
		"Continuous",
		Format( "y/m/d h:m", 19 ),
		Input Format( "y/m/d h:m" ),
		Set Values( [3273127800, 3273124200, 3273142200, 3273138600, 3273152400, 3273149400] )
	),
	New Column( "Parameter", Character, "Nominal", Set Values( {"A.1", "A.2", "A.3", "A.3", "A.2", "A.3"} ) ),
	New Column( "Location", Character, "Nominal", Set Values( {"B.1", "B.2", "B.3", "B.3", "B.1", "B.3"} ) ),
	New Column( "Process", Character, "Nominal", Set Values( {"C.1", "C.2", "C.3", "C.3", "C.1", "C.3"} ) ),
	New Column( "Tool", Character, "Nominal", Set Values( {"E.", "E", "F", "E", "E", "G"} ), Set Display Width( 43 ) ),
	New Column( "Part", Character, "Nominal", Set Values( {"F-1.1", "F-1.2", "F-1.1", "F-1.1", "F-1.1", "F-1.1"} ) )
);

// The calculation script starts here
// Calculate the Counts
dtSum = dt << Summary(
	Group( :Parameter, :Location ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 ),
	output table name( "Summary" )
);

// Delete rows that have no matches
dtSum << select where( dtSum:N Rows == 1 );
If( N Items( dtSum << get selected rows ) > 0,
	dtSum << delete Rows
);

// The count is equal to how many rows, not how many matches so subtract 1 to count the matches
For Each Row(
	dtSum:N Rows = dtSum:N Rows - 1;
);

dtSum:N Rows << set name("Count");

// Create a new table from original that contains only the data with matches
dtJoin = dt << Join(
	With( dtSum ),
	Merge Same Name Columns,
	Match Flag( 0 ),
	By Matching Columns( :Parameter = :Parameter, :Location = :Location ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 ),
	Output Table( "Joined" )
);

// Sort the data to set tool usage order
dtJoin << Sort( By( :Parameter, :Location, :"Date/Time"n), Replace Table(1));

// Create a new column for tool order
dtJoin << New Column("Tool Col", character, set each value(
	"Tool " || Char( Col Cumulative Sum(1, :Parameter, :Location))
));

// Split the data to the final output for
dtFinal = dtJoin << Split(
	Split By( :Tool Col ),
	Split( :Tool ),
	Group( :Parameter, :Location ),
	Output Table( "Final" ),
	Sort by Column Property
);

// Clean up
close( dtSum, nosave );
close( dtJoin, nosave );

The documentation on the various Platforms and Functions are documented in the Scripting Index and in the Scripting Guide. Both are available under the Help pulldown menu.

Jim

jmper99 · Feb 3, 2023 04:49 PM

Hi,

First, my apologies for the long delay, I was pulled in multiple directions. Next, thank you for the script it is almost perfect and a "Final" table was output. What I found was the count column does what it is supposed to do, however, the "Tool" does not anchor the data. Meaning for each tool I should have a total "count" of hits/counts, so one column for "count" and the respective column for tool next to it should have the number of hits That is also grouped by the other columns like parameter, location etc... What I get is pictured below, where the tools get split out into different columns in addition to being spread out in rows associated with a count. And the newly created "tool" column repeats the same ID

txnelson · Feb 4, 2023 05:21 PM

Is the above output example, and example of what the output should look like, or is it an example of how the output is currently incorrect?

Would it be possible for you to attach the JMP data table you are using, so there is an example of some real data that can be used for testing? Also, if the above is not an example of what the data should look like, could you provide a simple example of that? I am confused about what you are really looking for.

Jim

jmper99 · Feb 9, 2023 07:05 PM

Hi,

Yes the latest screen capture I sent is of the output from your script as what you labeled the "Final" table. Forgive me for not sharing the specific table in question with the data as I have to update the information to reflect non-ip parameters. I'll try to do a better job of explaining. As I mentioned in the initial question the table below is a sample case

and the output should look like the table below

I want to create a summary table of the number of times (repeats), or count, a "tool" hits a "location" within the same day of "date/time" (not necessarily specific time). So if a row repeats per the location, parameter, and process it should count that as 1 repeat. If that second row is on the same or different tool it doesn't matter so long as it meets the 3 criterion mentioned for 1 repeat count. Then, where the output table example says "Tool1, "Tool2" etc... that would be a record of which tools were involved in the repeat event for those repeats. So if it went to tool "E", then the (location, parameter, process) indicates another event happened on the same day and it was on tool "F" then a column would be added to the right reflect that the first event happened on E and second on F and so on. Eventually I want to create set of plots in jmp from that summary where I would be able to display the count of those repeat events by tool, parameter, location etc... In addition to having which tools were involved in those repeat events. I hope that explains a little more on my thoughts. And thank you for your time on this I appreciate it.

brady_brady · Feb 6, 2023 02:30 PM

Hello,

I *think* I see what you are looking for, but as Jim mentions, more context is better. The following produces the output you describe, for the data you have provided, but since there is only 1 line of output to go on... it is just a guess. I assume, based on the input and output you provided, that in addition to parameter and location, process and part ALSO are used as grouping variables (?), and that Count will always be 1 less than n Rows in a summary of the data. These are easy enough to change in the code below if not the case, but again... just a guess.

At any rate, try this out to see if it is what you need and if it is not, hopefully you're a quick code mod away from what you want.

Cheers,

Brady

Names Default To Here( 1 );

dt = New Table( "Sample",
	Add Rows( 6 ),
	Set Header Height( 46 ),
	New Column( "Date/Time",
		Numeric,
		"Continuous",
		Format( "y/m/d h:m", 19 ),
		Input Format( "y/m/d h:m" ),
		Set Values( [3273127800, 3273124200, 3273142200, 3273138600, 3273152400, 3273149400] )
	),
	New Column( "Parameter", Character, "Nominal", Set Values( {"A.1", "A.2", "A.3", "A.3", "A.2", "A.3"} ) ),
	New Column( "Location", Character, "Nominal", Set Values( {"B.1", "B.2", "B.3", "B.3", "B.1", "B.3"} ) ),
	New Column( "Process", Character, "Nominal", Set Values( {"C.1", "C.2", "C.3", "C.3", "C.1", "C.3"} ) ),
	New Column( "Tool", Character, "Nominal", Set Values( {"E.", "E", "F", "E", "E", "G"} ), Set Display Width( 43 ) ),
	New Column( "Part", Character, "Nominal", Set Values( {"F-1.1", "F-1.2", "F-1.1", "F-1.1", "F-1.1", "F-1.1"} ) )
);

/////////////////BEGIN PROCESSING

//create a summary table
dtSum = dt << summary( group(:parameter, :location, :"Date/Time"n, :process, :tool, :part), Link to original data table( 0 ) );

//create a column for easy group navigation
dtSum << new column ("plpp", character, formula(:parameter || :location || :process || :part));

//process table, from bottom up
for each ( { r }, nrow(dtSum) :: 2,
	if ( dtSum:plpp[r] == dtSum:plpp[r-1],
		dtSum:tool[r-1] ||= evalinsert(", ^dtSum:tool[ r ]");
		dtSum:n Rows[r - 1] += dtSum:n Rows[ r ];
		dtSum << delete rows ( r );
	);
);

// delete rows with no duplicates
dtSum << deleteRows (dtSum << get rows where (:n Rows == 1) ) 
			<< Move Selected Columns( {:"Date/Time"n}, To First )
			<< Move Selected Columns( {:Tool}, To Last );
			
// rename nRows to count, after subtracting 1
dtSum:Nrows[1::nrow(dtSum)]--;
dtSum:NRows << set name ("Count");

// split out tool column
dtSum << Text to Columns( columns( :Tool ), Delimiters( "," ) );
dtSum << delete columns ({"Tool", "plpp"});

txnelson · Feb 9, 2023 08:27 PM

Both Brady's and my version of the code work on the simple table you provided. I concur with hime that if his code does not work please provide a more populated table that we can test against.

Jim

jmper99 · Feb 17, 2023 02:20 PM

Thank you this can also work with a little grouping manipulation.

jmper99 · Feb 17, 2023 02:19 PM

Thank you this worked !

Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output

Re: Cell comparison with conditional output