Re: How to determine duplicate values and delete the data with lowest N rows val...

UserID16644 · Aug 7, 2024 12:08 AM

Hi all,

I have this piece of code but I'm quite not sure if there is a more appropriate approach to this. My loop gets an error because some of the data table have multiple NBO values.

Data out of range at row 10 in access or evaluation of 'For Each' , For Each/*###*/({table}, dt_Test, table << New Column( "Ref" ) ; Wait( 0 ) ; table
:Ref[table << get rows where( :Lot == "NBO" & :N Rows == Col Max( :N Rows, :Lot ) )]
= 1)

So what I am trying to do is create an If statement inside the For Each loop. The IF statement will determine if NBO is has no duplicates on the Lot Column, if it has no duplicates, then it will put 1 on Ref column.

For Each( {table}, dt_Test,
	table << New Column("Ref");
	wait(0);
		If (
		//NBO has duplicate, deletes the NBO with lowest N Row value
		dt_Test << Select duplicate rows(Match(:Lot)); //only selects the matching values, how can I delete the NBO with lowest N Rows count?
		dt_Test << Delete Rows;
		,
		//if NBO has no duplicates,
		table:Ref[table << get rows where( :Lot == "NBO" & :N Rows == col max(:N Rows, :Lot))]=1;	
		);
	
);

jthi · Aug 7, 2024 12:34 AM

Would make this much easier if we could see some of the data.

-Jarmo

UserID16644 · Aug 7, 2024 01:05 AM

Here is a sample data table after subset. dt has 2 NBOs and this works fine.

table:Ref[table << get rows where( :Lot == "NBO" & :N Rows == col max(:N Rows, :Lot))]=1;

However, in dt2, which I have 1 NBO, the script suddenly stops and has this error:

Data out of range at row 10 in access or evaluation of 'For Each' , For Each/*###*/({table}, dt_Test, table << New Column( "Ref" ) ; Wait( 0 ) ; table
:Ref[table << get rows where( :Lot == "NBO" & :N Rows == Col Max( :N Rows, :Lot ) )]
= 1)

jthi · Aug 7, 2024 01:15 AM

I'm a bit confused with your script and what you are trying to do. Are you trying to find duplicated lots and delete all but last row? And then set NBO as 1 (there should be only one left)?

-Jarmo

UserID16644 · Aug 7, 2024 01:23 AM

Sorry, I also got confused writing the question. So for the data, first I'll subset it then go through each data table to add column Ref and its value as 1 where NBO has the highest count in N Rows. I initially got this working:

For Each( {table}, dt_Test,
	table << New Column("Ref");
	wait(0);
		table:Ref[table << get rows where( :Lot == "NBO" & :N Rows == col max(:N Rows, :Lot))]=1;	
);

But when it looped to a data table with only 1 NBO present in the column, I am having this error:

Data out of range at row 10 in access or evaluation of 'For Each' , For Each/*###*/({table}, dt_Test, table << New Column( "Ref" ) ; Wait( 0 ) ; table
:Ref[table << get rows where( :Lot == "NBO" & :N Rows == Col Max( :N Rows, :Lot ) )]
= 1)

jthi · Aug 7, 2024 01:33 AM

Maybe the issue is somewhere else? What does dt_test contain for example? And dt_test and table are used in a bit weird manner (also the if might not be doing what you think here)

I would modify your code a bit to make it easier to debug (don't place get rows where inside the [])

Names Default To Here(1);

dt = Open("$DOWNLOADS/dt.jmp");
dt2 = Open("$DOWNLOADS/dt2.jmp");

dts = Eval List({dt, dt2});

For Each({cur_dt}, dts,
	cur_dt << New Column("Ref");
	//cur_dt = dts[2];
	sel_rows = cur_dt << get rows where(:Lot == "NBO" & :N Rows == Col Max(:N Rows, :Lot));
	cur_dt:Ref[sel_rows] = 1;
);

This way you can print the sel_rows and see if it is empty / what it has

-Jarmo

Byron_JMP · Aug 7, 2024 7:07 AM

I might suggest a completely different approach.

1. sort the table by date, or what ever tells you what the most current duplicate is.

2. join the table to its self (Tables>Join), and click the drop multiples option

-Only the first, by row order, of the duplicates ends up in the product table.

do this manually once with the workflow builder turned on, and then Bang - Bob's your uncle - you've got the script to do it again. scripting... without scripting.

JMP Systems Engineer, Health and Life Sciences (Pharma)

hogi · Aug 7, 2024 05:25 PM

nice trick

How to determine duplicate values?

Re: How to determine duplicate values and delete the data with lowest N rows value?

Re: How to determine duplicate values and delete the data with lowest N rows value?

Re: How to determine duplicate values and delete the data with lowest N rows value?

Re: How to determine duplicate values and delete the data with lowest N rows value?

Re: How to determine duplicate values and delete the data with lowest N rows value?

Re: How to determine duplicate values?

Re: How to determine duplicate values?