Re: Duplicate values in a colums

lisaash · Jun 9, 2023 8:59 AM

I would like to create a formula in a new column to count duplicate rows in another column how do I do this?

Victor_G · Oct 10, 2022 05:33 AM

If I understand your problem correctly, you may find different options to create a count column with the number of duplicates in this topic : Solved: What is the easiest way to count the number of occurrences in a column when rows... - JMP Us...

Hope it helps you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

txnelson · Oct 10, 2022 5:53 AM

The
Col Number( xCol, <byVar, ...> )
function will work for this.

Using the Big Class sample data table, counting all of the individuals that have the same weight, one can create a new column and apply this formula

Col Number( :Weight, :Weight );

to get the results

Jim

Hegedus1 · Oct 10, 2022 10:01 AM

Hi, I am interested in a more generic solution/script.

Early in my work flow I would like to identify columns with only a single value since in essence they contain no information and will actually throw some errors if used in the wrong analysis. For example some of the data files I am working with have 600+ columns of which as many as 40-50 may contain only the same repeated values. Note they may be different from file to file so I need to keep them.

Now I need to identify them but I do not want to delete them, but I would like to get them out of the way.

For a given data table, I would like to identify all columns that have only a single value and group them into group with a label such as "no info".

How would you approach a generic script that I could add to the add-ins?

Andy

txnelson · Oct 10, 2022 7:27 AM

Here is a little script that will get you started on what you want to do.

Names Default To Here( 1 );
dt = Current Data Table();

// Loop across the columns
For( i = N Cols( dt ), i >= 1, i--,
	Summarize( dt, bygroup = by( As Column( i ) ) );
	// Remove blank/missing category
	If( bygroup[1] == "" | bygroup[1] == ".",
		Remove From( bygroup, 1, 1 )
	);
	// If only 1 or less values are found, delete the column
	If( N Items( bygroup ) <= 1,
		dt << delete columns( i )
	);
);

Please take the time to make sure you understand how the above code works. You will need to do that to make the changes to the code to get it to do exactly what you want the final product to work.

Scripting documentation is in the Scripting Guide and in the Scripting Index.

Jim

Phuong4692 · Jul 31, 2024 05:13 AM

Thank you for your great help. But could you help me a little more?

how can I count duplicated "Weight" if it happens with 2 different ages or 3 as below?

TRANSLATE with x

English

Arabic	Hebrew	Polish
Bulgarian	Hindi	Portuguese
Catalan	Hmong Daw	Romanian
Chinese Simplified	Hungarian	Russian
Chinese Traditional	Indonesian	Slovak
Czech	Italian	Slovenian
Danish	Japanese	Spanish
Dutch	Klingon	Swedish
English	Korean	Thai
Estonian	Latvian	Turkish
Finnish	Lithuanian	Ukrainian
French	Malay	Urdu
German	Maltese	Vietnamese
Greek	Norwegian	Welsh
Haitian Creole	Persian

TRANSLATE with

COPY THE URL BELOW

Back

EMBED THE SNIPPET BELOW IN YOUR SITE

Enable collaborative features and customize widget: Bing Webmaster Portal

Back

txnelson · Jul 31, 2024 07:21 AM

Assuming you want to know the count across all ages for each unique weight, here is one way to do it

names default to here(1);
dt=
// Open Data Table: Big Class.jmp
// → Data Table( "Big Class" )
Open( "$SAMPLE_DATA/Big Class.jmp" );

// Create a table that has only unique values of weight for each age
dtOne = dt << Summary(
	Group( :age, :weight ),
	Freq( "None" ),
	Weight( "None" ),
	output table name( "Summary of Big Class grouped by age, weight" )
);

// Create a table that has the counts of the different weights across ages
dtTwo = dtOne <<
Summary(
	Group( :weight ),
	Freq( "None" ),
	Weight( "None" ),
	output table name(
		"Summary of Summary of Big Class grouped by age, weight grouped by weight"
	)
);

// Add the counts to the original data table
dt << Update(
	With(
		Data Table(dtTwo
		)
	),
	Match Columns( :weight = :weight )
);

close( dtOne, nosave );
close( dtTwo, nosave );

Jim

Phuong4692 · Jul 31, 2024 11:34 AM

thank you so much but it's a little different with my target. My homework is like that below. But so far, I don't have ideal how to deal with that

txnelson · Jul 31, 2024 12:24 PM

Use

Tables=>Subset

with the By option based on Process

Then use

Tables=>Join

To join matching on Model and you can select the Inner Join as you indicate

Jim

Jeff_Perkinson · Oct 20, 2022 12:44 PM

I'm afraid I don't understand exactly what you're looking for.

Could you give an example table showing what this new column would contain?

-Jeff

Duplicate values in a columns