- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Duplicate values in a columns
I would like to create a formula in a new column to count duplicate rows in another column how do I do this?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a colums
Hello @lisaash,
If I understand your problem correctly, you may find different options to create a count column with the number of duplicates in this topic : Solved: What is the easiest way to count the number of occurrences in a column when rows... - JMP Us...
Hope it helps you,
"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a colums
The
Col Number( xCol, <byVar, ...> )
function will work for this.
Using the Big Class sample data table, counting all of the individuals that have the same weight, one can create a new column and apply this formula
Col Number( :Weight, :Weight );
to get the results
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a colums
Hi, I am interested in a more generic solution/script.
Early in my work flow I would like to identify columns with only a single value since in essence they contain no information and will actually throw some errors if used in the wrong analysis. For example some of the data files I am working with have 600+ columns of which as many as 40-50 may contain only the same repeated values. Note they may be different from file to file so I need to keep them.
Now I need to identify them but I do not want to delete them, but I would like to get them out of the way.
For a given data table, I would like to identify all columns that have only a single value and group them into group with a label such as "no info".
How would you approach a generic script that I could add to the add-ins?
Andy
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a colums
Here is a little script that will get you started on what you want to do.
Names Default To Here( 1 );
dt = Current Data Table();
// Loop across the columns
For( i = N Cols( dt ), i >= 1, i--,
Summarize( dt, bygroup = by( As Column( i ) ) );
// Remove blank/missing category
If( bygroup[1] == "" | bygroup[1] == ".",
Remove From( bygroup, 1, 1 )
);
// If only 1 or less values are found, delete the column
If( N Items( bygroup ) <= 1,
dt << delete columns( i )
);
);
Please take the time to make sure you understand how the above code works. You will need to do that to make the changes to the code to get it to do exactly what you want the final product to work.
Scripting documentation is in the Scripting Guide and in the Scripting Index.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a colums
Thank you for your great help. But could you help me a little more?
how can I count duplicated "Weight" if it happens with 2 different ages or 3 as below?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a colums
Assuming you want to know the count across all ages for each unique weight, here is one way to do it
names default to here(1);
dt=
// Open Data Table: Big Class.jmp
// → Data Table( "Big Class" )
Open( "$SAMPLE_DATA/Big Class.jmp" );
// Create a table that has only unique values of weight for each age
dtOne = dt << Summary(
Group( :age, :weight ),
Freq( "None" ),
Weight( "None" ),
output table name( "Summary of Big Class grouped by age, weight" )
);
// Create a table that has the counts of the different weights across ages
dtTwo = dtOne <<
Summary(
Group( :weight ),
Freq( "None" ),
Weight( "None" ),
output table name(
"Summary of Summary of Big Class grouped by age, weight grouped by weight"
)
);
// Add the counts to the original data table
dt << Update(
With(
Data Table(dtTwo
)
),
Match Columns( :weight = :weight )
);
close( dtOne, nosave );
close( dtTwo, nosave );
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a colums
thank you so much but it's a little different with my target. My homework is like that below. But so far, I don't have ideal how to deal with that
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a colums
Use
Tables=>Subset
with the By option based on Process
Then use
Tables=>Join
To join matching on Model and you can select the Inner Join as you indicate
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Duplicate values in a columns
I'm afraid I don't understand exactly what you're looking for.
Could you give an example table showing what this new column would contain?