Choose Language Hide Translation Bar
Highlighted

How to hide/exclude duplicate IDs except for the first ID?

Hello awesome people,

I am new to JMP and needs some help with cleaning the data set.

I have several duplicates in my data set, where one individual is admitted multiple times in a given period.

I want only the first episode of the individual, according to the admission date (m/d/yyyy)


For instance,

Respondent with an ID number ABC11 appeared twice in my data set. The respondent was admitted to the treatment  in 7.14.2013 and then readmitted in 8.25.2014. I want to keep only the information of ABC11 in 7.14.2013)


Thanks to previous discussions, I figured how to remove the duplicates joining the table, matching the columns, and dropping duplicates. ​This way, I create a new data file with Unique client sample (n=4,000)  without the duplicates from the original file with total admitted sample (n=6000)

However, now I want to hide/exclude those duplicates and leave them in the same data file,

so that I can flexibly compare and contrast Total Admitted sample (n=6000, with duplicates) and Unique client sample (n=4000, without duplicates) in later analysis.

Unless, I'll have to run analysis separately and combine the result in data reports, sounds like a hassle.

Blessings,

1 ACCEPTED SOLUTION

Accepted Solutions
Highlighted
ms
Super User ms
Super User

Re: How to hide/exclude duplicate IDs except for the first ID?

Sorry, I was too quick to answer. The Date column should not be a grouping variable in this case.

Sort by date and try this:

dt << select where(Col Min(Row(), :ID) < Row()) << hide and exclude;

View solution in original post

6 REPLIES 6
Highlighted
ms
Super User ms
Super User

Re: How to hide/exclude duplicate IDs except for the first ID?

With JSL:

// Exclude all but first row of multiples (sort by date to get the earliest...)

dt = Current Data Table();

dt << select where(Col Min(Row(), :Respondent, :Date) < Row()) << hide and exclude;

Same approach with column formula:

dt<<new column("exclude", Row State, Formula(ExcludedState(Col Min(Row(), :Respondent, :Date) < Row())))

Highlighted

Re: How to hide/exclude duplicate IDs except for the first ID?

Hello thanks for your answer,

I copied and pasted the JSL into my script but for some reason it didn't work.

More precisely, it worked only for few but not all.

12953_pastedImage_1.png

Didn't work for these ones.

12952_pastedImage_0.png

As you can see (sorry for the bad resolution), we have total three ID AL2063098, and I wanted row number 333 and 334 to be deleted but only row 334 was excluded/hided.

Below is the JSL I used.

// Exclude all but first row of multiples (sort by date to get the earliest...)

dt = Current Data Table();

dt << select where(Col Min(Row(), :ID, :Date) < Row()) << hide and exclude;

Highlighted
ms
Super User ms
Super User

Re: How to hide/exclude duplicate IDs except for the first ID?

Sorry, I was too quick to answer. The Date column should not be a grouping variable in this case.

Sort by date and try this:

dt << select where(Col Min(Row(), :ID) < Row()) << hide and exclude;

View solution in original post

Highlighted

Re: How to hide/exclude duplicate IDs except for the first ID?

Thank you so much, it worked!

If you have time, do you care to elaborate on the meaning of this script?

Or where can I find a way to learn how to use JSL?

Highlighted
ms
Super User ms
Super User

Re: How to hide/exclude duplicate IDs except for the first ID?

Sure, I'll try...


Row() is the "current" row.


Col Min(Row(), :ID) returns the smallest row number for for each :ID


dt << select where(...) iterates over all rows and selects those that match the condition inside the parentheses.

Here, the condition being tested is Col Min(Row(), :ID) < Row(). It evaluates to True for every row with a row number greater than the first row of each ID. Only the first instance of an ID, i.e. where Col Min(Row(), :ID) equals Row(), will evaluate to false and thus not be selected.


<< hide and exclude send a message to the the data table to hide and exclude all selected rows.


I learned JSL by reading the Scripting Guide (PDF book), interacting with users on this forum and loads of trial and error.

Also the Scripting Index (an excellent resource in the Help menu) has been indispensable for looking up syntax on the go.

But I think that most important for my JSL learning curve was my long experience of JMP. Without knowing JMP, JSL wouldn't have made much sense to me.

Highlighted

Re: How to hide/exclude duplicate IDs except for the first ID?

Dear MS,

now the syntax look like a language to mem but I guess I still have a long way to get

Hope you have a good day

Article Labels

    There are no labels assigned to this post.