Discussions

colez · Jun 8, 2023 5:28 PM

This is a pretty elementary question, but how should one go about analysis of a large human clinical research data set with multiple rows for a given participant?

Lets say there are columns titled "ParticipantID", "Visit", and "HasBeard" each containing categorical data.

For each ParticipantID (i.e. 001; 002; 003) there are multiple study Visits (i.e. V1; V2; V3 etc) with each Visit containing unique data points on various measures such as "HasBeard".

Thus there are multiple rows for a single Participant ID (i.e. Participant 001, V1; Participant 001, V2; Participant 001, V3 etc.)

SEE ATTACHED.

If I were to do an analysis on a certain column measure such as "HasBeard" that contains categorical data (0 = has no beard, 1 = has beard), and want to know the distribution of Participants that have a beard at V3 but exclude those that already had a beard at V1, what is the most efficient and elegant method?

Given the provided image, I want to only return Participant 002, once.

Do I want to combine a Data Filter and a Local Data Filter? Do I want to exclude certain rows first using the Exclude function?

What is the best way to leverage JMP PRO for this sort of basic work?

Chris_Kirchberg · Feb 8, 2021 06:44 PM

Hi @colez,

Looks like you are asking for a basic distributional type report. Are you looking to do number of patients with a beard or the number of times a patient has been seen to have a beard? In other words what kind of graph are you looking for? Visit in X axis with counts (unique or multiple) on Y axis? Is there only one event or multiple events? Maybe post a picture of what your are looking for.

I don't think JMP Pro is going to give you much of an advantage. If you have have structured data (and many), maybe JMP Clinical is the right tool for you since it is designed for clinical trials.

Clinical Data Analysis Software | JMP Clinical

Chris Kirchberg, M.S.²
Data Scientist, Life Sciences - Global Technical Enablement
JMP Statistical Discovery, LLC. - Denver, CO
Tel: +1-919-531-9927 ▪ Mobile: +1-303-378-7419 ▪ E-mail: chris.kirchberg@jmp.com
www.jmp.com

colez · Feb 8, 2021 06:57 PM

Hello @Chris_Kirchberg, thank you for the swift response!

I'm looking for the number of Participants who had a beard by their third visit (V3) but with the caveat that they did not already have one at their first visit (V1). (This premise humorously assumes that you can't lose a beard once you have one, so if they get a beard by V2, they will also have one at V3)

In other words, how many participants converted to having a beard over the course of the study? (The correct answer for my given example is 1.)

Notably in the image I attached, there are NINE rows total, but only THREE unique participants.

Out of those THREE participants, only Participant 002 actually meets the criteria I outlined.

Even though Participant 001 also had a beard at V3, (HasBeard = Yes), they do not meet the criteria because they already had a beard at V1.

So my query ought to return a value of 1, representing Participant 002 as a single participant despite their multiple visits/timepoints.

txnelson · Feb 8, 2021 07:39 PM

I would just use the Summary Platform and group the data based on Participant ID and HasBeard. Then all you have to do is to select all of the rows that are designated as HasBeard=No, and delete them. What is remaining is all of the unique ParticipantIDs that have Beards.

Jim

colez · Feb 8, 2021 09:04 PM

Right but this grouping would still include Participants who had a beard at V1, the crucial exclusionary criterion.

txnelson · Feb 8, 2021 10:13 PM

My error.

All you need to do to get a data table with just those who have grown a beard, is to create a new column and give it the below formula

curPart = :ParticipantID;
x = Current Data Table() << get rows where( :ParticipantID == curPart & :Visit == "V1" & :HasBeard == "No" );
y = Try( N Rows( x ) );
x = Current Data Table() << get rows where( :ParticipantID == curPart & :Visit == "V3" & :HasBeard == "Yes" );
z = Try( N Rows( x ) );
If( y == 1 & z == 1,
	"Yes",
	"No"
);

Then, just select all of the rows with a Yes for the new columns value, and then subset those rows into a new data table. If you then need to reduce the table to just one row per ParticipantID, you can then use the Summary Platform as I suggested.

Jim

Chris_Kirchberg · Feb 9, 2021 08:48 AM

Thanks @txnelson ,

I was about to suggest a where clause. It could even be put into a column formula nested within an If statement and put a 1 or a 0 for that row. That way, you might be able to plot it.

A crazy ideal is to plot it as is. Has Beard on Y, Visit on X. ID on Group Y. Then make a mosaic plot (Sorry @XanGregg, its ugly, but gets the point across) . Works OK, but if you have hundreds of patients or more, then, maybe not as good and a rule based approach like @txnelson would be better. Especially if your intent is to find them and then do something (like find more details, other patient demographics, as to what else is going on with them that might have given rise to a sudden appearance of a beard).

Chris Kirchberg, M.S.²
Data Scientist, Life Sciences - Global Technical Enablement
JMP Statistical Discovery, LLC. - Denver, CO
Tel: +1-919-531-9927 ▪ Mobile: +1-303-378-7419 ▪ E-mail: chris.kirchberg@jmp.com
www.jmp.com

dale_lehman · Feb 9, 2021 09:40 AM

I would first use Tabulate, with the ID as the rows and the V# as the columns. This would reformat the table more in lines with the questions you are asking (it is not a necessary step, but it makes the logic for your formulas and/or displays clearer). You will then have one row for each person, with a separate column for each visit - the body of the table can either indicate the number of rows with those two variables, or if a continuous variable could be the max, min, average, sum, etc of that continuous column. In the new reformatted data table, it should be relatively straightforward to use logical statements (such as IF, AND, OR) to create columns identifying aspects such as the first visit at which a person has a beard.

Discussions

How to consider multiple rows for a given Participant ID

Re: How to consider multiple rows for a given Participant ID

Re: How to consider multiple rows for a given Participant ID

Re: How to consider multiple rows for a given Participant ID

Re: How to consider multiple rows for a given Participant ID

Re: How to consider multiple rows for a given Participant ID

Re: How to consider multiple rows for a given Participant ID

Re: How to consider multiple rows for a given Participant ID

Recommended Articles