Subscribe Bookmark RSS Feed

Transposing from short form to long form data

benson_munyan

Community Trekker

Joined:

Dec 17, 2014

Hello Everyone,

I am trying to figure out a way to reorganize my data from a single row per subject that contains data for two trials into two rows for each subject, each containing data from a single trial. My analyses will all be repeated measures based. I am able to get this done at an individual variable, but when I try to get more complicated, it fails. For instance, I can get 2 rows per subject with both scores from an item, but if I do two variables, I end up with 4 rows, instead of 2 rows and 2 columns.

I have looked at the documentation and tried fiddling around with this, but haven't had any luck. If anyone has any ideas, I would be much appreciative.

1 ACCEPTED SOLUTION

Accepted Solutions
Solution

Hi benson.munyan,

You can do this with a "multiple series stack." For instance, in the example dataset below I have 4 subjects measured on two "series" -- variable 1 and variable 2 -- each measured at two time-points.

7796_Screen Shot 2014-12-18 at 9.46.01 AM.png

If I want to stack these data such that I have 8 rows, 2 for each subject, with separate columns for variable 1 and variable 2, we can check the box in Tables > Stack for "Multiple Series Stack." In this case, we have 2 series, and the columns are contiguous; that is, the multiple observations for each variable are next to each other, rather than interleaved (e.g. we don't have the columns like this: Var1-Time1, Var2-Time1, Var1-Time2, Var2-Time2).

Here is the set up for Tables > Stack.

7797_Screen Shot 2014-12-18 at 9.46.15 AM.png

And here is the dataset this will produce:

7798_Screen Shot 2014-12-18 at 9.46.25 AM.png

To clean things up, I would rename "Data" to be "Var 1", and "Data 2" to be "Var 2." Then, I would select the "Label" column, and invoke Cols > Recode, and recode each level to be either Time 1 or Time 2, so you will have a variable identifying the time points. "Label 2" is redundant and can be deleted.

I hope this helps!

julian

8 REPLIES
Phil_Brown

Super User

Joined:

Mar 20, 2012

Can you post an example of what you're trying to do? Depending on the data in each row, there may be opportunities to partition based on some delimiter.

PDB
benson_munyan

Community Trekker

Joined:

Dec 17, 2014

Sure. Here's a screenshot of what I'm working with.

Data.jpg

This is an export from Qualtrics. I have one (1) row per subject. Each subject goes through two research trials (T1 and T2). As such, I have to columns for each variable (For instance, IPQ_T1_Score (The score for trial 1) and IPQ_T2_Score (The IPQ score for trial 2)). Instead of have two columns of data for each variable, I would like to have one column for each variable and instead have two rows per subject.

So, Instead of IPQ_T1_Total Score and IPQ_T2_Total Score, I would simply have IPQ Score that is delineated by another column Trial (1 and 2). In short, I want to go from this:

Image%2B2.jpg

To this:

Image%2B4.png

Solution

Hi benson.munyan,

You can do this with a "multiple series stack." For instance, in the example dataset below I have 4 subjects measured on two "series" -- variable 1 and variable 2 -- each measured at two time-points.

7796_Screen Shot 2014-12-18 at 9.46.01 AM.png

If I want to stack these data such that I have 8 rows, 2 for each subject, with separate columns for variable 1 and variable 2, we can check the box in Tables > Stack for "Multiple Series Stack." In this case, we have 2 series, and the columns are contiguous; that is, the multiple observations for each variable are next to each other, rather than interleaved (e.g. we don't have the columns like this: Var1-Time1, Var2-Time1, Var1-Time2, Var2-Time2).

Here is the set up for Tables > Stack.

7797_Screen Shot 2014-12-18 at 9.46.15 AM.png

And here is the dataset this will produce:

7798_Screen Shot 2014-12-18 at 9.46.25 AM.png

To clean things up, I would rename "Data" to be "Var 1", and "Data 2" to be "Var 2." Then, I would select the "Label" column, and invoke Cols > Recode, and recode each level to be either Time 1 or Time 2, so you will have a variable identifying the time points. "Label 2" is redundant and can be deleted.

I hope this helps!

julian

julian

Staff

Joined:

Jun 25, 2014

Also: if your analysis involves several within- and between-subject factors you might get some use out of the full-factorial repeated measures add-in I posted in the file exchange a little while ago. This add-in generates the mixed-model terms for the repeated measures analysis and then invokes Fit Model.

Full Factorial Repeated Measures ANOVA Add-In

julian

benson_munyan

Community Trekker

Joined:

Dec 17, 2014

@Julian

Thank you! That is perfect! With respect to your add in, I believe I already have that installed. Someone there was kind enough to bring it to my attention. In fact, that is one of the big reasons I've been trying to sort this out.

Is there a way that I can add a variable to each case to obviously state which case is which?

julian

Staff

Joined:

Jun 25, 2014

Hi benson.munyan,

You're welcome! I'm glad this will work for you.

Regarding the variable you want to add, what do you want to identify about each case? You should always retain the subject identifier, which stacking will do as long as you retain all the unstacked columns, so you'll know which subject the data in each row is from. Identifying the time (or condition, or whatever else identifies the repeated measurements) should be as simple as retaining one of those label columns that you recode. Together, those two columns (subject ID and stacking group identifier) allow you to recreate the original table using Tables > Split, so nothing as far as identifying cases is lost in this restructuring. Is there something else you want to capture for each row?

Julian

benson_munyan

Community Trekker

Joined:

Dec 17, 2014

Julian,

I've tried to run your add on twice and it has crashed JMP both times. Would you like an error log?

julian

Staff

Joined:

Jun 25, 2014

I'm sorry to hear that! Yes, please provide the log if you can, and what version of JMP and operating system you're having trouble on.

Thanks!