Choose Language Hide Translation Bar
Highlighted
Newbie2Jumpie
Contributor

How to get from JMP table format WIDE to LONG. With some twists... [JSL, scripting]

Hi,

 

Simply put, I just need to get from JMP format WIDE to LONG.

I'll use BIG CLASS to outline the basic idea:

 

dt = Open( "$SAMPLE_DATA/big class.jmp" );

Basic objective: Structure BEFORE

NAME AGE SEX HEIGHT WEIGHT
KATIE 12 F 59 95
LOUISE 12 F 61 123
JANE 12 F 55 74
JACLYN 12 F 66 145

Structure AFTER (let's see NAME and SEX as IDs)

NAME SEX SOURCE_COL VALUE
KATIE F AGE 12 
LOUISE F AGE 12 
JANE F AGE 12 
JACLYN F AGE 12 
KATIE F HEIGHT 59 
LOUISE F HEIGHT 61 
JANE F HEIGHT 55 
JACLYN F HEIGHT 66 
KATIE F WEIGHT 95
LOUISE F WEIGHT 123
JANE F WEIGHT 74
JACLYN F WEIGHT 145...

Now comes the "twist" part. Unfortunately, life is complicated.

Also, the data situation is a bit more complicated. Fictional rows added to original BIG CLASS contents..
Structure dataset BEFORE

NAME AGE SEX HEIGHT WEIGHT
KATIE 12 F 59 95
KATIE 12 F 61 105 <-
LOUISE 12 F 61 123
JANE 12 F 55 74
JANE 12 F 65 84 <-
JACLYN 12 F 66 145

interim processing (aggregating rows)

NAME AGE SEX HEIGHT WEIGHT NO_ROWS
KATIE 12 F 60 100 2
LOUISE 12 F 61 123 1
JANE 12 F 60 79 2
JACLYN 12 F 66 145 1

Structure dataset AFTER (let's see NAME and SEX as IDs)

NAME SEX SOURCE_COL VALUE 
KATIE F AGE 12 1
LOUISE F AGE 12 1
JANE F AGE 12 1
JACLYN F AGE 12 1
KATIE F HEIGHT 60 <- 2
LOUISE F HEIGHT 61 1
JANE F HEIGHT 60 <- 2
JACLYN F HEIGHT 66 1
KATIE F WEIGHT 95 1
LOUISE F WEIGHT 123 1
JANE F WEIGHT 79 <- 2
JACLYN F WEIGHT 145 1

I wrote some JSL code based on BIG CLASS. The first rows are working, but for the later rows I'd appreciate your support.

I have provided the relevant logical steps for your convenience, see (1) to (5) .

 //*********************************************************///
// Create example data tables to be used in the merging
names default to here( 1 );
dt = Open( "$SAMPLE_DATA/big class.jmp" );
	ColmName = dt << get column names(character, numeric);
	show(ColmName) ; 

For( i = 4, i <= N Cols( dt ), i++,
	current data table(dt);
	dt2 = dt << subset( selected rows( 0 ), columns( :NAME, :SEX, Column( i ) ) ));
	dt2 << New Column( "SOURCE_VOL", character, set each value(Column( i )) ) ;
    Column( dt2, 3) << set name( "Value" );

wait(0);


// (1) Get all of the character columns for concatenating
	ColmName = dt << get column names(character, numeric);
	show(ColmName) ; 

// (2) Splitting in subsets per column, e.g. one for AGE, then HEIGHT, then WEIGHT  //
BIG_CLASS_(ColmName[1]) = BIG_CLASS <<  Subset(all rows, columns(:NAME, AGE (:Name(:ColmName[1])) ), Output Table Name("BIG_CLASS_(ColmName[1])") ) ;

// (3) Adding level info to every subset //
	          BIG_CLASS_(ColmName[1])  << new column("SOURCE_VOL", Character, Nominal, set each value("ColmName[1]");  
			  imo_pick_1 =  :"ColmName[1]" << set name ("Value") ; 

// (4) Aggregate per subset           //
	  BIG_CLASS_sub_(ColmName[1]) =   BIG_CLASS_(ColmName[1]) <<
	   Summary(  Group(:NAME, :SEX),
	 
			 sum( :AGE ),

				Link to original data table( 0 ),
				Freq( "None" ),  
				Weight( "None" ) 
				
			 );   
			 
// (5) Concatenate allsubsets to build LONG data structure //
BIG_CLASS_demo =Data Table(BIG_CLASS_sub_(ColmName[1]))  <<
		Concatenate(
			Data Table( "BIG_CLASS_sub_(ColmName[2])" ),
			Data Table( "BIG_CLASS_sub_(ColmName[3])" ),
					 Create source column) ;
					 

My actual WIDE table has about 100.000s of data lines and approx. 100 of columns. All in all JMP may expect to handle +10 million data rows, until it arrives at the LONG format. 

I welcome every advice.

 

Thank you very much!

Newbie

0 Kudos
1 REPLY 1
Highlighted
ms
Super User ms
Super User

Re: How to get from JMP table format WIDE to LONG. With some twists... [JSL, scripting]

Not sure if I have undestood the problem correctly, but it looks like Summary() and Stack() could do the job.

 

Does this produce what you're after? (sorting will be different)


Names Default To Here(1);

//example table
dt = New Table("Class", 

    New Column("name", Character, "Nominal", Set Values({"KATIE", "KATIE", "LOUISE", "JANE", "JANE", "JACLYN"})),
    New Column("age", Numeric, "Ordinal", Set Values([12, 12, 12, 12, 12, 12])),
    New Column("sex", Character, "Nominal", Set Values({"F", "F", "F", "F", "F", "F"})),
    New Column("height", Numeric, "Continuous", Set Values([59, 61, 61, 55, 65, 66])),
    New Column("weight", Numeric, "Continuous", Set Values([95, 105, 123, 74, 84, 145]))
);

//Summary
dtsum = dt << Summary(Group(:name, :sex), Mean(:age), Mean(:height), Mean(:weight), statistics column name format("column"));

// Stack summary table
dtstack = dtsum << Stack(columns(:age, :height, :weight), Source Label Column("Source_Col"), Stacked Data Column("Value"));
0 Kudos