cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Blanco15
Level I

creating new variables derived from multiple rows per ID

Hello ,

 

I am trying to create one variable (type2) from multiple rows per ID using variable type. I want to covert a table like this one:

ID

type

1

A

1

A

1

B

2

A

2

A

3

B

3

B

3

B

 

To:

ID

type2

1

C

2

A

3

B

 

Where type2 is:  

If type is always A then type2=A;

If type is always B then type2=B;

If type is a combination of A and B then type2=C;

 

Thanks

2 REPLIES 2
jthi
Super User

Re: creating new variables derived from multiple rows per ID

Many ways to do this. I would calculate how many times type appears in ID and if it is the same amount as ID appears in data, then you know it is the only type.

 

Summary with formula:

Names Default To Here(1);

dt = New Table("From",
	Add Rows(8),
	New Column("ID", Numeric, "Nominal", Format("Best", 12), Set Values([1, 1, 1, 2, 2, 3, 3, 3])),
	New Column("type", Character(16), "Nominal", Set Values({"A", "A", "B", "A", "A", "B", "B", "B"}))
);

dt_summary = dt << Summary(
	Output table("To"),
	Group(:ID),
	N(:type),
	Subgroup(:type),
	Freq("None"),
	Weight("None"),
	statistics column name format("column"),
	Link to original data table(0)	
);

dt_summary << New Column("type2", Character, "Nominal", 
	<< Set Each Value(
		If(:"type, A"n == 0, "B",
			:"type, B"n	== 0, "A",
			"C"
		)
	)
);

And other using Col Sum with Select duplicated rows:

Names Default To Here(1);
dt = New Table("From",
	Add Rows(8),
	New Column("ID", Numeric, "Nominal", Format("Best", 12), Set Values([1, 1, 1, 2, 2, 3, 3, 3])),
	New Column("type", Character(16), "Nominal", Set Values({"A", "A", "B", "A", "A", "B", "B", "B"}))
);
dt << New Column("type2", Character, Nomimal, Formula(
	If(Col Sum(1, :ID) == Col Sum(1, :ID, :type),
		:type,
		"C"
	)
));
wait(1);
dt:type2 << Delete Formula;
wait(1);
dt << Select Duplicate Rows(Match(:ID, :type2));
wait(1);
dt << Delete Rows;

See Scripting Index for more information about the functions especially Col Sum.

 

-Jarmo

Re: creating new variables derived from multiple rows per ID

Hello,

 

It looks like your table may be sorted by :ID and :type. If it is, or if you are willing to sort it, you can take advantage of this and use a (relatively) simple formula for the :type2 column:

If( :type[Col Min( Row(), :ID )] == :type[Col Max( Row(), :ID )],
	:type,
	"C"
)

Cheers,

Brady