cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
  • Learn how to build custom Python data connectors and further customize JMP’s Data Connector Framework with the Python Data Connector Demo, available now in the JMP Marketplace!
  • See how to create experiments to support product design and ID useful product features. Register for June 12 webinar, 2pm US Eastern Time.

Discussions

Solve problems, and share tips and tricks with other JMP users.
Choose Language Hide Translation Bar
Jax
Jax
Level I

Calculating Percentage difference for a big amount of data

snipe.PNG

Hi everyone! 
I have attached a example file under this post.
I'm currently trying to calculate the percentage of difference,
which is given by (New Value- Original Value)/Original Value
In this case, Time 0 is taken as the reference ( Original Value).
 For example:
Time: 0, Part: 1 Value 1: 690.2
Time: 50, Part:1, Value1: 690.52
(690.52-690.2)/(690.2) = 0.046%
Time: 100, Part:1, Value1:689.25
(689.25-690.2)/(690.2)= -0.13%
The result I'm suppose to get is to have a new table with the Time, id, Part unchanged,
while having the new values of Value 1 and Value 2 in percentage format. 
I have thought of moving the Different Time values into different tables,
then compute it against the Time 0 and append it to a new table, but I have no clue on 
how to write it on script. 
I have written some script in the "Percentage Different Progress" file, but I'm stuck.
I really appreciate it if anyone could help me in this. 
Thank you so so much in advance.

Capture1.PNG

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Calculating Percentage difference for a big amount of data

The Value 2 results can easily replace the Value 1 results.

No you do not need to hard code the JSL for all of the columns.  I have expanded my previous code to illistrate how to run this on as many columns as there are in the data table.  I also made the assumptions that you would not want the original Value1 data changed for the Time=0 rows, but if that is not the case, then just remove the If() statement that is making that decision

Names Default To Here( 1 );
dt = Current Data Table();

// Create a subset with only the data where Time = 0
dt2 = dt << subset( Rows( dt << get rows where( :Time == 0 ) ), selected columns( 0 ) );

// Delete the Time column.  In the join step below, it
// will write on top of the original Time column if
// it remains in the subsetted data table
dt2 << delete columns( "Time" );

// Create a list of the column names so all of them can be 
// changed to a new name for the Join below.  Eliminate
// the columns "id" and "Part" from the name change since
// they will be used in the column matching in the Join.
divisorColNames = dt2 << get column names( string );
For( i = N Items( divisorColNames ), i >= 1, i--,
	If( divisorColNames[i] == "id" | divisorColNames[i] == "Part",
		divisorColNames = Remove( divisorColNames, i, 1 )
	)
);

// Rename all of the divisor columns by "adding divisor_"
// in front of each name
For( i = 1, i <= N Items( divisorColNames ), i++,
	Column( dt2, divisorColNames[i] ) << set name(
		"divisor_" || Char( Column( dt2, divisorColNames[i] ) << get name )
	)
);

// Join the data tables.  All of the matching id and part
// values will be joined to all rows with the same values
dtJoin = dt << Join(
	With( dt2 ),
	Merge Same Name Columns,
	Match Flag( 0 ),
	By Matching Columns( :id = :id, :Part = :Part ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 1, 0 ),
	Preserve main table order( 1 )
);

// The dt2 data table is no longer needed so close it
Close( dt2, nosave );

// Loop through all of the columns and create the new values
// unless the Time value is 0
For( theCol = 1, theCol <= N Items( divisorColNames ), theCol++,
	For( theRow = 1, theRow <= N Rows( dtJoin ), theRow++,
		If( dtJoin:Time[theRow] != 0,
			Column( dtJoin, divisorColNames[theCol] )[theRow] = (
			Column( dtJoin, divisorColNames[theCol] )[theRow]
			-Column( dtJoin, "divisor_" || divisorColNames[theCol] )[theRow]) /
			Column( dtJoin, "divisor_" || divisorColNames[theCol] )[theRow]
		)
	)
);

// Delete all of the no longer needed divisor columns
For( i = 1, i <= N Items( divisorColNames ), i++,
	dtJoin << delete columns( "divisor_" || divisorColNames[i] )
);
Jim

View solution in original post

4 REPLIES 4
txnelson
Super User

Re: Calculating Percentage difference for a big amount of data

If I understand what you want, I think this is the simplest way to do what you want.

dt = Current Data Table();

dt2 = dt << subset(
	Rows( dt << get rows where( Time == 0 ) ),
	columns( {"id", "Part", "Value1"} )
);

dt2:Value1 << set name( "divisor" );

dtJoin = dt << Join(
	With( dt2 ),
	Merge Same Name Columns,
	Match Flag( 0 ),
	By Matching Columns( :id = :id, :Part = :Part ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 1, 0 ),
	Preserve main table order( 1 )
);

Close( dt2, nosave );

dtJoin << New Column( "Value2",
	Numeric,
	"Continuous",
	Format( "Percent", 10, 4 ),
	Formula( (:Value1 - :divisor) / :divisor )
);

dtJoin:Value2 << delete property( "formula" );
dtJoin << delete columns( "divisor" );
Jim
Jax
Jax
Level I

Re: Calculating Percentage difference for a big amount of data

Hi Jim!
Thank you so much for the help. 
I'm wondering if there is a way 
1) to Replace the Value2 (results) in the Column Value 1 initial cells.
2) if there are actually more columns ( like 30)  that I would need to do this operation(percentage difference),
do I need to hardcode code every new column, or is there a way I could multiply or for loop the columns creation.
 

txnelson
Super User

Re: Calculating Percentage difference for a big amount of data

The Value 2 results can easily replace the Value 1 results.

No you do not need to hard code the JSL for all of the columns.  I have expanded my previous code to illistrate how to run this on as many columns as there are in the data table.  I also made the assumptions that you would not want the original Value1 data changed for the Time=0 rows, but if that is not the case, then just remove the If() statement that is making that decision

Names Default To Here( 1 );
dt = Current Data Table();

// Create a subset with only the data where Time = 0
dt2 = dt << subset( Rows( dt << get rows where( :Time == 0 ) ), selected columns( 0 ) );

// Delete the Time column.  In the join step below, it
// will write on top of the original Time column if
// it remains in the subsetted data table
dt2 << delete columns( "Time" );

// Create a list of the column names so all of them can be 
// changed to a new name for the Join below.  Eliminate
// the columns "id" and "Part" from the name change since
// they will be used in the column matching in the Join.
divisorColNames = dt2 << get column names( string );
For( i = N Items( divisorColNames ), i >= 1, i--,
	If( divisorColNames[i] == "id" | divisorColNames[i] == "Part",
		divisorColNames = Remove( divisorColNames, i, 1 )
	)
);

// Rename all of the divisor columns by "adding divisor_"
// in front of each name
For( i = 1, i <= N Items( divisorColNames ), i++,
	Column( dt2, divisorColNames[i] ) << set name(
		"divisor_" || Char( Column( dt2, divisorColNames[i] ) << get name )
	)
);

// Join the data tables.  All of the matching id and part
// values will be joined to all rows with the same values
dtJoin = dt << Join(
	With( dt2 ),
	Merge Same Name Columns,
	Match Flag( 0 ),
	By Matching Columns( :id = :id, :Part = :Part ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 1, 0 ),
	Preserve main table order( 1 )
);

// The dt2 data table is no longer needed so close it
Close( dt2, nosave );

// Loop through all of the columns and create the new values
// unless the Time value is 0
For( theCol = 1, theCol <= N Items( divisorColNames ), theCol++,
	For( theRow = 1, theRow <= N Rows( dtJoin ), theRow++,
		If( dtJoin:Time[theRow] != 0,
			Column( dtJoin, divisorColNames[theCol] )[theRow] = (
			Column( dtJoin, divisorColNames[theCol] )[theRow]
			-Column( dtJoin, "divisor_" || divisorColNames[theCol] )[theRow]) /
			Column( dtJoin, "divisor_" || divisorColNames[theCol] )[theRow]
		)
	)
);

// Delete all of the no longer needed divisor columns
For( i = 1, i <= N Items( divisorColNames ), i++,
	dtJoin << delete columns( "divisor_" || divisorColNames[i] )
);
Jim
Jax
Jax
Level I

Re: Calculating Percentage difference for a big amount of data

Thank you so much Jim !

Recommended Articles