cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar

Min Max Normalization

Hello,

I'm new to JMP and I have a simple question:

I have the attached kind of data.

Now I need the following.

For every item from TypeA and one Matrix2 (=GroupBy) --> Done by Sorting in the Table

I have a range of distance values (such as the once highlighted in bold in the csv)

and now I need a new column with the min-max normalized distance values: e.g. for row1:

(1296,732785897795 - min(1296,732785897795, 12394,933566637337, 128802,00879001617, 905,2143796067685)) / max(1296,732785897795, 12394,933566637337, 128802,00879001617, 905,2143796067685) - min(1296,732785897795, 12394,933566637337, 128802,00879001617, 905,2143796067685)

or simple: value - min / max - min

Could somebody tell me how I get this done esp. for every value in the entire datatable?!?!

Type;Matrix1;Matrix2;Distance

TypeA;PCA.FullMatrix.MatrixA.txtScalingFunction(1.0);PCA.FullMatrix.MatrixA.txt;1296,732785897795

TypeB;PCA.FullMatrix.MatrixA.txtScalingFunction(1.0);PCA.FullMatrix.MatrixA.txt;0,0

TypeA;PCA.FullMatrix.MatrixA.txtScalingFunction(10.0);PCA.FullMatrix.MatrixA.txt;12394,933566637337

TypeB;PCA.FullMatrix.MatrixA.txtScalingFunction(10.0);PCA.FullMatrix.MatrixA.txt;21014,10900351405

TypeA;PCA.FullMatrix.MatrixA.txtScalingFunction(100.0);PCA.FullMatrix.MatrixA.txt;128802,00879001617

TypeB;PCA.FullMatrix.MatrixA.txtScalingFunction(100.0);PCA.FullMatrix.MatrixA.txt;220418,79499411583

TypeA;PCA.FullMatrix.MatrixA.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixA.txt;905,2143796067685

TypeB;PCA.FullMatrix.MatrixA.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixA.txt;2213813,8588409424

TypeA;PCA.FullMatrix.MatrixB.txtScalingFunction(1.0);PCA.FullMatrix.MatrixB.txt;861,6510828882456

TypeB;PCA.FullMatrix.MatrixB.txtScalingFunction(1.0);PCA.FullMatrix.MatrixB.txt;0,0

TypeA;PCA.FullMatrix.MatrixB.txtScalingFunction(10.0);PCA.FullMatrix.MatrixB.txt;9523,238631725311

TypeB;PCA.FullMatrix.MatrixB.txtScalingFunction(10.0);PCA.FullMatrix.MatrixB.txt;13497,199942827225

TypeA;PCA.FullMatrix.MatrixB.txtScalingFunction(100.0);PCA.FullMatrix.MatrixB.txt;709,0857161283493

TypeB;PCA.FullMatrix.MatrixB.txtScalingFunction(100.0);PCA.FullMatrix.MatrixB.txt;147853,7547969818

TypeA;PCA.FullMatrix.MatrixB.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixB.txt;709,080782532692

TypeB;PCA.FullMatrix.MatrixB.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixB.txt;1491273,704208374

TypeA;PCA.FullMatrix.MatrixB0.txtScalingFunction(1.0);PCA.FullMatrix.MatrixB0.txt;1352,5252192541957

TypeB;PCA.FullMatrix.MatrixB0.txtScalingFunction(1.0);PCA.FullMatrix.MatrixB0.txt;0,0

TypeA;PCA.FullMatrix.MatrixB0.txtScalingFunction(10.0);PCA.FullMatrix.MatrixB0.txt;15944,243417739868

TypeB;PCA.FullMatrix.MatrixB0.txtScalingFunction(10.0);PCA.FullMatrix.MatrixB0.txt;31300,447243332863

TypeA;PCA.FullMatrix.MatrixB0.txtScalingFunction(100.0);PCA.FullMatrix.MatrixB0.txt;14921,962318569422

TypeB;PCA.FullMatrix.MatrixB0.txtScalingFunction(100.0);PCA.FullMatrix.MatrixB0.txt;342046,20464229584

TypeA;PCA.FullMatrix.MatrixB0.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixB0.txt;329390,0448797047

TypeB;PCA.FullMatrix.MatrixB0.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixB0.txt;3449066,6614990234

TypeA;PCA.FullMatrix.MatrixC.txtScalingFunction(1.0);PCA.FullMatrix.MatrixC.txt;2091,4913414521143

TypeB;PCA.FullMatrix.MatrixC.txtScalingFunction(1.0);PCA.FullMatrix.MatrixC.txt;0,0

TypeA;PCA.FullMatrix.MatrixC.txtScalingFunction(10.0);PCA.FullMatrix.MatrixC.txt;24644,81373512745

TypeB;PCA.FullMatrix.MatrixC.txtScalingFunction(10.0);PCA.FullMatrix.MatrixC.txt;26506,051447868347

TypeA;PCA.FullMatrix.MatrixC.txtScalingFunction(100.0);PCA.FullMatrix.MatrixC.txt;1175,9547229632735

TypeB;PCA.FullMatrix.MatrixC.txtScalingFunction(100.0);PCA.FullMatrix.MatrixC.txt;283024,64327049255

TypeA;PCA.FullMatrix.MatrixC.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixC.txt;1175,9547229632735

TypeB;PCA.FullMatrix.MatrixC.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixC.txt;2849641,921447754

TypeA;PCA.FullMatrix.MatrixD.txtScalingFunction(1.0);PCA.FullMatrix.MatrixD.txt;5667,218272924423

TypeB;PCA.FullMatrix.MatrixD.txtScalingFunction(1.0);PCA.FullMatrix.MatrixD.txt;0,0

TypeA;PCA.FullMatrix.MatrixD.txtScalingFunction(10.0);PCA.FullMatrix.MatrixD.txt;60630,825313568115

TypeB;PCA.FullMatrix.MatrixD.txtScalingFunction(10.0);PCA.FullMatrix.MatrixD.txt;49495,72706222534

TypeA;PCA.FullMatrix.MatrixD.txtScalingFunction(100.0);PCA.FullMatrix.MatrixD.txt;1222,9498270750046

TypeB;PCA.FullMatrix.MatrixD.txtScalingFunction(100.0);PCA.FullMatrix.MatrixD.txt;544452,9990234375

TypeA;PCA.FullMatrix.MatrixD.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixD.txt;1222,9498270750046

TypeB;PCA.FullMatrix.MatrixD.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixD.txt;5494025,822509766

TypeA;PCA.FullMatrix.MatrixE.txtScalingFunction(1.0);PCA.FullMatrix.MatrixE.txt;1803,4288639264996

TypeB;PCA.FullMatrix.MatrixE.txtScalingFunction(1.0);PCA.FullMatrix.MatrixE.txt;0,0

TypeA;PCA.FullMatrix.MatrixE.txtScalingFunction(10.0);PCA.FullMatrix.MatrixE.txt;23951,372221767902

TypeB;PCA.FullMatrix.MatrixE.txtScalingFunction(10.0);PCA.FullMatrix.MatrixE.txt;31137,94615727663

TypeA;PCA.FullMatrix.MatrixE.txtScalingFunction(100.0);PCA.FullMatrix.MatrixE.txt;1240,121220368892

TypeB;PCA.FullMatrix.MatrixE.txtScalingFunction(100.0);PCA.FullMatrix.MatrixE.txt;329015,66047382355

TypeA;PCA.FullMatrix.MatrixE.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixE.txt;1197,2512193241855

TypeB;PCA.FullMatrix.MatrixE.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixE.txt;3306628,45728302

TypeA;PCA.FullMatrix.MatrixF.txtScalingFunction(1.0);PCA.FullMatrix.MatrixF.txt;3855,985625088215

TypeB;PCA.FullMatrix.MatrixF.txtScalingFunction(1.0);PCA.FullMatrix.MatrixF.txt;0,0

TypeA;PCA.FullMatrix.MatrixF.txtScalingFunction(10.0);PCA.FullMatrix.MatrixF.txt;44549,40316963196

TypeB;PCA.FullMatrix.MatrixF.txtScalingFunction(10.0);PCA.FullMatrix.MatrixF.txt;49444,8531126976

TypeA;PCA.FullMatrix.MatrixF.txtScalingFunction(100.0);PCA.FullMatrix.MatrixF.txt;31119,507083684206

TypeB;PCA.FullMatrix.MatrixF.txtScalingFunction(100.0);PCA.FullMatrix.MatrixF.txt;541804,04129982

TypeA;PCA.FullMatrix.MatrixF.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixF.txt;200535,49309930205

TypeB;PCA.FullMatrix.MatrixF.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixF.txt;5463254,535614014

TypeA;PCA.FullMatrix.MatrixG.txtScalingFunction(1.0);PCA.FullMatrix.MatrixG.txt;8515,496073201299

TypeB;PCA.FullMatrix.MatrixG.txtScalingFunction(1.0);PCA.FullMatrix.MatrixG.txt;0,0

TypeA;PCA.FullMatrix.MatrixG.txtScalingFunction(10.0);PCA.FullMatrix.MatrixG.txt;94455,55345153809

TypeB;PCA.FullMatrix.MatrixG.txtScalingFunction(10.0);PCA.FullMatrix.MatrixG.txt;101235,704870224

TypeA;PCA.FullMatrix.MatrixG.txtScalingFunction(100.0);PCA.FullMatrix.MatrixG.txt;20724,949643336236

TypeB;PCA.FullMatrix.MatrixG.txtScalingFunction(100.0);PCA.FullMatrix.MatrixG.txt;1109786,8460083008

TypeA;PCA.FullMatrix.MatrixG.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixG.txt;881185,4669873863

TypeB;PCA.FullMatrix.MatrixG.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixG.txt;1,119510265991211E7

TypeA;PCA.FullMatrix.MatrixH.txtScalingFunction(1.0);PCA.FullMatrix.MatrixH.txt;1307,421814918518

TypeB;PCA.FullMatrix.MatrixH.txtScalingFunction(1.0);PCA.FullMatrix.MatrixH.txt;0,0

TypeA;PCA.FullMatrix.MatrixH.txtScalingFunction(10.0);PCA.FullMatrix.MatrixH.txt;10822,828064918518

TypeB;PCA.FullMatrix.MatrixH.txtScalingFunction(10.0);PCA.FullMatrix.MatrixH.txt;18757,292488336563

TypeA;PCA.FullMatrix.MatrixH.txtScalingFunction(100.0);PCA.FullMatrix.MatrixH.txt;736,497870028019

TypeB;PCA.FullMatrix.MatrixH.txtScalingFunction(100.0);PCA.FullMatrix.MatrixH.txt;205999,00681304932

TypeA;PCA.FullMatrix.MatrixH.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixH.txt;736,489783167839

TypeB;PCA.FullMatrix.MatrixH.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixH.txt;2078450,4611206055

TypeA;PCA.FullMatrix.MatrixI.txtScalingFunction(1.0);PCA.FullMatrix.MatrixI.txt;1565,8852154612541

TypeB;PCA.FullMatrix.MatrixI.txtScalingFunction(1.0);PCA.FullMatrix.MatrixI.txt;0,0

TypeA;PCA.FullMatrix.MatrixI.txtScalingFunction(10.0);PCA.FullMatrix.MatrixI.txt;13818,568996116519

TypeB;PCA.FullMatrix.MatrixI.txtScalingFunction(10.0);PCA.FullMatrix.MatrixI.txt;14653,831587553024

TypeA;PCA.FullMatrix.MatrixI.txtScalingFunction(100.0);PCA.FullMatrix.MatrixI.txt;752,5333097577095

TypeB;PCA.FullMatrix.MatrixI.txtScalingFunction(100.0);PCA.FullMatrix.MatrixI.txt;160965,3396615982

TypeA;PCA.FullMatrix.MatrixI.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixI.txt;752,5333097577095

TypeB;PCA.FullMatrix.MatrixI.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixI.txt;1624051,2430114746

TypeA;PCA.FullMatrix.MatrixJ.txtScalingFunction(1.0);PCA.FullMatrix.MatrixJ.txt;1749,2281637191772

TypeB;PCA.FullMatrix.MatrixJ.txtScalingFunction(1.0);PCA.FullMatrix.MatrixJ.txt;0,0

TypeA;PCA.FullMatrix.MatrixJ.txtScalingFunction(10.0);PCA.FullMatrix.MatrixJ.txt;14932,082843661308

TypeB;PCA.FullMatrix.MatrixJ.txtScalingFunction(10.0);PCA.FullMatrix.MatrixJ.txt;23836,434760332108

TypeA;PCA.FullMatrix.MatrixJ.txtScalingFunction(100.0);PCA.FullMatrix.MatrixJ.txt;152409,1852493286

TypeB;PCA.FullMatrix.MatrixJ.txtScalingFunction(100.0);PCA.FullMatrix.MatrixJ.txt;260808,26988983154

TypeA;PCA.FullMatrix.MatrixJ.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixJ.txt;477450,98512765765

TypeB;PCA.FullMatrix.MatrixJ.txtScalingFunction(1000.0);PCA.FullMatrix.MatrixJ.txt;2630440,6514434814

1 ACCEPTED SOLUTION

Accepted Solutions
ms
Super User (Alumni) ms
Super User (Alumni)

Re: Min Max Normalization

I suggest column formula that utilize Col Min() and Col Max() functions (with "group by" option).

A new formula column can be set up manually or by running the script below. To do it by hand just add a new numeric column, choose the formula property and construct the formula with the formula editor (or paste the formula inside Formula() below).

dt = Current Data Table(); //i.e your data table

dt << New Column( "Norm distance",

  Numeric,

  Continuous,

  Formula( (:Distance - Col Minimum( :Distance, :Type )) / (Col Maximum( :Distance, :Type ) - Col Minimum( :Distance, :Type )) )

);

View solution in original post

4 REPLIES 4
ms
Super User (Alumni) ms
Super User (Alumni)

Re: Min Max Normalization

I suggest column formula that utilize Col Min() and Col Max() functions (with "group by" option).

A new formula column can be set up manually or by running the script below. To do it by hand just add a new numeric column, choose the formula property and construct the formula with the formula editor (or paste the formula inside Formula() below).

dt = Current Data Table(); //i.e your data table

dt << New Column( "Norm distance",

  Numeric,

  Continuous,

  Formula( (:Distance - Col Minimum( :Distance, :Type )) / (Col Maximum( :Distance, :Type ) - Col Minimum( :Distance, :Type )) )

);

Re: Min Max Normalization

Hello,

thanks for your reply, but it was not exactly what I need

The challenge is that, you cannot take the col min/max for a type but for a type and (in pseudo code):

if(issubstring(matrix2,matrix1) --> eg Matrix1: PCA.FullMatrix.MatrixA.txtScalingFunction(10.0) Matrix 2: PCA.FullMatrix.MatrixA.txt

or all the bold marked in the csv.

Imagine you have a number of csv files in one... This is somehow the problem here, but I need to know how i would solve it in the future....

Re: Min Max Normalization

NO!!!! I was wrong... When I wrote my reply to MS I understood that the columnsorting was wrong I can take

:Distance - Col Minimum( :Distance, :Matrix2)) / (Col Maximum( :Distance, :Matrix2) - Col Minimum( :Distance, :Matrix2)) )

I was thinking too much.... The "Sorting" is independent of the problem...


Solved

ms
Super User (Alumni) ms
Super User (Alumni)

Re: Min Max Normalization

Great that it worked. I was not sure how you wanted to group the data. I just want to add that the column statistics formulas can have more than one "by" column if necessary. For example:

Col Mean(:Distance, :Type, :Matrix2).

And yes, sorting does not matter here.