Subscribe Bookmark RSS Feed

Search for Nearest Value

natalie_

Community Trekker

Joined:

Jan 6, 2016

Hi all,

I need to search for the nearest value of something in a column.  Is there a function (that I just can't find) to do this, or do I need to make my own routine to do so?

Thanks, Natalie

1 ACCEPTED SOLUTION

Accepted Solutions
Solution

Below are two ways to get the row(s) of the nearest value. The first returns a matrix of row numbers, and hence allows for ties. For the the second approach that uses the LocMin() function it is assumed that there is one unique nearest value; it will return the row number of the first instance only and neglecting any ties.

Names Default To Here(1);

dt = Open("$SAMPLE_DATA\Big Class.jmp");

col = Column("weight");

// Select all rows that are closest to a value

value = 99.5;

d = Abs((col << get values) - value);

rows = Loc(d == Min(d));


Show(rows);

dt << select rows(rows);

// Select first row that is closest to a value

value = 112;

rows = Loc Min(Abs((col << get values) - value));


Show(rows);

dt << select rows(rows);

8 REPLIES
txnelson

Super User

Joined:

Jun 22, 2012

Natalie,

Here is a simple function that I wrote that returns the value which is the closest to the target value.

Names Default To Here( 1 );

dt = open("$SAMPLE_DATA\Semiconductor Capability.jmp");

findClosest = Function( {value, col},

{a, R, aa, justAbove, justBelow, value, col},

a = Column( col ) << get values;

R = Rank( a );

aa = sorted = a[R];

b = 108;

justAbove = aa[Loc( aa > value )[1]];

justBelow = aa[Loc( aa < value )[N Rows( Loc( aa < b ) )]];

If( justAbove - value < value - justBelow,

justAbove,

);

);

show(minimum = findClosest(108,"NPN1"));

Jim
natalie_

Community Trekker

Joined:

Jan 6, 2016

Thank you Jim, this is great.

I can definitely use this.  However, I need to know the row number, so that I can get the corresponding value in the next column.

After using your function and finding the nearest value, should I search for that value in the original data to find the row number?

txnelson

Super User

Joined:

Jun 22, 2012

Or, you can create a matrix vector

  rowmat=index(1,N Rows(dt));

and then sort it using the same Rank matrix

rowmatsorted = rowmat;

and then it will contain the row numbers in the same order as your sorted value matrix, so when you get the nearest value position from the values matrix, the same position in the rowmatsorted matrix will have the row number

Jim
Solution

Below are two ways to get the row(s) of the nearest value. The first returns a matrix of row numbers, and hence allows for ties. For the the second approach that uses the LocMin() function it is assumed that there is one unique nearest value; it will return the row number of the first instance only and neglecting any ties.

Names Default To Here(1);

dt = Open("$SAMPLE_DATA\Big Class.jmp");

col = Column("weight");

// Select all rows that are closest to a value

value = 99.5;

d = Abs((col << get values) - value);

rows = Loc(d == Min(d));


Show(rows);

dt << select rows(rows);

// Select first row that is closest to a value

value = 112;

rows = Loc Min(Abs((col << get values) - value));


Show(rows);

dt << select rows(rows);

natalie_

Community Trekker

Joined:

Jan 6, 2016

Sorry for the slow response; I got caught up on something else.  I initially tried your second solution, but I can not get it to work at all.  I can't even get it to locate the minimum value of the column...  What's going on?  I noticed that my column was a string column, not a numeric column, so I converted it, but that didn't seem to help any.

column(dtFile1, "X_location")<<data type("Numeric");                

col = dtFile1 << Column("X_location");

rows = Loc Min(((col<<getValues)));

Craige_Hales

Staff

Joined:

Mar 21, 2013

use

col = Column( dtFile1, "X_location" );

JMP objects (like the data table in dtFile1) usually silently ignore messages (like <<column(...) ) that they don't understand.

Craige
natalie_

Community Trekker

Joined:

Jan 6, 2016

Thank you for your help!  It works.

Craige_Hales

Staff

Joined:

Mar 21, 2013

One more way using KDTable.

dt = Open( "$sample_data/big class.jmp" );

// create the input matrix for kdtable: [95 59, 123 61, 74 55, ...

data = (dt:weight << getasmatrix) || (dt:height << getasmatrix);

// create the lookup table

lookup = KDTable( data );

// do an example lookup: find the 10 nearest people to a weight of 100 pounds and height of 60

// inches within a distance of 5.  (5 is in the same "units" as weight and pounds...normalize?)

{rowNumbers, distances} = lookup << knearestrows( {10, 5}, [100 60] );

// display the results

For( irow = 1, irow <= N Cols( rowNumbers ), irow++,

  rowIndex = rowNumbers[irow];

  Write( dt:name[rowIndex], " w=", dt:weight[rowIndex], " h=", dt:height[rowIndex], " d=", distances[irow], "\!n" );

);

/*:

ALFRED w=99 h=64 d=4.12310562561766

CHRIS w=99 h=64 d=4.12310562561766

MARK w=104 h=62 d=4.47213595499958

KATIE w=95 h=59 d=5.09901951359278

so d=4.123 is sqrt((100-99)^2+(60-64)^2).  By the time Katie was discovered, the radius of 5 was exceeded, so the requested 10 results were limited by the radius.

KDTable works with K dimensional data; you might only need K=1 (just weight, for example) or K=3 (include age as well).  The scaling/normalizing issue is more obvious with age in this data, which has a much smaller range than height or weight.  Is the distance from 13 years old to 14 years old more or less than the distance from 75 pounds to 80 pounds?

edit: With just weight, it looks even simpler:

dt = Open( "$sample_data/big class.jmp" );

// create the lookup table

lookup = KDTable( dt:weight << getasmatrix );

// do an example lookup: find the 10 nearest people to a weight of 100 pounds

{rowNumbers, distances} = lookup << knearestrows( {10, 5}, [100] );

// display the results

For( irow = 1, irow <= N Cols( rowNumbers ), irow++,

  rowIndex = rowNumbers[irow];

  Write( dt:name[rowIndex], " w=", dt:weight[rowIndex], " d=", distances[irow], "\!n" );

);

/*:

ALFRED w=99 d=1

CHRIS w=99 d=1

JOHN w=98 d=2

MARK w=104 d=4

KATIE w=95 d=5

JOE w=105 d=5

MICHAEL w=95 d=5

CLAY w=105 d=5

DANNY w=106 d=6

Craige