Discussions

natalie_ · Jan 19, 2016 04:48 PM

Hi all,

I need to search for the nearest value of something in a column. Is there a function (that I just can't find) to do this, or do I need to make my own routine to do so?

Thanks, Natalie

ms · Jan 20, 2016 06:04 PM

Below are two ways to get the row(s) of the nearest value. The first returns a matrix of row numbers, and hence allows for ties. For the the second approach that uses the LocMin() function it is assumed that there is one unique nearest value; it will return the row number of the first instance only and neglecting any ties.

Names Default To Here(1);

dt = Open("$SAMPLE_DATA\Big Class.jmp");

col = Column("weight");

// Select all rows that are closest to a value

value = 99.5;

d = Abs((col << get values) - value);

rows = Loc(d == Min(d));

Show(rows);

dt << select rows(rows);

// Select first row that is closest to a value

value = 112;

rows = Loc Min(Abs((col << get values) - value));

Show(rows);

dt << select rows(rows);

View solution in original post

txnelson · Jan 19, 2016 10:14 PM

Natalie,

Here is a simple function that I wrote that returns the value which is the closest to the target value.

Names Default To Here( 1 );

dt = open("$SAMPLE_DATA\Semiconductor Capability.jmp");

findClosest = Function( {value, col},

{a, R, aa, justAbove, justBelow, value, col},

a = Column( col ) << get values;

R = Rank( a );

aa = sorted = a[R];

b = 108;

justAbove = aa[Loc( aa > value )[1]];

justBelow = aa[Loc( aa < value )[N Rows( Loc( aa < b ) )]];

If( justAbove - value < value - justBelow,

justAbove,

);

show(minimum = findClosest(108,"NPN1"));

Jim

natalie_ · Jan 20, 2016 11:49 AM

Thank you Jim, this is great.

I can definitely use this. However, I need to know the row number, so that I can get the corresponding value in the next column.

After using your function and finding the nearest value, should I search for that value in the original data to find the row number?

txnelson · Jan 20, 2016 01:46 PM

Or, you can create a matrix vector

rowmat=index(1,N Rows(dt));

and then sort it using the same Rank matrix

rowmatsorted = rowmat;

and then it will contain the row numbers in the same order as your sorted value matrix, so when you get the nearest value position from the values matrix, the same position in the rowmatsorted matrix will have the row number

Jim

ms · Jan 20, 2016 06:04 PM

Below are two ways to get the row(s) of the nearest value. The first returns a matrix of row numbers, and hence allows for ties. For the the second approach that uses the LocMin() function it is assumed that there is one unique nearest value; it will return the row number of the first instance only and neglecting any ties.

Names Default To Here(1);

dt = Open("$SAMPLE_DATA\Big Class.jmp");

col = Column("weight");

// Select all rows that are closest to a value

value = 99.5;

d = Abs((col << get values) - value);

rows = Loc(d == Min(d));

Show(rows);

dt << select rows(rows);

// Select first row that is closest to a value

value = 112;

rows = Loc Min(Abs((col << get values) - value));

Show(rows);

dt << select rows(rows);

natalie_ · Jan 29, 2016 11:42 AM

Sorry for the slow response; I got caught up on something else. I initially tried your second solution, but I can not get it to work at all. I can't even get it to locate the minimum value of the column... What's going on? I noticed that my column was a string column, not a numeric column, so I converted it, but that didn't seem to help any.

column(dtFile1, "X_location")<<data type("Numeric");

col = dtFile1 << Column("X_location");

rows = Loc Min(((col<<getValues)));

Craige_Hales · Jan 29, 2016 12:48 PM

use

col = Column( dtFile1, "X_location" );

JMP objects (like the data table in dtFile1) usually silently ignore messages (like <<column(...) ) that they don't understand.

Craige

natalie_ · Jan 29, 2016 02:50 PM

Thank you for your help! It works.

Craige_Hales · Jan 21, 2016 09:07 AM

One more way using KDTable.

dt = Open( "$sample_data/big class.jmp" );

// create the input matrix for kdtable: [95 59, 123 61, 74 55, ...

data = (dt:weight << getasmatrix) || (dt:height << getasmatrix);

// create the lookup table

lookup = KDTable( data );

// do an example lookup: find the 10 nearest people to a weight of 100 pounds and height of 60

// inches within a distance of 5. (5 is in the same "units" as weight and pounds...normalize?)

{rowNumbers, distances} = lookup << knearestrows( {10, 5}, [100 60] );

// display the results

For( irow = 1, irow <= N Cols( rowNumbers ), irow++,

rowIndex = rowNumbers[irow];

Write( dt:name[rowIndex], " w=", dt:weight[rowIndex], " h=", dt:height[rowIndex], " d=", distances[irow], "\!n" );

);

/*:

ALFRED w=99 h=64 d=4.12310562561766

CHRIS w=99 h=64 d=4.12310562561766

MARK w=104 h=62 d=4.47213595499958

KATIE w=95 h=59 d=5.09901951359278

so d=4.123 is sqrt((100-99)^2+(60-64)^2). By the time Katie was discovered, the radius of 5 was exceeded, so the requested 10 results were limited by the radius.

KDTable works with K dimensional data; you might only need K=1 (just weight, for example) or K=3 (include age as well). The scaling/normalizing issue is more obvious with age in this data, which has a much smaller range than height or weight. Is the distance from 13 years old to 14 years old more or less than the distance from 75 pounds to 80 pounds?

edit: With just weight, it looks even simpler:

dt = Open( "$sample_data/big class.jmp" );

// create the lookup table

lookup = KDTable( dt:weight << getasmatrix );

// do an example lookup: find the 10 nearest people to a weight of 100 pounds

{rowNumbers, distances} = lookup << knearestrows( {10, 5}, [100] );

// display the results

For( irow = 1, irow <= N Cols( rowNumbers ), irow++,

rowIndex = rowNumbers[irow];

Write( dt:name[rowIndex], " w=", dt:weight[rowIndex], " d=", distances[irow], "\!n" );

);

/*:

ALFRED w=99 d=1

CHRIS w=99 d=1

JOHN w=98 d=2

MARK w=104 d=4

KATIE w=95 d=5

JOE w=105 d=5

MICHAEL w=95 d=5

CLAY w=105 d=5

DANNY w=106 d=6

Craige

Discussions

Search for Nearest Value

Re: Search for Nearest Value

Re: Search for Nearest Value

Re: Search for Nearest Value

Re: Search for Nearest Value

Re: Search for Nearest Value

Re: Search for Nearest Value

Re: Search for Nearest Value

Re: Search for Nearest Value

Re: Search for Nearest Value

Recommended Articles