Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Discussions
- :
- Search for Nearest Value

Topic Options

- Start Article
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

Highlighted

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jan 19, 2016 1:48 PM
(6669 views)

Hi all,

I need to search for the nearest value of something in a column. Is there a function (that I just can't find) to do this, or do I need to make my own routine to do so?

Thanks, Natalie

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Below are two ways to get the row(s) of the nearest value. The first returns a matrix of row numbers, and hence allows for ties. For the the second approach that uses the LocMin() function it is assumed that there is one unique nearest value; it will return the row number of the first instance only and neglecting any ties.

Names Default To Here**(****1****)**;

dt = Open**(**"$SAMPLE_DATA\Big Class.jmp"**)**;

col = Column**(**"weight"**)**;

// Select all rows that are closest to a value

value = **99.5**;

d = Abs**((**col << **get values****)** - value**)**;

rows = Loc**(**d == Min**(**d**))**;

Show**(**rows**)**;

dt << **select rows****(**rows**)**;

// Select first row that is closest to a value

value = **112**;

rows = Loc Min**(**Abs**((**col << **get values****)** - value**))**;

Show**(**rows**)**;

dt << **select rows****(**rows**)**;

8 REPLIES 8

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Search for Nearest Value

Natalie,

Here is a simple function that I wrote that returns the value which is the closest to the target value.

Names Default To Here**(** **1** **)**;

dt = open**(**"$SAMPLE_DATA\Semiconductor Capability.jmp"**)**;

findClosest = Function**(** **{**value, col**}**,

**{**a, R, aa, justAbove, justBelow, value, col**}**,

a = Column**(** col **)** << **get values**;

R = Rank**(** a **)**;

aa = sorted = a**[**R**]**;

b = **108**;

justAbove = aa**[**Loc**(** aa > value **)[****1]]**;

justBelow = aa**[**Loc**(** aa < value **)[**N Rows**(** Loc**(** aa < b **)** **)]]**;

If**(** justAbove - value < value - justBelow,

justAbove,

**)**;

**)**;

show**(**minimum = findClosest**(****108**,"NPN1"**))**;

Jim

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Search for Nearest Value

Thank you Jim, this is great.

I can definitely use this. However, I need to know the row number, so that I can get the corresponding value in the next column.

After using your function and finding the nearest value, should I search for that value in the original data to find the row number?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Search for Nearest Value

Or, you can create a matrix vector

rowmat=index(1,N Rows(dt));

and then sort it using the same Rank matrix

rowmatsorted = rowmat

and then it will contain the row numbers in the same order as your sorted value matrix, so when you get the nearest value position from the values matrix, the same position in the rowmatsorted matrix will have the row number

Jim

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Below are two ways to get the row(s) of the nearest value. The first returns a matrix of row numbers, and hence allows for ties. For the the second approach that uses the LocMin() function it is assumed that there is one unique nearest value; it will return the row number of the first instance only and neglecting any ties.

Names Default To Here**(****1****)**;

dt = Open**(**"$SAMPLE_DATA\Big Class.jmp"**)**;

col = Column**(**"weight"**)**;

// Select all rows that are closest to a value

value = **99.5**;

d = Abs**((**col << **get values****)** - value**)**;

rows = Loc**(**d == Min**(**d**))**;

Show**(**rows**)**;

dt << **select rows****(**rows**)**;

// Select first row that is closest to a value

value = **112**;

rows = Loc Min**(**Abs**((**col << **get values****)** - value**))**;

Show**(**rows**)**;

dt << **select rows****(**rows**)**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Search for Nearest Value

Sorry for the slow response; I got caught up on something else. I initially tried your second solution, but I can not get it to work at all. I can't even get it to locate the minimum value of the column... What's going on? I noticed that my column was a string column, not a numeric column, so I converted it, but that didn't seem to help any.

column**(**dtFile1, "X_location"**)**<<**data type****(**"Numeric"**)**;

col = dtFile1 << **Column(**"X_location"**)**;

rows = Loc Min**(((**col<<**getValues****)))**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Search for Nearest Value

use

col = Column( dtFile1, "X_location" );

JMP objects (like the data table in dtFile1) usually silently ignore messages (like <<column(...) ) that they don't understand.

Craige

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Search for Nearest Value

Thank you for your help! It works.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Re: Search for Nearest Value

One more way using KDTable.

dt = Open( "$sample_data/big class.jmp" );

// create the input matrix for kdtable: [95 59, 123 61, 74 55, ...

data = (dt:weight << getasmatrix) || (dt:height << getasmatrix);

// create the lookup table

lookup = KDTable( data );

// do an example lookup: find the 10 nearest people to a weight of 100 pounds and height of 60

// inches within a distance of 5. (5 is in the same "units" as weight and pounds...normalize?)

{rowNumbers, distances} = lookup << knearestrows( {10, 5}, [100 60] );

// display the results

For( irow = 1, irow <= N Cols( rowNumbers ), irow++,

rowIndex = rowNumbers[irow];

Write( dt:name[rowIndex], " w=", dt:weight[rowIndex], " h=", dt:height[rowIndex], " d=", distances[irow], "\!n" );

);

/*:

ALFRED w=99 h=64 d=4.12310562561766

CHRIS w=99 h=64 d=4.12310562561766

MARK w=104 h=62 d=4.47213595499958

KATIE w=95 h=59 d=5.09901951359278

so d=4.123 is sqrt((100-99)^2+(60-64)^2). By the time Katie was discovered, the radius of 5 was exceeded, so the requested 10 results were limited by the radius.

KDTable works with K dimensional data; you might only need K=1 (just weight, for example) or K=3 (include age as well). The scaling/normalizing issue is more obvious with age in this data, which has a much smaller range than height or weight. Is the distance from 13 years old to 14 years old more or less than the distance from 75 pounds to 80 pounds?

edit: With just weight, it looks even simpler:

dt = Open( "$sample_data/big class.jmp" );

// create the lookup table

lookup = KDTable( dt:weight << getasmatrix );

// do an example lookup: find the 10 nearest people to a weight of 100 pounds

{rowNumbers, distances} = lookup << knearestrows( {10, 5}, [100] );

// display the results

For( irow = 1, irow <= N Cols( rowNumbers ), irow++,

rowIndex = rowNumbers[irow];

Write( dt:name[rowIndex], " w=", dt:weight[rowIndex], " d=", distances[irow], "\!n" );

);

/*:

ALFRED w=99 d=1

CHRIS w=99 d=1

JOHN w=98 d=2

MARK w=104 d=4

KATIE w=95 d=5

JOE w=105 d=5

MICHAEL w=95 d=5

CLAY w=105 d=5

DANNY w=106 d=6

Craige