cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
BHarris
Level VI

Finding best window of data

As a companion to my last question...

 

I need to identify the best window in a data table, e.g. from Samples/Data/Functional Data/Weekly Weather Data.jmp, I need to identify the X consecutive lines (moving window) with the highest TMAX by STATION.  Not sure whether to use the window with the highest average, or the lowest minimum, so I'd like to know how to do both.

 

As before, I'm sure there's a way to do this with JSL, but I'd prefer to use a function if possible.

 

I've tried a bunch of things with Col Moving Average() and Col Rank() but it somehow feels just out of reach.

 

1 ACCEPTED SOLUTION

Accepted Solutions
hogi
Level XII

Re: Finding best window of data

How about combining Col Max (to find the Peak) with Col Moving Average (to stretch the search window) 

 

In addition:

  • to mark the peak position, compare individual values with the maximum value
    quick & dirty:
    - maybe: 2 matches
    - maybe [but unlikely]: a wrong match with the max value of another station
  • for Weekly Weather Data, one has to invest some extra effort:
    for every station some weeks are missing  - so to search within the correct window, one has to inform Jmp about the missing data via split & stack (works as none of the weeks is completely missing).

hogi_1-1699198600823.png

 

dt = Open( "$SAMPLE_DATA/Functional Data/Weekly Weather Data.jmp" );

half_window_size = 2;

dtsplit = dt << Split(
	Split By( :week of year ),
	Split( :TAVG ),
	Group( :STATION ),
	Remaining Columns( Drop All )
);

myNames=dt split << get column names();
myNames= myNames[2::Nitems(myNames)];
dtstack = dtsplit <<
Stack(
	columns(myNames)
);

New Column( "window",Formula(Col Moving Average( :Data, 1, half_window_size, half_window_size, :STATION )));

New Column( "max_window",Formula( Col Maximum( :window, :STATION ) ));

New Column( "marker",Nominal,Formula( :window == :max_window ));

:Label << Set Data Type(Numeric) << set name ("week of year");

Graph Builder(
	Variables(
		X( :week of year ),
		Y( :Data ),
		Y( :window, Position( 1 ) ),
		Y( :max_window, Position( 1 ) ),
		Group X( :STATION, N View Levels( 1 )),
		Overlay( :marker )
	),
	Elements(
		Points( X, Y( 1 ), Overlay( 0 ) ),
		Points( X, Y( 2 )),
		Smoother( X, Y( 3 ) )
	)
)

 

View solution in original post

5 REPLIES 5
dale_lehman
Level VII

Re: Finding best window of data

I must be misunderstanding your question, but it sounds like you just want the column maximum by station - see the last column in the attached and the stored script Tabulating the results.  But I suspect you mean something else and I don't understand what you mean by "moving window."

BHarris
Level VI

Re: Finding best window of data

No, I'm looking to create a column where it has a "1" next to the X-consecutive rows, e.g. 4 consecutive rows, whose minimum value of another column across those 4 rows is the largest.

txnelson
Super User

Re: Finding best window of data

It would be helpful if you could provide an example of the end result of what you are looking for.

Jim
hogi
Level XII

Re: Finding best window of data

How about combining Col Max (to find the Peak) with Col Moving Average (to stretch the search window) 

 

In addition:

  • to mark the peak position, compare individual values with the maximum value
    quick & dirty:
    - maybe: 2 matches
    - maybe [but unlikely]: a wrong match with the max value of another station
  • for Weekly Weather Data, one has to invest some extra effort:
    for every station some weeks are missing  - so to search within the correct window, one has to inform Jmp about the missing data via split & stack (works as none of the weeks is completely missing).

hogi_1-1699198600823.png

 

dt = Open( "$SAMPLE_DATA/Functional Data/Weekly Weather Data.jmp" );

half_window_size = 2;

dtsplit = dt << Split(
	Split By( :week of year ),
	Split( :TAVG ),
	Group( :STATION ),
	Remaining Columns( Drop All )
);

myNames=dt split << get column names();
myNames= myNames[2::Nitems(myNames)];
dtstack = dtsplit <<
Stack(
	columns(myNames)
);

New Column( "window",Formula(Col Moving Average( :Data, 1, half_window_size, half_window_size, :STATION )));

New Column( "max_window",Formula( Col Maximum( :window, :STATION ) ));

New Column( "marker",Nominal,Formula( :window == :max_window ));

:Label << Set Data Type(Numeric) << set name ("week of year");

Graph Builder(
	Variables(
		X( :week of year ),
		Y( :Data ),
		Y( :window, Position( 1 ) ),
		Y( :max_window, Position( 1 ) ),
		Group X( :STATION, N View Levels( 1 )),
		Overlay( :marker )
	),
	Elements(
		Points( X, Y( 1 ), Overlay( 0 ) ),
		Points( X, Y( 2 )),
		Smoother( X, Y( 3 ) )
	)
)

 

BHarris
Level VI

Re: Finding best window of data

Turns out our particular problem was far more complicated than we originally realized, so we ended up coding up the processing in python since we're more familiar with that.

 

I'll flag @hogi's as the solution to close the topic.