33909
Level I

Make JSL script more efficient for sampling a 'sliding window'

Hello,

 

I'm trying to create a script that samples a data table with a 'sliding window': it takes a set number of rows (the window 'width'), then moves on one row and takes the next set, giving rows 1-10, 2-11, 3-12 and so forth. The result is a new data set that I can use for pattern analysis.
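
To make the window layout concrete, here is a small illustration that only writes the row range of each window to the log. The 12-row table size and the names `nRowsDemo` and `width` are made up for this example and are not part of the script below.

```jsl
Names Default To Here(1);

// Illustration only: hypothetical 12-row table, window width of 10
nRowsDemo = 12;
width = 10;

// One window per starting row, as long as a full window still fits
For(i = 1, i <= nRowsDemo - width + 1, i++,
	Write(
		"Window " || Char(i) || ": rows " || Char(i) || "-" ||
		Char(i + width - 1) || "\!N"
	)
);
```

With these values there are three complete windows (rows 1-10, 2-11 and 3-12), and in general a table of n rows gives n - width + 1 windows.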

 

My script works, but it doesn't cope well with larger data tables: the analysis is slow (>1 h) or even crashes JMP. Are there any suggestions for how to improve it?

 

Thanks!

// Define the window size (number of rows) and the increment (how much the standardised time should increase by)
windowSize = 9;
increment = 91;

// Open the data table, pad it with extra rows, and get the number of rows
dt = Open("$SAMPLE_DATA/Time Series/GNP.jmp");
dt << Add Rows(windowSize);
numRows = N Rows(dt);

// Create a list to store the subsets
subsets = {};
standTimeMatrix = J(windowSize, 1, 0);  // Pre-allocate matrix for Stand Time

// Fill the Stand Time matrix
dt << begin data update;
For(j = 1, j <= windowSize, j++,
    standTimeMatrix[j, 1] = 1 + (j - 1) * increment;
);

// Loop through the data table to create the sliding windows
For(i = 1, i <= numRows - windowSize + 1, i++,
    // Get the subset of data for the current window
    subset = dt << Subset(Rows(i::i + windowSize - 1));

    // Add the window number to the subset
    windowNumberColumn = Repeat(i, windowSize);
    subset << New Column("Window Number", Numeric, Continuous, Set Values(windowNumberColumn));

    // Add the "Standardised Time" column to the subset
    subset << New Column("Standardised Time", Numeric, Continuous, Set Values(standTimeMatrix));

    // Add the subset to the list
    Insert Into(subsets, subset);
);


// Create a new table for concatenation
newTable = New Table("All Data");


// Run the concatenations using the subsets list

For(i = 1, i <= N Items(subsets), i++,
    newTable << Concatenate(
        subsets[i],
        append to first table(1)
    );
    // Close the subset table after concatenation to free up memory
    Close(subsets[i], No Save);
);



2 REPLIES
jthi
Super User

Re: Make JSL script more efficient for sampling a 'sliding window'

I'm not sure whether this does what you want (it gives a different result from your script), or what counts as a "large database", but maybe something like this would work:

Names Default To Here(1);

size = 10;

dt = Open("$SAMPLE_DATA/Time Series/GNP.jmp", Invisible);

dt_results = dt << Clone;
dt_results << Show Window(0);

dt_results << Delete Rows(1::N Rows(dt_results));
dt_results << New Column("Idx", Numeric, Ordinal);
dt_results << New Column("Rows", Numeric, Ordinal);

cols = dt << Get Column Names("String");

i = 0;
While(i + size <= N Rows(dt),
	old_rows = (1+i)::(i+size);
	dt_temp = dt << Subset(Rows(old_rows), Selected Columns(0), Invisible);
	dt_temp << New Column("Idx", Numeric, Ordinal, Set Each Value(i + 1));
	dt_temp << New Column("Rows", Numeric, Ordinal, Values(old_rows));
	
	dt_results << Concatenate(dt_temp,
		"Append to first table"
	);
	Close(dt_temp, no save);
	
	i++;
);

dt_results << Show Window(1);
-Jarmo
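
Another option, offered only as a rough sketch and not tested on large tables: both scripts above create, concatenate and close one data table per window, and that table handling is likely where most of the time goes. If only the numeric columns are needed, the window rows can be looked up in a matrix instead and written to a table once. The values of `windowSize` and `increment` are placeholders taken from the question, and the names `rowIdx`, `winNum` and `stdTime` are made up for this example.

```jsl
Names Default To Here(1);

// Placeholder settings, adjust as needed
windowSize = 10;
increment = 91;

dt = Open("$SAMPLE_DATA/Time Series/GNP.jmp", Invisible);
n = N Rows(dt);
k = n - windowSize + 1; // number of complete windows

// Numeric columns only: Get As Matrix leaves out character columns
m = dt << Get As Matrix;

// Pre-allocate the index vectors, then fill them in memory
rowIdx = J(k * windowSize, 1, 0);  // source row for each output row
winNum = J(k * windowSize, 1, 0);  // window number
stdTime = J(k * windowSize, 1, 0); // standardised time within the window
pos = 0;
For(i = 1, i <= k, i++,
	For(j = 1, j <= windowSize, j++,
		pos++;
		rowIdx[pos] = i + j - 1;
		winNum[pos] = i;
		stdTime[pos] = 1 + (j - 1) * increment;
	)
);

// A single matrix lookup replaces the per-window Subset calls;
// repeated row indices are allowed when subscripting a matrix
result = m[rowIdx, 0] || winNum || stdTime;

// Build the output table once
dtOut = As Table(result);
dtOut << Set Name("All Data");

Close(dt, No Save);
```

The output columns get generic default names and would need renaming. The last two columns are the window number and the standardised time. Character columns are lost with this approach, so it only fits if the pattern analysis uses numeric data.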
hogi
Level XII

Re: Make JSL script more efficient for sampling a 'sliding window'

Maybe it's not necessary to create the auxiliary table?

Col Moving Average provides some settings to compare values within a sliding window:

[Screenshot: hogi_0-1722339177462.png]

What are your next steps?
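
As a rough sketch of the Col Moving Average idea, a formula column can compute a statistic over the window directly in the original table, with no auxiliary table. The column name `:GNP`, the new column name and the window of nine rows before the current row are assumptions for this example. The arguments are understood here as (column, weighting, rows before, rows after), with a weighting of 1 meaning equal weights; please check the Scripting Index entry for the exact meaning in your JMP version.

```jsl
Names Default To Here(1);

dt = Open("$SAMPLE_DATA/Time Series/GNP.jmp");

// Assumption: the table has a numeric column named GNP.
// Window = current row plus the 9 rows before it (10 rows in total),
// equally weighted.
dt << New Column("GNP Moving Avg",
	Numeric,
	Continuous,
	Formula(Col Moving Average(:GNP, 1, 9, 0))
);
```

This only helps if the pattern analysis can work from a per-row summary of each window. If it needs the individual rows of every window, a stacked table is still required.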