Hello,
I'm trying to write a script that samples a data table with a 'sliding window': it takes a fixed number of consecutive rows (the window 'width'), then moves on by one row and takes the next set, giving rows 1-10, 2-11, 3-12 and so forth. The result is a new data set that I can use for pattern analysis.
My script works, but it doesn't scale to larger data tables: the analysis is slow (>1 h) or JMP crashes outright. Are there any suggestions for how to improve it?
Thanks!
// Define the window size (number of rows) and the increment (how much the standardised time should increase by)
windowSize = 9;
increment = 91;
// Open the data table, pad it with windowSize empty rows, and get the number of rows
dt = Open("$SAMPLE_DATA/Time Series/GNP.jmp");
dt << Add Rows(windowSize);
numRows = N Rows(dt);
// Create a list to store the subsets
subsets = {};
standTimeMatrix = J(windowSize, 1, 0); // Pre-allocate matrix for Stand Time
// Suspend display updates on the source table while the windows are built
dt << Begin Data Update;
// Fill the Stand Time matrix
For(j = 1, j <= windowSize, j++,
	standTimeMatrix[j, 1] = 1 + (j - 1) * increment;
);
// Loop through the data table to create the sliding windows
For(i = 1, i <= numRows - windowSize + 1, i++,
	// Get the subset of data for the current window
	subset = dt << Subset(Rows(i :: (i + windowSize - 1)));
	// Add the window number to the subset
	windowNumberColumn = Repeat(i, windowSize);
	subset << New Column("Window Number", Numeric, Continuous, Set Values(windowNumberColumn));
	// Add the "Standardised Time" column to the subset
	subset << New Column("Standardised Time", Numeric, Continuous, Set Values(standTimeMatrix));
	// Add the subset to the list
	Insert Into(subsets, subset);
);
// Resume display updates on the source table (pairs with Begin Data Update above)
dt << End Data Update;
// Create a new table for concatenation
newTable = New Table("All Data");
// Run the concatenations using the subsets list
For(i = 1, i <= N Items(subsets), i++,
	newTable << Concatenate(
		subsets[i],
		Append to first table(1)
	);
	// Close the subset table after concatenation to free up memory
	Close(subsets[i], No Save);
);
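
For comparison, here is a minimal, untested sketch of a matrix-based variant that builds all the window row indices first and creates a single output table, rather than one subset table per window. It rests on some assumptions: every column of interest is numeric (Get As Matrix returns only the numeric columns), the padding step with Add Rows is left out, and the names nWindows, rowIdx, winNum, standTime, data and nc are introduced here purely for illustration. The data columns of the resulting table get default names, so they would still need renaming.

```jsl
// Sketch only: sliding windows via matrix indexing instead of per-window subsets
windowSize = 9;
increment = 91;
dt = Open("$SAMPLE_DATA/Time Series/GNP.jmp");
numRows = N Rows(dt);
nWindows = numRows - windowSize + 1;

// Pre-allocate one entry per (window, position) pair
rowIdx = J(nWindows * windowSize, 1, 0);    // source row for each output row
winNum = J(nWindows * windowSize, 1, 0);    // window number
standTime = J(nWindows * windowSize, 1, 0); // standardised time within the window

k = 0;
For(i = 1, i <= nWindows, i++,
	For(j = 1, j <= windowSize, j++,
		k++;
		rowIdx[k] = i + j - 1;
		winNum[k] = i;
		standTime[k] = 1 + (j - 1) * increment;
	)
);

// Pull the numeric columns once, then pick all window rows in one step
data = dt << Get As Matrix;
newTable = As Table(data[rowIdx, 0] || winNum || standTime);
newTable << Set Name("All Data");

// Name the two appended columns (they are the last two in the table)
nc = N Cols(newTable);
Column(newTable, nc - 1) << Set Name("Window Number");
Column(newTable, nc) << Set Name("Standardised Time");
```

The idea behind this layout is that the cost of opening, modifying and closing a table is paid once instead of once per window; whether that is where the time actually goes in the original script is something I have not measured.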