
JMP Workflow Challenge 1: Motif Extraction and Identification from Continuous Power Data

PowerCurve.png

Calling all JMP workflow and JSL wizards. Here's a Thursday challenge for you. The data set attached to this post is a set of power data from my running power meter. The goal is to identify and correctly tag the 8 motifs that occur in this ~1000-point signal, create a new column with the function ID, and write the solution so that it can automate the tagging of future data that follows the same envelope.

 

The data collection assumptions for this data set and the future are:

  • Prior to the start of a motif the power is 0.
  • There is a period before and after the motifs that will be non-zero (warm-up and warm-down). This is not part of the system under test and should not be tagged.
  • Each motif has an attack, sustain, decay and release. The shape of the function is identical from motif to motif. It goes from 0 to some peak over a period of samples and then ramps down to the baseline noise floor (walking power). The motif ID is over when the power reads 0 again.
  • There will be an unknown number of functions to identify in the future. There are 8 in the data set. The solution needs to be invariant to that.
  • The overall length of the functions as well as the peak can vary.
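
To make the assumptions above concrete, here is a minimal JSL sketch of the simplest tagging rule they imply: number each contiguous run of non-zero power, then drop the first and last runs as warm-up and warm-down. The toy `watts` vector and the "first and last run" rule are assumptions for illustration, not part of the attached data set, and the sketch does not handle zero-valued dropouts inside a motif.

```jsl
Names Default To Here( 1 );

// Hypothetical toy signal (column vector): zeros separate runs of non-zero power.
watts = [120, 110, 0, 0, 300, 500, 90, 0, 0, 250, 450, 80, 0, 130, 125];

n = N Rows( watts );
runID = J( n, 1, . );   // missing means "not in any run"
run = 0;

// Number every contiguous run of non-zero samples.
For( i = 1, i <= n, i++,
	If( watts[i] > 0,
		// A run starts at the first sample or right after a zero.
		If( i == 1, run++, If( watts[i - 1] == 0, run++ ) );
		runID[i] = run;
	)
);
nRuns = run;

// Assumption: the first run is the warm-up and the last run is the warm-down.
motifID = J( n, 1, . );
For( i = 1, i <= n, i++,
	If( !Is Missing( runID[i] ),
		If( runID[i] > 1 & runID[i] < nRuns,
			motifID[i] = runID[i] - 1
		)
	)
);

Show( motifID );
```

On this toy vector, rows 5-7 are tagged as motif 1 and rows 10-12 as motif 2; the number of motifs is never hard-coded, so the rule is invariant to how many functions appear.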

We are working on the left edge of the analytic workflow today (shown in green below). The data comes from a file (.csv) and eventually, once we automate this workflow, from a folder of .csv files. The Data Access step is already in use; today's task is to perform the Data Blending and Cleaning steps on this example file so the data is ready when the workflow is expanded to other analytic capabilities.

DataWorkflowChallenge.png
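
For the "folder of .csv files" case mentioned above, a hedged JSL sketch of the Data Access step might look like the following. The folder path is hypothetical, and the sketch only opens the files; the tagging logic would then be applied to each table in `tables`.

```jsl
Names Default To Here( 1 );

// Hypothetical folder; replace with the location of your exported .csv files.
folder = "$DOCUMENTS/power_csv/";

files = Files In Directory( folder );
tables = {};

// Open every .csv file in the folder and keep a reference to each table.
For( i = 1, i <= N Items( files ), i++,
	If( Ends With( Lowercase( files[i] ), ".csv" ),
		dt = Open( folder || files[i], Invisible );
		Insert Into( tables, dt );
	)
);

Show( N Items( tables ) );
```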

Leave your solutions in the comments.

 

 

Connect with me on LinkedIn: https://bit.ly/3MWgiXt
26 REPLIES

Re: JMP Workflow Challenge 1: Motif Extraction and Identification from Continuous Power Data

Fantastic, @ngambles. This works great, and thanks for posting the link to the video as well, which describes how to get the access token. The data are coming in as I'd like to see them. Plus, this is a nice improvement over manually exporting the files and combining them into a .csv file. It really sets the stage for automating this data pull and getting the data cleaned up to be analyzed and published to JMP Live on a regular basis. And yes, the watts are coming in for me through the API from your script.

StravaDataPull.png

I pasted in @brady_brady's formulas from above, and they still don't capture all the intervals. However, thanks to @ngambles's Strava API code, we have another interesting feature in the data set to explore and use: the smoothed velocity column. I think this can be useful. You also now have the unique activity IDs to work with as another feature (knowing that I will only run between 6 and 10 sprints in any given activity). I think we should throw the ML model at it again, @Byron_JMP, now that we have a bigger training data set and a couple of additional features to build the model.

I've attached the updated training data set.

Strava Moving Velocity.png

And @Jordan_Hiller I think you should try too! You are a data cleanup pro.

ngambles
Level III

Re: JMP Workflow Challenge 1: Motif Extraction and Identification from Continuous Power Data

I'm glad to hear that the watts are coming in for you through the API.

 

I have made a few more updates to the Strava API code that handles automatically getting refreshed access codes, and then I rolled that up into a generic add-in that I've attached.

 

The first time you run the add-in, you will be prompted to enter your Strava API info,

ngambles_0-1649646105005.png

which gets saved in your add-in folder so you do not have to enter it again.

 

The add-in returns a data table containing all of the available activities from Strava.

ngambles_1-1649646536256.png

 

This add-in should further streamline the process of getting data from the Strava API, not only for this project but for others who also want to explore their data from Strava.

 

ngambles
Level III

Re: JMP Workflow Challenge 1: Motif Extraction and Identification from Continuous Power Data

I have been thinking about my initial approach to this challenge and thought I could do it better so I have made a second attempt.

 

This time, I have defined several target shapes and then searched through the data for regions where the shapes match reasonably well. My defined target shapes are:

ngambles_2-1649647745230.png

 

This method requires fewer tuning parameters, and allows for more flexibility by allowing additional target shapes to be added easily. 
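
As a rough illustration of the shape-matching idea, the sketch below slides one target shape along a signal and scores each window by mean squared error; a window counts as a match when the score falls under a threshold. The toy vectors and the `shapeMSE` helper are hypothetical. Note that this sketch divides by the shape length, whereas the posted script divides by one less than the shape length and compares against `thresh = 12500`.

```jsl
Names Default To Here( 1 );

// Hypothetical helper: mean squared error between a target shape and the
// window of the signal that starts at row `start`.
shapeMSE = Function( {signal, start, shape},
	{len, sse, m},
	len = N Rows( shape );
	sse = 0;
	For( m = 1, m <= len, m++,
		sse += (signal[start + m - 1] - shape[m]) ^ 2
	);
	sse / len;
);

// Hypothetical toy data (column vectors).
signal = [0, 0, 100, 300, 500, 200, 90, 0, 0];
shape = [110, 290, 510, 190, 80];

// Slide the shape across the signal and keep the best-scoring start row.
best = .;
bestStart = .;
For( s = 1, s <= N Rows( signal ) - N Rows( shape ) + 1, s++,
	e = shapeMSE( signal, s, shape );
	If(
		Is Missing( best ), best = e; bestStart = s,
		e < best, best = e; bestStart = s
	);
);

Show( bestStart, best );
```

Here the best window starts at row 3, where the signal and the shape differ by only 10 watts per sample. Adding another target shape only means running the same scan with a second vector and taking the smaller error, which is why the approach extends easily.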

 

I have added a column (watts modified) that performs a linear interpolation across points where the watts value drops out to zero during a run.
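
The gap filling behind the watts modified column (the script's own comment calls it linear interpolation of zero values) can be sketched on a plain vector as follows. The toy values are hypothetical; the idea is to find the nearest non-zero sample on each side of a dropout and draw a straight line between them.

```jsl
Names Default To Here( 1 );

// Hypothetical toy run: rows 3 and 4 are dropouts (zero readings).
watts = [200, 260, 0, 0, 380, 400];
fixed = watts;
n = N Rows( watts );

For( i = 1, i <= n, i++,
	If( watts[i] == 0,
		// Nearest non-zero row before (rb) and after (ra) the dropout.
		rb = 0;
		For( k = i - 1, k >= 1, k--,
			If( watts[k] != 0, rb = k; Break() )
		);
		ra = 0;
		For( k = i + 1, k <= n, k++,
			If( watts[k] != 0, ra = k; Break() )
		);
		// Interpolate only when both neighbours exist.
		If( rb > 0 & ra > 0,
			fixed[i] = watts[rb] + (watts[ra] - watts[rb]) / (ra - rb) * (i - rb)
		);
	)
);

Show( fixed );
```

With these values the slope between rows 2 and 5 is 40 watts per row, so the two dropouts are filled with 300 and 340.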

 

I have also added a column (Row State Formula) that automatically hides and excludes data points that are not associated with a motif ID.

 

In my opinion, this is an improvement over my initial solution to the challenge.

 

This method identifies 41 runs in the "2022_03_28 Strava Sprinting Training Data Set 5 Runs.jmp" data set.

ngambles_3-1649648614234.png

 

 

The JSL code to process the Strava data is:

names default to here(1);

// uncomment the following two lines if you have installed the Strava - Get Activities add-in and want to automatically download the Strava data prior to identifying motifs
// include("$ADDIN_HOME(com.caes.ngambles.strava-get-activities)/main.jsl");
// wait(0);
dt = current data table();

// Find Regions that match target shapes
dt << New Column( "Shape Match",
	Numeric,
	nominal,
	Formula(
		thresh = 12500;
		shapeList = {
			{221, 245, 289, 327, 374, 431, 477, 518, 535, 518, 491, 451, 401, 350, 304, 238, 178, 161, 156, 106, 96, 96, 87, 85, 81, 79, 76, 74, 73, 72, 73, 72, 72, 72, 72, 72, 71, 70, 70, 70},
			{306, 321, 347, 384, 429, 476, 515, 546, 579, 582, 576, 542, 489, 442, 375, 348, 295, 275, 242, 185, 152, 121, 106, 98, 93, 89, 89, 85, 84, 83, 82, 82, 81, 80, 79, 79, 78, 78, 78, 78},
			{357, 390, 401, 428, 475, 505, 558, 604, 619, 634, 634, 625, 581, 530, 483, 444, 378, 357, 324, 282, 214, 186, 162, 153, 137, 132, 124, 109, 104, 97, 94, 90, 90, 89, 88, 87, 87, 85, 85, 86}
		};
		minErr = Empty();
		For( k = 1, k <= N Items( shapeList ), k++,
			err = 0;
			For( n = 0, n < N Items( shapeList[k] ), n++,
				For( m = 1, m <= N Items( shapeList[k] ), m++,
					err += (:watts[(Row() + m) - n - 1] - shapeList[k][m]) ^ 2
				);
				err /= N Items( shapeList[k] ) - 1;
				minErr = Minimum( minErr, err );
			);
		);
		If( minErr <= thresh,
			flag = 1,
			flag = 0
		);
	)
);

// Run Region - helps clean up regions found by shape match
dt << New Column( "Run Region",
	Numeric,
	"Continuous",
	Formula(
		regionCount = Summation( k = 0, 5, Lag( :Shape Match, k ) );
		If( regionCount < 6 & regionCount > 0 & :Shape Match == 1,
			nearStart = 1,
			nearStart = 0
		);
		If( nearStart == 0,
			ans = :Shape Match;
			,
			ans = 0;
			If( :watts[row()] > 0 & :watts[row() +1] > 0,
				ans = :Shape Match;
			)
		);
		ans;
	)
);

// motif ID
dt << New Column( "Motif ID",
	Numeric,
	Nominal,
	Formula(
		mID = 0;
		For( k = 1, k <= Row(), k++,
			If( :Run Region[k - 1] == 0 & :Run Region[k] == 1,
				mID++
			)
		);
		If( :Run Region == 1, mID );
	)
);

// row state formula column
dt << New Column("Row State Formula",
	Row State,
	Row State,
	Formula(
		If(
			Is Missing(:Motif ID),
			Row State() = Combine States(Hidden State(1), Excluded State(1)),
			Row State()
		)
	)
);

// linear interpolation of zero values within motif ID
dt << New Column( "watts modified",
	Numeric,
	Continuous,
	Formula(
		If( Is Missing( :Motif ID ),
			Empty(),
			If( :watts > 0,
				:watts,
				rb = Row();
				ra = Row();
				For( k = Row(), k >= 1, k--,
					If( :watts[k] != 0,
						rb = k;
						Break();
					)
				);
				For( k = Row(), k <= N Rows(), k++,
					If( :watts[k] != 0,
						ra = k;
						Break();
					)
				);
				deltaPower = :watts[rb] - :watts[ra];
				deltaTime = rb - ra;
				:watts[rb] + (deltaPower / deltaTime) * (Row() - rb);
			)
		)
	)
);

// rank timestamp, Motif ID - x axis values to help graph each motif
dt << New Column( "X",
	Numeric,
	Continuous,
	Formula( Col Rank( :timestamp, :Motif ID ) )
);

 

Re: JMP Workflow Challenge 1: Motif Extraction and Identification from Continuous Power Data

Is there any reason this wouldn't work on macOS? I just tried to load it, and it looks like you may have created it for Windows only.

ngambles
Level III

Re: JMP Workflow Challenge 1: Motif Extraction and Identification from Continuous Power Data

I am unable to test it on a Mac, so my default choice was Windows only.  Attached is an add-in that allows both Windows and Mac. I'm not aware of a reason it shouldn't work, but it's untested.

 

Please let me know if it runs successfully on a Mac.

Re: JMP Workflow Challenge 1: Motif Extraction and Identification from Continuous Power Data

It does work on the Mac. However, if you enter the refresh token incorrectly, the add-in will refuse to load and you have to uninstall and reinstall it. Is this pulling all recent activities? It might be good to have a search string to pull in just those with the title of interest (e.g., "sprint").

ngambles
Level III

Re: JMP Workflow Challenge 1: Motif Extraction and Identification from Continuous Power Data

Glad to hear that the add-in works on Mac.

 

You're right, I forgot to add the filter for "sprint" back in after making the add-in as generic as possible. 

 

The following code:

  • uses the add-in to get fresh data from Strava
  • filters rows to include only "sprint" names
  • filters columns to match what was posted in the "2022_03_28 Strava Sprinting Training Data Set 5 Runs" data set
  • identifies motifs
  • adds script to data table to chart the results

 

names default to here(1);

// if add-in is found, get fresh data from Strava, otherwise just use current data table
// forward slashes in the path work on both Windows and macOS
if( file exists("$ADDIN_HOME(com.caes.ngambles.strava-get-activities)/main.jsl"),
	include("$ADDIN_HOME(com.caes.ngambles.strava-get-activities)/main.jsl");
	wait(0);
);

// get reference to current data table
dt = current data table();

// Clean up data set for specific motif extraction challenge
if( contains( dt << get column names(string), "name"),
	r = dt << get rows where(
		!contains(lower case(:name), "sprint")
	);
	try(dt << delete rows(r));
);
colList = {
	"name",
	"altitude",
	"distance",
	"grade_smooth",
	"lat",
	"lng"
};
for( k = 1, k <= n items(colList), k++,
	try(dt << delete columns( colList[k] ));
);

// Find Regions that match target shapes
dt << New Column( "Shape Match",
	Numeric,
	nominal,
	Formula(
		thresh = 12500;
		shapeList = {
			{221, 245, 289, 327, 374, 431, 477, 518, 535, 518, 491, 451, 401, 350, 304, 238, 178, 161, 156, 106,  96,  96,  87,  85,  81,  79,  76,  74,  73, 72, 73, 72, 72, 72, 72, 72, 71, 70, 70, 70}, 
			{306, 321, 347, 384, 429, 476, 515, 546, 579, 582, 576, 542, 489, 442, 375, 348, 295, 275, 242, 185, 152, 121, 106,  98,  93,  89,  89,  85,  84, 83, 82, 82, 81, 80, 79, 79, 78, 78, 78, 78}, 
			{357, 390, 401, 428, 475, 505, 558, 604, 619, 634, 634, 625, 581, 530, 483, 444, 378, 357, 324, 282, 214, 186, 162, 153, 137, 132, 124, 109, 104, 97, 94, 90, 90, 89, 88, 87, 87, 85, 85, 86}		
		};
		minErr = Empty();
		For( k = 1, k <= N Items( shapeList ), k++,
			err = 0;
			For( n = 0, n < N Items( shapeList[k] ), n++,
				For( m = 1, m <= N Items( shapeList[k] ), m++,
					err += (:watts[(Row() + m) - n - 1] - shapeList[k][m]) ^ 2
				);
				err /= N Items( shapeList[k] ) - 1;
				minErr = Minimum( minErr, err );
			);
		);
		If( minErr <= thresh,
			flag = 1,
			flag = 0
		);
		flag;
	)
);

// Run Region
dt << New Column( "Run Region",
	Numeric,
	"Continuous",
	Formula(
		regionCount = Summation( k = 0, 5, Lag( :Shape Match, k ) );
		If( regionCount < 6 & regionCount > 0 & :Shape Match == 1,
			nearStart = 1,
			nearStart = 0
		);
		If( nearStart == 0,
			ans = :Shape Match;
			,
			ans = 0;
			If( :watts[row()] > 0 & :watts[row() +1] > 0,
				ans = :Shape Match;
			)
		);
		ans;
	)
);

// motif ID
dt << New Column( "Motif ID",
	Numeric,
	Nominal,
	Formula( 
		mID = 0;
		For( k = 1, k <= Row(), k++,
			If( :Run Region[k - 1] == 0 & :Run Region[k] == 1,
				mID++		
			)
		);
		If( :Run Region == 1, mID );	
	)
);

// row state formula column
dt << New Column("Row State Formula", 
	Row State, 
	Row State, 
	Formula(
		If(
			Is Missing(:Motif ID), 
			Row State() = Combine States(Hidden State(1), Excluded State(1)),
			Row State()		
		)		
	)		
);

// linear interpolation of zero values within motif ID
dt << New Column( "watts modified",
	Numeric,
	Continuous,
	Formula(
		If( Is Missing( :Motif ID ),
			Empty(),
			If( :watts > 0,
				:watts,
				rb = Row();
				ra = Row();
				For( k = Row(), k >= 1, k--,
					If( :watts[k] != 0,
						rb = k;
						Break();
					)
				);
				For( k = Row(), k <= N Rows(), k++,
					If( :watts[k] != 0,
						ra = k;
						Break();
					)
				);
				deltaPower = :watts[rb] - :watts[ra];
				deltaTime = rb - ra;
				:watts[rb] + (deltaPower / deltaTime) * (Row() - rb); 
			)
		)
	)
);

// rank timestamp, Motif ID - x axis values to help graph each motif
dt << New Column( "X",
	Numeric,
	Continuous,
	Formula( Col Rank( :timestamp, :Motif ID ) )
);

// graph results script added to data table
dt << new script(
	"Graph Results",
	names default to here(1);
	dt = current data table();
	
	nw = new window(
		"Motif Extraction Results",
		h list box(
			gb1 = dt << Graph Builder(
				Show Control Panel( 0 ),
				Show Legend( 0 ),
				Show Footer( 0 ),
				Variables(
					X( :Name( "X" ) ),
					Y( :watts modified ),
					Wrap( :Motif ID )
				),
				Elements( Line( X, Y, Legend( 6 ) ), Points( X, Y, Legend( 7 ) ) )
			)
			,
			gb2 = dt << Graph Builder(
				Show Control Panel( 0 ),
				Show Footer( 0 ),
				Variables( X( :X ), Y( :watts modified ), Overlay( :Motif ID ) ),
				Elements(
					Smoother( X, Y, Legend( 5 ), Lambda( 0.0014 ) ),
					Points( X, Y, Legend( 6 ) )
				)
			)
		)
	);
);

// run script to make graphs
// dt << run script("Graph Results");