
Streamlining Statistical Analysis with Custom JSL Scripts - (2023-US-30MP-1488)

At Intel, the use of JMP and JSL has become integral to the organization’s data-driven decision-making infrastructure. To improve data exploration, visualization, process engineering, and quality control capabilities (and to standardize statistical analysis) our team created the "Stats" package. The Stats package comprises multiple customized JSL scripts that support statistical standards, and the output generated from these scripts provides standardized, agreed-upon reports. Since its creation, the Stats package has been used globally by thousands of individuals across our engineering organizations, saving countless hours in standard data analysis and reporting.

 

In this presentation, the Stats package development team shares a small portion of the package, specifically relating to scripts that leverage native JMP quality and process engineering platforms to create fully integrated offline reports. The presentation showcases the design process of JSL scripts, starting with the desired statistical standard and continuing through the development of user-friendly script execution windows. We illustrate the extensive data blending, statistical analysis, and data visualizations that constitute the final output reports.

 

The team shares insights into the benefits of using custom JSL scripts for streamlining statistical analysis, as well as the challenges and best practices in developing these scripts. The presentation also demonstrates the effectiveness of the Stats package in improving the efficiency and accuracy of statistical analysis across our engineering teams.

 

 

All  right. Hi,  Welcome.

Thanks  for  joining  this  online  session

of the 2023 Americas JMP Discovery Summit.

I'm  presenting  on  behalf  of  myself and  my  team  member,  Prince  Shiva.

Both  of  us  are  data scientists  here  with  Intel.

Today,  our  talk is  streamlining  statistical  analysis

with  custom JSL  scripts,

and  we'll  focus  on  how  at  Intel  we  develop

these custom analysis scripts using the JMP Scripting Language, JSL.

Okay, this is our agenda for the next 25-30 minutes or so.

I  should  have  time  for  Q&A at  the  end  of  the  session.

I  shouldn't  go  that  far  over  time.

Hopefully  stay  within  the  30  minutes.

Normally, I  would  stop  and  ask  for  questions,

but since this is a prerecorded session, I've been informed

that  I'll  actually  be able  to  answer  questions  in  the  chat.

If  you  have  any  questions,  just  feel free  to  just  type  them  into  the  chat.

Maybe  put  a  slide  number inside  of  there  as  well.

The  slide  numbers  are  on  every  slide, and  I  can  just  directly  answer

those  questions  in  chat,  or  you  can  write them  down  and  ask  them  live  afterwards.

There  is  a  callout  on  the  bottom

of  the  slide  here  that  all  of  the  code and  data  tables  that  we're  going  to  be

sharing  today,  we're  going  to  be walking  through  a  live  demonstration.

All  of  those  are  available on  the  conference  or  summit  website.

Go  ahead,  go  download  those,

open  them  now  and  get  them  ready  to  go, get  your  JMP  open.

We  have  a  couple  of  introductory

things  that  we're  going  to  be going  through  to  start  here.

Take  that  time  to  get  yourself  settled

so  that  you  can  walk  through  this  code with  me  and  see  where  it  is.

But  cool. This  is  our  agenda.

We're  going  to  be  going  through  some  quick

presenter introductions and a background on our team.

We'll  do  some  background  on  motivation,

on  why  we  like  to  do  these  custom JSL scripts  and  what  value  they  have.

We'll  go  through  a  really  high- level

overall  analysis  flow  for  these  custom scripts,  and  then  the  meat

of  the  presentation  is  going  to  be this  one  sample  analysis  demonstration.

We're  actually going  to  show  a  really  simple

custom  analysis  script for  one  sample  analysis,

and  we'll  go  step  by  step through  each  of  those  different  sections

to  make  that  custom  script.

We'll  finish  with  some  conclusions, recommendations  and  final  thoughts.

All right. Presenter introductions here.

I'm  here  with  Prince  Shiva.

You  can't  see  him  in  this  camera  here, but  he's  here  with  me.

He's  a  data  scientist  here  at  Intel.

His  research  interests are  in  process  control

system  development  for  manufacturing,

as  well  as  operational  cost  optimization through  different  data  science  methods.

He's  been  with  Intel for  the  past  four  years,

and he has about the same amount of experience with JMP as with JSL;

he started working with JMP when he came to Intel.

My  name  is  Logan  Mathesen. I'm  also  a  data  scientist  here  at  Intel.

My  research  interests  are  in  Bayesian and  Black box  optimization,

statistical  surrogate  modeling and  design  and  analysis  of  experiments.

I've  been  with  Intel for  the  past  two  years.

I've been coding with JSL since just before I came to Intel,

and I've been working with JMP for the last six years or so.

I've  been  really  lucky,

so I know the value of those nice point-and-click, user-friendly kinds of interfaces

that  JMP  has  that  makes statistical  analysis  so  nice.

Just a little bit about the team that Prince and I are on together.

Our team here at Intel

is the statistical lead for Intel manufacturing.

We  are  responsible  for  the  experimental design  and  analysis  of  different

experiments  that  happen within  our  modules  at  Intel.

We  also  do  statistical  method development  and  maintenance.

Any  new  statistical  methods that  we  want  to  employ,

or  different  methods that  we  need  to  maintain,

as  well  as  doing  all  of  the  statistical training  for  the  engineers  here  at  Intel.

We'll  train  them  on  the  basic  concepts

of statistics, as well as how to interact with our custom scripts

to do that analysis for them.

Like I said, we do have a custom JSL repository.

We  proliferate, own  and  maintain  that  repository.

It  has  over  150  different JSL  scripts

for  automated, streamlined  analysis  and  reporting.

These  scripts  are  really  nice because  they  embed  internal

best- known  methods,

directly  into  the  analysis.

Decisions  that  our  team  has  made about  the  right  way  to  do  statistics,

we've  embedded all  of  that  decision- making,

directly into  these  custom  analysis  scripts,

and  that  means  that  they  are reproducible,  repeatable  across  the  world.

Actually,  everyone  in  Intel  manufacturing

is  using  these  scripts for  their  statistical  analysis.

Okay.

Again,  just  to  give  some  more motivation  and  background  for  the  value

of  these  custom  analysis  scripts, these  automated  analysis,

they  do  improve  data  exploration, visualization  and  analysis,

as  well  as  standardizing  all of  those  types  of  activities.

It's  always  the  same  kind  of exploration,  visualization  and  analysis.

It's  really  helpful  for  common analysis  activities .

Engineers  are  a  lot  of  times  doing

the  same  sort  of  activities when  they're  talking  about  analysis.

Maybe  they're  establishing and  monitoring  process  control  charts.

Maybe  they're  qualifying  newly  installed tooling  to  make  sure  that  that  tooling

is  performing  the  way  that  we  would expect  it  to,  to  have  a  qualified  tool.

Maybe  they're  doing  product  qualification to  make  sure  that  we  can  actually  produce

a  product  of  quality on  that  different  tooling.

Maybe  it's  metrology  qualification, making  sure  that  our  metrology  are  taking

accurate  and  reliable  measurements, or  maybe  it's  some  sort  of  yield  analysis.

But  as  you  can  see,

these  are  all  very  common engineering  activities  that  get  done,

hundreds  of  times  a  week  across the  world  here  at  Intel.

All of these things can be automated

and be in a nice standard report format.

For  me,  and  I  know  Prince, maybe  he  feels  the  same  way.

Selfishly,  I  really  love  these  scripts because  we  do  review

all  of  that  statistical  analysis that  comes  through.

As  the  statistical  lead, that's  what  our  team  is  responsible  for.

For  me,  selfishly, it's  really  great  because  I  always

seem  to  see  the  same  analysis.

I  know  the  way  that  it's  supposed  to  be

analyzed,  and  I'm  able  to  see  that  same exact  report  coming  out  every  time.

It  makes for  a  very  efficient  analysis  review,

as  well  as  analysis  generation  there.

Over  here  on  the  right- hand  side,

you're  going  to  be  seeing this  automated  analysis  output.

It's just a nice block diagram to show

the components that go into these custom analyses.

Right up top, we'll sort of do a bottom

line up front: we'll have a nice summary table that has those custom

statistical decisions baked into it.

If  you  only  have  one  thing  and  you  only need  to  review  one  specific  thing,

you're  just  going to  look  at  that  summary  table,

with  that  final  decision that  comes  from  our  best- known  methods.

Then  there's  a  lot  of  supporting information  underneath  that  to  help  give

a  more  full  picture of  the  analysis  for  a  deeper  dive.

Of  course,  we're  going  to  include different  visualizations,  plots,

summary  statistics, and  then  we'll  see  in  a  second.

But this is really the heavy lifter behind these custom JSL scripts:

the native JMP platforms that we're used to.

Specifically,  we're  going  to  be  looking

at  a  distribution  platform  in  a  second to  help  us  with  our  analysis.

But  this  is  really  the  backbone of  all  of  these  scripts .

This  is  where  the  heavy  lifting  gets  done.

Okay.

Let's  go  into  a  really  high- level, general  analysis  flow.

On  the  left- hand  side, this  is  what  the  user  is  going  to  be

viewing  or  experiencing, as  they're  using  these  custom JSL  scripts.

They're  of  course  going  to  load some  data  table  of  interest.

We're  hoping  that  they  have some  data  if  they're  trying  to  do  a  data

analysis  here,  and  then  they're  going to  go  navigate  to  the  custom  analysis

script  that  they  need  to  run, they'll  go  ahead  and  execute  that.

That's then going to pop up an input GUI for them to interact with.

This input GUI is actually going to be almost

identical to the regular JMP input that we see in

the distribution platform: "Hey, you want to do this type of analysis?

Tell me what fields in your data table, what columns, go where."

For  this  kind  of  analysis  after  they  enter in  all  of  that  input GUI  information,

they're  then  going  to  wait  as  the  script manipulates  and  analyzes  that  data,

and  then  it's  going  to  present them  with  a  final  output  report GUI.

What  we  like  to  do  with  our  analysis,

the  more  complex  ones, is  that  we'll  often  have  some  extra

interactivity  that  can  be  done inside  of  that  final  report.

The  engineer  can  do any  final  tweaking  that  they  want,

complete  their  final  analysis, and  then  they  have  that  exact  report

ready  to  go  that  they  can  share with  anybody  to  share  this  analysis.

Now, underlying that is us as the developers.

What  does  this  look  like to  get  a  custom  script  done?

First  off, we  need  to  generate  that  input GUI.

We  then  copy  data  tables  because  we  never

want  to  be  manipulating  an  original data  table  that  a  user  has.

That's  a  great  way to  break  someone's  trust

and  make  them  not  want  to  use a  script  is  by  destroying  their  data.

Always  make  that  copy.

We then execute those standard JMP analyses

that I was talking about, those native platforms.

We  store any  critical  variables  out  of  those,

into  some  code  variables so  that  we  can  reference  them  later.

In  other  portions  of  the  report,

we  go  ahead  and  create  any  visualizations through  maybe   Graph Builder  or a  similar

platform  in  JMP, and  then  we  create  that  sort  of  final

analysis  summary  table or  that  decision  table.

Then   we  present  the  user with  that  final  output  report .

We  generate  that  final  report  for  them.
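For anyone following along, the developer flow above can be sketched as a high-level JSL skeleton. This is just an illustrative outline, not the actual package code; every name here (dlg, dtCopy, dbox, meanVal) is a placeholder.

```jsl
// High-level skeleton of the developer flow (illustrative names, not the package code)
dlg    = Column Dialog( response = Col List( "Response", Min Col( 1 ) ) ); // 1. input GUI
dt     = Current Data Table();
dtCopy = dt << Subset( All Rows, Output Table Name( "Analysis Copy" ) );   // 2. copy the table
dbox   = dtCopy << Distribution(                                           // 3. native platform
	Continuous Distribution( Column( Eval( dlg["response"][1] ) ) )
);
meanVal = Report( dbox )["Summary Statistics"][Number Col Box( 1 )] << Get( 1 ); // 4. store values
New Window( "Analysis Report",                                             // 5. final output report
	V List Box( Text Box( "Mean: " || Char( meanVal ) ) )
);
```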

Again,  all  sort  of  background  here.

The  rest  of  the  presentation is  going  to  be  us  going  live

through  both  of  these  flows .

Seeing what  the  underlying  script  looks  like,

and  then  also  seeing  what  the  actual  user

is  going  to  be  experiencing as  they're  going  through  this.

All  right.

Here's  our  basic analysis  demonstration.

Again,  this  is   going  to  be from  the  position  of  a  beginner,

from  a  JMP  beginner.

One  of  the  things  that  we're  going

to  be  doing  in  the  spirit  of  a  simpler context  for  these  education  purposes,

is  that  we're  only  going  to  be covering  a  simple  one- sample  analysis.

If  you  have  a  set  of  data,

is  the  mean  of  that  set  of  data equal  to  a  given  value.

Again,  all  of  this JSL  code and  all  of  these  data  tables

that we're going to be showing are available online for you there.

Let's  go  ahead  and  jump  into  it.

First  things  first.

We  have  a  data  table  here, with  16  different  entries.

Let  me  find  my  mouse. There  it  is.

Awesome.

We  have  these  16  different data  table  entries.

We're  going  to  be  interested in  this  parameter  here,  thickness .

We  have  some  process  parameter  thickness

and  we're  going  to  say,  "Is  the  average thickness  equal  to  one  micrometer?"

That  is  the  statistical  question  for  this analysis  demonstration  that  we  have  here.

Over  here  on  the  right- hand  side,  again,

we  would  imagine  that  the  user would  have  some  sort  of  data  table  open.

But  if  the  user  decided  to  run  this  custom script  without  that  data  table  open,

here's  just  an  example  of  some  code that  would  check  to  see  if  a  data  table

was  open,  and  if  not,  it  would  allow the  user  to  open  up  a  data  table.

Otherwise,  it's  going  to  say,  "Yes,  this is  the  data  table  that  I  want  to  look  at.

Let's  start  my  analysis."
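A check like the one described might look something like this minimal sketch (the real code is in the download; this is just the idea):

```jsl
// If no data table is open yet, let the user pick one; otherwise use the current table
If( N Table() == 0,
	dt = Open(),                  // pops a file-open dialog for the user
	dt = Current Data Table()     // "yes, this is the data table I want to look at"
);
```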

One good thing that we should do anytime we get any sort of data is

to just make some sort of visualization, get our hands on it.

Here's  just  a  little  visualization of  this  thickness,

by  this  data  table  entry  here.

One  through  16,  that's  going to  be  on  our  x- axis  there.

All  right. Let's  jump  over  to  JMP  again.

Hopefully,  you  have  your  JMP  open if  you  would  like  to  follow  along,

at  least  hopefully  you  have your  JSL code  up  and  going.

I'm  going  to  open up  just  my  JMP  home  window  here.

You'll  see  that  I  already  have the  data  table  open,

and  I  already  have  the JSL  script open  and  ready  to  go.

I'm  not  going  to  go through  opening  them  here.

The  other  thing  that's important  is  this  log.

We  are  going  to  be  talking  about  this  log.

This  is  sort  of  your  best  friend,

as  you're  developing any  sort  of  these  scripts

to  make  sure  that  everything is  running  appropriately.

Let's  go  ahead  and  open  up  all of  these  here  and  let's  take  a  look.

On  the  left- hand  side,  we  do  have that  actual JSL  script  inside  of  there.

We  have  a  nice  header, we  have  some  log  lines  inside  of  here.

If  you  highlight  something  and  hit  this run  button,  it  will  run  just  that  portion.

If  nothing  is  highlighted and  you  hit  the  run  button,

it's  going  to  compile and  run  the  whole  script.

Just  be  careful  with  that.

Again,  inside  of  these  scripts,  we're going  to  have  a  lot  of  these  dividers.

Prince and I have really done our best to do some really thorough commenting

and  some  really  thorough  dividing  inside of  here  to  make  it  easy  for  anybody

to  pick  up  and  read  this  and  hopefully jump  on  their  own  custom  scripts  here.

But  again,  everything  up  here,

just  printing  some  log  lines, making  sure  this  stuff  is  going  well,

clearing  variables,  opening  up  tables if  we  need  to,  so  on  and  so  forth.

This  is  really  just  some  initial  workspace

cleanup  kind  of  things that  we're  going  to  do,

so  let's  highlight  all of  that  and  hit  Run.

We'll  see  out  here  in  our  log that  yes,  indeed,

that  completed successfully  inside  of  there.

Let's  go  ahead  and  flip back  over  to  our  slides.

We're  primed  and  ready  to  go, our  workspace  is  ready  to  go.

Again, the first thing that we need to do as the script developers is

present that primary input GUI to our user.

This  is  what  it's  going  to  look like  on  the  left -hand  side,

and  again,  it  should  look  very  familiar to  a  standard  JMP  input  window.

On the right side, this is sort of the meat and the primary

way that we get that GUI going; it's going to be this column dialog box.

You'll see we have a title,

and we have this response list, which is going to be our response variable, response.

This  is  going  to  be  the  variable  name for  us  moving  forward  so  that  we  can

recognize  what  the  user entered  into  this  field.

We  can  see that  this  is  a  required  numeric.

That's  because  minimum  column is  one  and  data  type  is  numeric.

We  have  this  run ID  list, which  is  going  to  be  our  order  number.

What  order  were  these  measurements  taken?

This  is  going  to  be  critical for  our  visualization.

This  will  be  the   x-axis on  our  visualization,

and  then  of  course, we  have  sort  of  an  alpha  box.

This  is  going  to  be  the  alpha- level for  our  significance  testing,

for  saying  whether  or  not  our  mean is  equal  to  our  hypothesized  mean.

It'll  default  at  0.05, but  the  user  can  always  change  that

as  is  called  out  in  the  bottom right- hand  corner  here.

There  are  some  other  elements in  this  section  of  the  code.

We'll  look  at  it  really  briefly as  we  go  through  it  right  now.

But  that's  the  high  level  of  what else  is  done  inside  of  there  in  words.
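The Column Dialog described above can be sketched roughly like this. It's a hedged reconstruction with illustrative variable names (response, runID, alpha), not the exact code from the download:

```jsl
dlg = Column Dialog(
	response = Col List( "Response", Min Col( 1 ), Data Type( "Numeric" ) ), // required numeric
	runID    = Col List( "Order Number", Max Col( 1 ) ),                     // x-axis for the plot
	H List( "Alpha level:", alpha = Edit Number( 0.05 ) )                    // defaults to 0.05
);
// Column Dialog returns a list; pull out what the user entered
response = dlg["response"];
runID    = dlg["runID"];
alpha    = dlg["alpha"];
```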

Let's  go  ahead  and  flip  back  over

to  JMP  and  let's  take  a  look at  this  primary  user  input  window.

Again,  it's  going  to  be  this  next

divider  which  starts  up  here, and  goes  down  to  about  there.

We'll  start  from  here.

Again, we have some log lines, and then like I said…

Sorry, let me grab that comment line or I'll get an error.

We  have  a  log  line,

and  then  like  I  said,  this  is  the  meat and  bones  of  that  primary  input GUI.

Let's  go  ahead  and  hit run  on  that  and  we'll  see.

Here  it  is.

We  have  these  tables  open  because  we're

looking  at  this  specific summit  data  table .

We  have  these  columns  available.

Thickness will be our response variable;

measurement number will go in that order number slot.

We're actually saying, "Hey,

something might have happened to this data table, it got sorted or something,

but this was the actual measurement order that these things were taken in."

We're going to put that as our order number variable inside of here,

and  we're  going  to  go  ahead and  click  okay  and  we'll  see

that  everything  went  through  fine.

Like I  said,  after  this,

there's  some  other  error  checking, some  buttons  down  here.

We'll  see  that  there's  a  lot of  print  lines  to  make  sure.

"Hey, is that input window working the way we think it is?

Are we storing the variables

in the way that we thought we were storing them?"

This  is  just   a  developer kind  of  check  for  us  inside  of  here.

Let's  go  ahead  and  run  that, and  we  can  see  that,  yes,  indeed,

our  response  is  thickness,

and  that  run ID is  that  measurement  number.

And  alpha  was  unchanged  at  0.05.

We  do  have  some  error  checking.

We'll  get  to  that  in  a  couple  of  slides where  we'll  talk  about  all  of  that.

Just  one  quick  note  down  here.

That Cancel button that we saw inside of our…

there we go. This Cancel button that's over here.

We  have  the  OK  button and  we  have  a  Cancel  button

if  the  user  ends  up  selecting…

Sorry. Let  me  go  back  here.

Computer,  work  with  me. There  we  go.

If  we  end  up  hitting  that  Cancel button,  what  happens?

Well, JSL actually doesn't have anything pre-coded in.

This  is  us  putting  in  a  condition that  if  Cancel  is  entered,

we're  going  to  go  ahead and  throw  this  and  kill  the  script.

Let's  go  ahead  and  run.

These  last  sections  here, and  then  we'll  flip  back  over

to  the  slides  and  that's  how  we're going  to  run  our  primary  input GUI .

That's  simple. That's  all  it  is  there.

Pretty  straightforward  to  get  such a  nice  interface  inside  of  here .

Less  than  100  lines of  code  inside  of  there.

Cool.

Next  up,  we're  going  to  talk about  creating  that  copy  data  table.

Again, we never want to corrupt our user's data table.

On  the  right- hand  side,

we're  seeing  the  code  for  how  to  create that  copy  really  well  commented.

Every  single  line  has  a  nice  comment to  tell  you  exactly  what's  happening

inside  of  there,  even  if  you're not  familiar  with  JMP  or  JSL.

We'll  go  ahead  and  scroll  down and  we're  just  going  to  run  all  of  this.

You'll  notice that  right  now  in  the  top  right,

we  have  the  summit  data  table, that's  going  to  turn  into  a  copy .

That's  the  original  currently.

But  when  I  run  this,  we're  now  going to  open  up  a  copy  of  this  table.

There's  this  copy  data  table.

We'll  pop  it  back  up  in  this  corner and  you'll  notice  that  now  this  script

is  operating  over  this  copy  data  table .

We  are  no  longer  doing  anything on  that  original  data  table.

Any  manipulations  we  do  is  on  that  copy. We're  good  to  go.

We're  set. It's  clean  there.
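A minimal sketch of that copy step, assuming dt references the user's table (names illustrative):

```jsl
// Subset with All Rows makes a full working copy; everything after this point
// operates on dtCopy, never on the user's original data.
dtCopy = dt << Subset(
	All Rows,
	Output Table Name( "Copy of " || (dt << Get Name) )
);
Current Data Table( dtCopy );   // point subsequent analysis at the copy
```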

Let's  go  ahead  and  flip  back  over to  the  slides  and  we'll  move  forward.

After  we've  given  that  primary  input GUI,

oftentimes  we  do  need  some secondary  or  even  tertiary  input GUI.

The user has provided us some initial

information about how they would like their data to be analyzed.

Now come some follow-up questions

based on that.

For  us,  for  this  one  sample  analysis,

again,  we're  looking at  our  thickness  variable.

This  is  what  it's  going  to  look  like, that  secondary  input  window .

We know which variable we're targeting and want to analyze, but what is the target

value that we want to compare it against?

You'll  even  notice  that  in  the  title

of  this  that  we're  already  calling out  the  value  for  thickness .

This script is already starting to be smart, and it's already starting to be

adaptive for us,

listening to what the user said in the primary input GUI

and proliferating that into that secondary input GUI.

It really just makes it clean for users as they step through these

when we get to more complex kinds of scripts.

Inside  of  here  in  the  middle, we  see  this  is…

Again,  the  main  code  to  generate that  secondary  input  window.

There  are  a  couple  of  other

functionalities  inside  of  the  code that  will  walk  through  a  little  bit.

Just  one  note,  secondary  input  windows,

they're  not  necessary, but  they  are  nice  to  have.

Of  course,  if  you  have  too  many  tertiary input  windows,  it'll  slow  us  down  and  it's

too  many  clicks,

but  a  lot  of  times  it's  nice to  have  some  flexibility  and  some

adaptive  script  logic  to  actually make  for  a  better  user  experience.

If  you  go  overboard  with  it,  of  course it'll  make  it  a  worse  user  experience.

But  sort  of  finesse  is  key when  you're  designing

user  experience  for  these  custom  scripts,

because  they  need to  be  usable  at  the  end  of  the  day.
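An adaptive secondary window of the kind described might be sketched like this; responseName is assumed to hold the column name captured by the primary GUI, so the title adapts to it (all names illustrative):

```jsl
targetWin = New Window( "Enter target value for " || responseName,
	<< Modal,                                   // blocks until the user responds
	Text Box( "Hypothesized mean for " || responseName ),
	targetBox = Number Edit Box( . ),           // empty until the user types a target
	H List Box(
		Button Box( "OK", target = targetBox << Get ),
		Button Box( "Cancel", Throw( "Analysis cancelled." ) )
	)
);
```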

All  right,  great. Let's  jump  back  over  to  JMP.

We'll  look  at  this.

This  next  section,  we're  talking about  the  secondary  user  input GUI.

Again,  we're  just  going  to  start…

we're  going  to  create a  variable  for  targets.

Maybe  we  have  multiple  parameters that  we're  studying  all  at  once.

Here's  another  error  check for  a  missing  target.

Then like I said, here's that big heavy lifter

for  that  secondary  input  window is  all  of  that  code  there.

Then  we'll  just  run these  last  little  bits.

Again,  these  are  just  pulling

information  out  of  that secondary  input  window.

We'll  go  ahead and  run  all  of  that  together,

and  we'll  see that  here's  this  target  value.

Again,  it's  already  recognized that  it's  for  thickness.

We  said  at  the  beginning  that  we  want to  know  all  of  these  thickness  values.

Are  they  equal  to  a  value of  one  micrometer  on  average?

Is  that  the  mean  value  there?

We'll  go  there, and  we'll  go  ahead  and  hit  OK,

and  we'll  see  that  everything went  through  okay.

No  errors  inside  of  there.

That's  all  of  the  inputs that  we  need  from  the  user  at  this  point.

The  next  thing  that  the  user would  see  is  nothing .

They  would  sit  and  wait  maybe  for  a  couple

of  seconds,  maybe  for  10 seconds if  it's  a  really  heavy  script.

But at this point, it's all of the actual

analysis and the report generation that need to happen.

Before  we  jump  into  that, let's  jump  into  the  different  error

checking  that  we've  exemplified inside  of  our  script  for  you  here.

Inside  of  this  primary  input GUI, we  do  have  this  error  check.

You  can  see  the  code  numbers.

Essentially, it's just saying, "Hey, let's make sure that our alpha

significance level is between zero and one."

If it's outside of zero and one, it's going to throw this dialog box here,

where it's going to tell you what went wrong.

This  error  checking  is  a  nice

example  of  inline  error checking  for  us  there.

We  have  a  different type  of  error  checking.

We  give  you  a  second  kind,

which  is  going  to  be this  function- based  checking.

When  we're  talking about  this  secondary  input  window,

we  do  have  this  missing  target  expression.

This is an expression in JSL;

other scripting languages call these functions.

But  again,  this  is  just  a  nice  way  for  us to  also  just  call  this  expression  to  say,

actually,  was  there  a  missing target  inside  of  there?

If  the  user  hits  okay  with  an  empty

target  value,  you're  going to  get  out  this  big  box  here.
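Both error-checking styles just described can be sketched like so (dialog text and names are illustrative, not the package's exact wording):

```jsl
// Inline check: validate alpha right where it is read.
If( alpha <= 0 | alpha >= 1,
	New Window( "Input Error", << Modal,
		Text Box( "Alpha must be strictly between 0 and 1." ) )
);

// Expression-based check: JSL's analog of a function, defined once, run anywhere.
missingTargetCheck = Expr(
	If( Is Missing( target ),
		New Window( "Input Error", << Modal,
			Text Box( "Please enter a target value before clicking OK." ) )
	)
);
missingTargetCheck;   // referencing the name evaluates the stored expression
```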

Okay,  awesome.

Those  are  examples  of  error  checking.

Let's jump into the actual analysis then.

Like  I  said  at  the  beginning, the  heavy  lifter  for  all  of  these  custom

scripts  is  always  going  to  be  relying upon  these  JMP  native  platforms .

Those have all of that quality already built into them.

It's a lot of risk mitigation that we didn't do something wrong when we coded

that statistical analysis, and also we know that it's the most accurate

statistical analysis that is available, that quality inside of there.

For  this  example,

we're  going  to  be  focusing on  a  distribution  platform  here.

This  is  just  the  standard  JMP native  distribution  platform  here.

The  nice  part  about  JSL  and  these  native platforms  is  that  you  can  directly

interact  with  these native  platforms  through JSL.

On  the  next  slide, we'll  show  some  tips  and  tricks,

for  how  you  can  actually  interact,

and  pull just  the  specific  values  that  you  want.

There's  a  lot  of  good  information  that's presented  on  these  different  JMP  native

platforms,

but  oftentimes  there's  just  a  couple of  key  elements  that  we  really

need  to  show, to  report  out  to  different  engineers.

All  right.

Let's jump over to the code then, and let's go ahead and run this part.

This  next  divider  is  actually  just

going  to  be  all  of  the  actual analysis  grouped  together.

We'll  just  go  through portion  by  portion  here.

This  is  just  creating some  container  variables.

We'll  talk  about  that  in  a  second.

But let's go ahead and run that, and we see that that was all okay.

Let's open up the log inside of there.

Yep, everything is okay.

Now  this  is  the  actual distribution  platform .

This  is  us  creating  that  distribution.

This V List Box is going to send it to our output box.

That's  going  to  prepare  us for  our  final  report  generation.

But  if  we  just  want  to  inspect  this  while we're  doing  some  development,

if  we  run  the  code  from  here  up  to  here, but  do  not  include  the  comma.

If  you  do  not  include the  comma  and  you  hit  run,

we'll  see  that  we  actually  get  out our  nice  distribution  platform  here.

We've  done  some  nice  things.

We've  added  the  target value  inside  of  here.

You  can  see  that  we're  already  testing for  the  mean  and  the  hypothesis  value,

is  that  target  value  that  we're  interested in,  we  get  some  nice  summary  statistics

mean  standard  deviation, so  on  and  so  forth  inside  of  there.

But  that's  the  way  that  you  could  always

create  the  same  standardized distribution  report.

Oftentimes different people will have different JMP preferences, but

because we've specified each element of this platform,

it's always going to generate the exact

same distribution platform coming out of there.
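The native-platform call behind this part can be sketched as follows; dbox mirrors the name used in the talk, while the column and variable names are illustrative. Test Mean bakes the hypothesis test against the user's target directly into the platform:

```jsl
dbox = dtCopy << Distribution(
	Continuous Distribution(
		Column( :Thickness ),
		Test Mean( target )   // hypothesized mean from the secondary input GUI
	)
);
```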

Okay,  so  that's  the  distribution  platform.

Now  let's  see,  how  do  we  actually  interact with  this  distribution  platform  to  create

a  nice  custom  script that's  going  to  be  over  here?

It's  a  little  bit  scary the  first  time  you  look  at  it,

but  you  end  up  finding  out that  this  properties  functionality

that's  built  directly  into  JMP is  going  to  be  our  best  friend.

Ultimately  right  now  what  we're  showing is  how  can  I  pull  those  summary  statistics

that  I  want  to  display  in  my  bottom line  up  front  summary  table?

How  do  we  pull  those  statistics  directly out  of  that  distribution  platform?

All  of  that  calculation was  already  done  for  me.

How  do  I  then  report  it  somewhere  else?

It's going to be from this properties functionality.

For  us,  we'll  see  that  we're  interested in  the  P  value  of  that  statistical  test.

We  want  the  mean  of  our  data  set, standard  deviation  of  our  data  set,

and  the  lower and  upper  confidence  intervals.

We'll  see  that  we  then  are  going  to  insert

all  of  those  values into  those  containers  inside  of  here.

Let's  take  a  look at  our  distribution  platform,

and  see  how  we  can  use this  show  properties  function.

We're  on  our  distribution  platform.

If you go to the summary statistics, I want to pull out this mean value.

How  do  I  know  the  code  to  pull that  out  and  interact  with  it?

We're going to right-click, and we're going to go to Show Properties.

Once you're in Show Properties, you can click on this box path right here.

This  box  path,  this  is  now  the  exact  code that  you  can  use  to  reference  any

of  the  numbers  inside  of  this blue  highlighted  box.

You'll  see  that  these  are  the  same items  that  are  shown  over  here.

This  is  the  mean  value, the  standard  deviation  value,

lower and  upper  confidence  intervals  there.

You'll see, you can sort of see it

on the bottom right here, that it says this value, Get(1).

This is for the mean, and it wants to return that first value out of it.

We would add Get(1) next to this box path to get the mean.

You'll notice that this says Report(Platform) here.

If we look back over here,

it says Report(dbox). Now why do we say dbox there?

Well, dbox is the specific name that we gave our distribution platform.

Right?

We're  saying  refer  to  this  platform that  we  just  created  and  pull  out

those  specific  values  and  store  them into  these  container  variables.

That's  exactly  what's  happening in  all  of  this  segment  of  code.
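In JSL, that segment of code might look roughly like this sketch (the variable names and the box paths are assumptions for illustration; confirm the exact paths with Show Properties in your own report, since they can shift between JMP versions):

```jsl
// Illustrative sketch: pull summary statistics and the test p-value
// out of the Distribution platform's report layer into containers.
rep = Report( dbox );

// Summary Statistics outline: the Get() positions are hypothetical
meanVal = rep["Summary Statistics"][Number Col Box( 1 )] << Get( 1 );
stdVal  = rep["Summary Statistics"][Number Col Box( 1 )] << Get( 2 );
upperCI = rep["Summary Statistics"][Number Col Box( 1 )] << Get( 4 );
lowerCI = rep["Summary Statistics"][Number Col Box( 1 )] << Get( 5 );

// p-value from the Test Mean outline (path assumed for illustration)
pVal = rep["Test Mean"][Number Col Box( 2 )] << Get( 2 );
```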

Let's  go  ahead and  flip  over  to  our JSL  custom  script,

and  let's  run  this  next  portion.

Actually,  sorry.

I need to close out of my distribution platform.

Otherwise it may cause problems, with a couple of distribution platforms

all contending at the same time.

We're  going  to  run  all  of  this  section and  we're  also  going  to  get  up  to  here

where  we're  going  to  pull  out those  summary  statistics.

We  hit  run  and  we  see  great  everything went  through  just  fine  there.

That's  how  we  actually  are  going

to  interact  with  those  heavy lifter  native  JMP  platforms.

Again,  rely  upon  the  stuff  that's  already built  and  you  can  already  trust,

and  then we'll  build  further  from  there.

The  next  thing  that  we're  going  to  show is,  well,  how  do  we  create

that  summary  table?

I  just  showed  you  how  I  can  pull  out these  mean  values,

the  standard  deviation  value, these  confidence  intervals

that  we're  leveraging  that  distribution platform  over  here  on  the  right.

This  is  just  how  we  can  create this  summary  table.

What  you  see  on  the  left  is  exactly generated  by  this  code  on  the  right  here.

You can see that we're already looking at these targets.

These are all of those other containers that we already initialized previously…

Just  to  remind  us where  these  values  came  from,

it  looks  something  like  this .

We're  pulling  out  these  different  values out  of  the  distribution  platform.

You'll  notice  again  that  we  already  have

this  hypothesized  mean of  our  target  of  one.

That's,  of  course, coming  from  that  secondary

input  window  of  one  there.

We're  going  to  go  ahead  and  drop that  target  of  one  there.

The  other  important  thing on  this  summary  table,

like I said, is this nice custom decision-making:

we can put whatever logic we want to put inside of here.

It's kind of silly for this one-sample analysis example.

But overall, it really is one of these things where…

This is where you, as a company,

as a profession, get to apply your own expert opinion

about how decisions should be made.

You  can  look  at  the  statistics  and  say,

"No,  actually  this  is  how  we  would  like to  make  decisions,  and  want  to  put

that  right  up  front so  that  it's  immediately  clear

to anybody who opens up this report

how we analyzed and what decision we came to.

Let's  go  ahead  and  just  run  that  code.

We're  not  going  to  be  able  to  see

the  portion  like  we  did with  the  distribution  box.

We're  only  going  to  be  able  to  see

the  summary  table  when  we  do that  final  output  report.

But  you'll  notice  here's  that  custom decision- making  right  here.

To decide whether it's equal or not equal,

we're going to look at the p-value.

Of course, that's kind of silly, but the point stands for more complex reports.
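A sketch of that decision rule and summary table in JSL (the 0.05 cutoff, the variable names, and the Table Box layout are illustrative assumptions, not the team's agreed standard):

```jsl
// Illustrative sketch: encode the agreed-upon decision rule up front.
// pVal, meanVal, stdVal, and target are assumed to hold values pulled
// from the Distribution platform earlier in the script.
alpha = 0.05; // assumed significance level
decision = If( pVal < alpha,
	"Mean is NOT equal to target",
	"No evidence mean differs from target"
);

summaryTable = Table Box(
	String Col Box( "Decision", {decision} ),
	Number Col Box( "Target", {target} ),
	Number Col Box( "Mean", {meanVal} ),
	Number Col Box( "Std Dev", {stdVal} ),
	Number Col Box( "p-Value", {pVal} )
);
```

The point of putting the logic in one place like this is that the bottom-line-up-front verdict always comes from the same rule, no matter who runs the script.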

We'll  go  ahead  and  run  that  portion of  the  script  and  we'll  see.

Let's pull up our log.

Yep, it seems like no errors are coming out of the log inside of there.

Let's  flip  back  over  here.

The  last  component.

We've  talked  about  native  platforms, we've  talked  about  summary  tables.

Now  we  need  to  talk  about  visualization.

Again,  visualizations.

The reason why we have them

is that they're immediate and transparent data quality checks.

It's  something  that  anybody  can  look

at  and  they  can  immediately draw  some  value  out  of  it.

The  way  that  I  found  the  most  value  out

of  these  visualizations  and  these  custom reports,  is  not  necessarily  for  me,

it's  really  for  the  other  reviewers who  are  the  module  experts .

The  process  experts  can  quickly  look

at data and they'll say, "Yeah, that looks weird.

That's not how that process behaves."

Or they'll look at it and they'll say,

"Yeah, that makes sense; that's how that process behaves."

But  these  visualizations  give  a  lot  more than  just  the  pure  statistics,

especially  when  you're  talking  to  somebody

who's  not  a  statistics professional  or  statistics  expert.

Again,  visualizations,  they're  great.

They  allow  for  proper

checking,  for  data  corruption, as  well  as  analysis  corruption.

If  you  see something  weird  in  your  visualization,

you  should  not  trust the  analysis  that's  associated  to  that.

On  the  slide  right  now, it's  just  an  example  of  how  we  can  turn

our  data  table  into  a  nice refined  visualization  over  here.

We've  even  added  that  target  line  inside of  there  that  the  user  defined  for  us.

Next  slide  is  a  word  of  caution about  how  we  use  these  scripts.

These  visualizations,  again, they  should  highlight  these  data  concerns,

but  the  user  needs to  know  how  to  use  them.

I  said  that  this  order  number is  what  determines

the   x-axis  on  our  visualization.

If the user enters the data table entry order, because they say, "That's the order;

that's what it is in the table,"

They'll  get something  that  looks  like  this .

This  is  what  we've  been  looking  at together  so  far  through  this  presentation.

But again, I told you the more accurate representation

is this measurement number.

Something  happened  to  this  data  table to  get  it  sorted  in  a  different  order.

If we plot this visual based off of measurement number,

we're  going to  get  something  that  looks  like  this.

Everybody  here  should  notice

this  immediately  as  a  red  flag that  something  is  wrong.

We  should  never  have  data that's  trending  in  this  manner.

Either  there  was  something  wrong with  the  process  or  there  was  something

wrong  with  how  we  were  measuring the  data  with  our  metrology  then.

But  we  shouldn't  really  be  trusting the  results  of  this  analysis.

When  we  see  a  visual  like  this.

We  need  to  go  and  recollect  the  data, figure  out  what  went  wrong  there.

Again,  just  a  word  of  caution that  if  you  want  to  use  this,

you  need  to  teach  your  engineers the  right  way  to  use  it  as  well.

Just  for  us  to  say,  hey,  "How  do  we  create these  nice,  beautiful  visualizations?"

We  like  to  use  the   Graph Builder  platform. It's  a  wonderful  platform  that  JMP  offers.

It's  super  intuitive  and  easy  to  use.

You  can  make  a  beautiful, wonderful  display  here  and  you  say,

"Yes,  this  is  exactly how  I  want  to  display  my  data."

Then  you  can  use  this  platform to  automatically  generate  your JSL  code

by  clicking  on  the  little red  triangle  up  here.

Going to Save Script > To Script Window.

You'll  get  out  a  set  of  code that  looks  something  like  this.

The  one  word  of  caution is  that  of  course,  these  variables

are  going  to  be  hard coded  inside  of  here,

so  you're  just  going  to  have  to  update that  so  that  it  interacts  nicely

with  your  user  input  so  that  it  adapts,

to  whatever  your  user inputted  into  that GUI  there.

But  these  are  all  of  the  elements then  that  go  into  the  final  report

and  this  is  what  that  final report  looks  like.

Again,  pretty  straightforward.

We  just  say  create  a  new  output  window.

We're  actually  going  to  make this  a  tab  box  here.

We  only  have  one  tab  called  Report, but  in  our  more  complex  reports,

we'll actually have sometimes up to, like, 10 or 12

different tabs inside of there, all with different information.

But  we  have  this  summary  table…

Again,  we  already  created that  summary  table.

Let's  put  it  there.

We  have  this  nice  graphical  plot that  we'll  put  over  here  and  then  we  have

that  nice  distribution  platform and  we'll  put  that  inside  of  there.

We  have  the  overall  takeaways right  up  top,

and  then  we  have  all of  the  supporting  evidence  underneath  it.
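Putting the pieces together might look like this sketch (the window title and box arrangement are assumptions; summaryTable, gb, and dbox stand for the containers built earlier in the talk):

```jsl
// Illustrative sketch: assemble the final interactive report window.
// The takeaways (summary table) sit up top; the supporting graph and
// distribution report sit underneath.
New Window( "One-Sample Analysis Report", // hypothetical title
	Tab Box(
		"Report",
		V List Box(
			summaryTable,
			H List Box(
				gb << Report,   // Graph Builder report layer
				Report( dbox )  // Distribution report layer
			)
		)
	)
);
```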

Let's  flip  over  to  JMP  and  I  know I'm  just  slightly  over  time  here,

so  we'll  finish  up  quick.

Back  over  to  JMP  and  we  will  run that  final  portion  of  the  code  here.

We're  going  to  run both  the   Graph Builder.

Let's  build  that  graph and  send  it  over  to  the  report,

and  then  let's generate  the  final  report  here.

We  go  ahead  and  hit  Run and  there's  that  platform.

The  nice  thing about  these  custom  analysis  scripts

and  again,  it's  just  a  nice  thing about  JMP,  in  general,

is  that  all  of  these  reports are  going  to  be  interactive .

Even  though  this  was  a  custom  report,

this platform is still connected to this platform down here,

and these platforms are still going to be connected overall.

Remember that we're working with the copied data table now.

Nothing gets corrupted, but it's still going to be connected over here.

You  can  select  different  points inside  of  here  and  figure  out,

well, what measurements are those corresponding to inside of there?

That's  overall, these  custom  analysis  reports,

what  they  look  like, how  we  can  make  them.

Again,  it's  just  a  simple  case, but  let's  move  forward  here,

into  some  overall conclusions  and  insights.

Final  takeaways.

At  Intel,  these  scripts have  really  become  a  critical

component  of  our data- driven  decision- making .

It  makes  things  so  efficient  and  so  fast

and  so  repeatable  and  standardized that  it's  wonderful.

Again,  these  are  all sort  of  the  same  ideas.

The  only  thing  to  add is  that  it  does  also  allow  you  to  embed

that  custom  decision- making

for  your  company's  specific  best known  methods  for  your  specific  processes

and  analysis  that  you're going  to  be  doing.

A  quick  note  to  make  again,

it's  the  caveat  that  we said  about  that  graphing  here.

We  do  need  to  know  that  there's  going to  be  some  teaching  resources  that  we  need

to  invest  into  this.

When  we  proliferate  these  scripts,

we  can't  just  give  them  to  the  engineers and  say,   "Go  do  some  analysis."

We  need  to  tell  them,

"This is how we intended to do analysis with these specific scripts."

In that same vein of thinking here,

we  do  have  this  custom decision- making  infrastructure.

It's  going  to  require  maintenance.

There's  going  to  be  bugs,  there's  going to  be  corner  cases  that  you  didn't  know.

Prince  and  I  have  run  into  plenty of  these  cases  where  an  engineer

comes  to  us  and  says, "This  isn't  working,"

and  we  say, " That's  weird.  Let's  look  at  this ."

We  have  to  spend  some  time  debugging inside  of  there,

especially,  when  your  company wants  to   step  to  a  newer  version  of  JMP.

Here at Intel, we just stepped to the newest version of JMP,

16 or 17, one of those,

and we had to go back through all 150 of those scripts

and make sure that they were still compatible with the new version.

Again, there's a lot of infrastructure maintenance that you should be aware of

that's going to come into play.

Especially  when  you  really  start

to  proliferate  this  and  make this  a  large  repository.

Again, we should also be treating this as a living infrastructure, though.

It  changes  and  that's  a  good  thing.

That's  why  we  have  the  power as  the  custom  analysis  script  owner,

that  we  can  change  things  inside  of  there

and  we  can  do  it  immediately  and  quickly and  we  can  be  really  agile  about  that.

Users,  they  might  be  hesitant  initially.

They're  going  to  learn  to  love  this, they're  going  to  really  adopt  it,

and  they're  going  to  start  to  do some  strange  things  with  these  scripts.

They  say,  "Hey,  I  love  this  analysis. What  if  I  did  this?"

They're going to start using them in new, nonstandard ways.

You  shouldn't  get  mad  at  them, these  are  actually  opportunities .

If  an  engineer  is  using  the  script in  a  nonstandard  way,

that  means  that  there's  some  functionality

gap  that  they  wish  they  could  have that  would  make  their  job  easier .

We  should  take  that  input, and  we  can  revamp  our  scripts,

we can change the functionality inside of there,

and  we  can  roll  all  of  those  inputs from  the  engineers,

into  these  custom  scripts  immediately,

and  we  can  start  providing more  value  to  our  engineers.

Okay,  so  I'm  going  to  end  it  here. I  know  I'm  a  little  bit  over  time.

Kirsten, sorry about that. I'll say thank you here.

Here's  mine  and  Prince's  emails.

Feel  free  to  reach  out  to  us if  you  have  any  questions

or  you  want  to  ask  anything.

Thank  you.