Hello everyone. My name is Steve Hampton.
I work at Precision Castparts.
I'm the process control manager there and I'm here today
to talk about unleashing your productivity with JMP customization.
I live with my family in Vancouver, Washington.
I have been in castings my entire career.
For the last 15, I have been with PCC, which is investment castings,
and I am a self-proclaimed stat nerd.
I think this little Post-it note
briefly describes a lot of the conversations
I have with my wife,
where she just gives me a very strange look as I try to explain
why I'm not watching TV and am instead playing around in JMP on a Saturday,
because I have a tasty beer to go along with it.
So when I'm not nerding out on stats,
and when my thoughts aren't on fun activities outside,
they usually focus on work,
and work is pretty cool.
We make a lot of different products, but the one that I really like to show off
is this six-foot-diameter, one-piece titanium casting.
It's called an engine section stator.
It goes on the Trent XWB engine, which goes on the A350 airplane,
if you're keeping track.
And you can actually see it tucked right behind those main fan blades
in this second picture,
as the first thing that the air will see
before it enters into the core of the engine.
So just a really cool industry to be in,
aerospace and some high-tech investment castings.
So why am I here?
Well, I love JMP,
I love talking to people that love JMP,
and I love talking to people that love stats.
So great to be around like minded people.
And I hate clicking buttons to get things accomplished.
So if you remember back in the day,
there was a little-known Christmas movie called The Grinch, [inaudible 00:02:02],
and there's a scene where he's just saying, "The noise, the noise, the noise,"
and I feel like that's how I am a lot of times when I'm in JMP.
It's the clicks, the clicks, the clicks, they just drive me insane.
And I'm here because I like flexibility, and I think most people do.
So I'd like to share some things that I've done
to increase my flexibility with JMP.
Interesting note: when I thought about this presentation,
I realized this mindset actually started way back in the day.
I remember loving my NES, but with the controller, very quickly
I had the thought, "Well, why can't I have Jump be B and Run be A?"
Because that worked better for me.
And then it only got worse.
The controllers got more buttons, which was great
and added to my need for flexibility,
but it took a long time for them to start to allow us to customize things.
Now they're pretty good.
But before the consoles got pretty good, I really found my benchmark
for interfacing with a program in computer games,
because not only did I have a keyboard with tons of buttons,
so I could immediately cause an action to take place,
but I could remap all of them.
So that's been my baseline;
I compare everything electronic that I interact with to that.
So there is hope, though,
because we have the humble toolbar, or at least you think it's humble,
and scripting,
which everyone who's involved with it knows is incredibly powerful.
Real quick, our first efficiency power-up is the toolbar.
It's like Mario getting a little mushroom there:
he goes from a small little guy to a bigger guy.
The toolbar is pretty limited by default, but it's really easy to change.
Just go turn things on, and you immediately have one-click access
to a lot of actions that you'll probably use constantly
during data manipulation.
So I recommend you keep on Reports, Tables, Tools, Analyze, and Data Table,
which I marked up here in red boxes on the right.
Another tip is that you can actually turn on toolbars
for these windows independently,
and you can move them around independently,
which is really nice if you want to have a custom set for each window.
But be careful: if you move things around too much,
you'll lose some of your efficiency as you hunt for an icon
whose spot has changed from one window to another.
Even better, you can make your own.
If you go into Customize Toolbars and choose New Toolbar, you can build one.
You see all those blue ones are ones that I've made.
I think I've actually now made more toolbars for myself
than come with JMP.
So winner winner, chicken dinner, I guess.
And the black ones are actually ones
that I've added some additional icons to as well.
So I think that combination gets you to Raccoon Mario,
which I thought was pretty neat back in the day.
I always wanted to be Raccoon Mario.
Just some real quick other little things before we get into JMP:
you can link frequently used buttons to a built-in command.
This works really well if you still want to be able to undo
when you click something.
If you link to a script and you run it, you can't undo it,
which is a little bit annoying,
but I usually set my script up as a "Run JSL in this file" link,
not embedding it in the toolbar,
because then I can change the file outside of the toolbar
and update the functionality
without having to dig back into this toolbar menu.
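As a rough JSL sketch, a toolbar button set up as "Run JSL in this file" behaves like a one-line Include(); the file path here is hypothetical:

    // The button itself only holds this line; the real logic lives in the
    // external file, so it can be edited without touching the toolbar.
    Include( "C:/JMP Scripts/my_toolbar_action.jsl" );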
You can use built-in icons.
I use PowerPoint to make a shorthand image of what the icon does,
then save it as a bitmap and upload it.
It's not great.
If anyone has a better way of doing it, I'd love to talk to you at some point.
You can also assign shortcuts, hide built-in toolbar icons
that you aren't interested in,
and add to your standard toolbars.
So scripting in JMP is really super powerful,
and when you combine it with the toolbar,
you get a pretty legendary efficiency team that I think looks like this.
That's how it appears when I think about the combination.
So let's get into JMP.
So I have this data table here, and it's got a lot of columns.
Normally, this would take a fair amount of cleaning up.
The first thing I do is just confirm
that I have the right column types and modeling types.
So I can immediately see these two highlighted columns
that are supposed to be continuous.
You normally could right-click,
go into Column Info, and change things up here.
That already bothers me, because it's too many clicks,
and I can't just change it to continuous here because the column is character-based.
So I made myself a little macro where I can just click
and it's done.
No matter how many columns I select, it's done.
It's essentially a one-click Standardize Attributes, which is great.
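A minimal sketch of a go-to-continuous macro along those lines, assuming the offending columns are selected in the data table:

    dt = Current Data Table();
    cols = dt << Get Selected Columns;
    For( i = 1, i <= N Items( cols ), i++,
        // convert character data to numeric, then mark it continuous
        cols[i] << Data Type( Numeric ) << Modeling Type( "Continuous" )
    );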
You can also see over here
I have this column that's the date and it's messed up.
Once again, I don't want to right-click and dig into submenus.
I just want to click and go.
So I've made myself a To Date function here,
and now I have a date column.
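A sketch of such a To Date conversion for one column; the column name and date format are assumptions:

    // turn a character date column into a real JMP date in one step
    Column( Current Data Table(), "Date" ) << Data Type(
        Numeric,
        Format( "m/d/y" ),
        Input Format( "m/d/y" )
    );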
I also have this Batch column, which is right in that it is technically continuous,
but a lot of the ways I want to use it are more ordinal,
and I want to have both.
So I've made myself a script that just throws out another column
called Batch Nom, which is a nominal version.
And the reason you might want this:
if I select these guys and go into a filter,
maybe I want to filter by dragging,
but if I want to grab just a single batch or a couple of batches,
it's a lot easier to do in the nominal state.
And also, the way it shows up on some of the graphs
can be better in one form or the other.
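The Batch Nom script can be as small as this sketch, assuming the continuous column is called Batch:

    // nominal shadow copy of the continuous Batch column
    Current Data Table() << New Column( "Batch Nom", Character, Nominal,
        Formula( Char( :Batch ) )
    );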
The next thing we can see is that this date is an individual date for each day.
A lot of times we roll things up by weeks.
So I could use the awesome built-in formula columns
and get a year-week column as well,
but that doesn't really mean a lot to most people, because it's not a date.
It's like, "What does week five of 2020 mean?"
So I have built a function where I take the date
and it returns the next Sunday after that date.
Now I have a week-ending column where I can bin things by week;
it's really easy for people to understand,
and it stays continuous, whereas the other way of doing it makes it nominal.
So a lot of advantages to me in that.
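A sketch of that next-Sunday formula; Day Of Week() returns 1 for Sunday, and the Date column name (plus the convention that a Sunday maps to itself) are assumptions:

    // week-ending column: push each date forward to the following Sunday
    Current Data Table() << New Column( "Week Ending", Numeric, Continuous,
        Format( "m/d/y" ),
        Formula( :Date + Mod( 8 - Day Of Week( :Date ), 7 ) * In Days( 1 ) )
    );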
So I just have a lot of things that help me clean up.
The last thing: since this data came from a categorical-to-numeric conversion,
it has some missing values in here.
I know these missings are actually zeros,
because if there isn't any data, it means there wasn't any defect.
So since I do this a lot,
I actually have Recode Missing to Zeros and Recode Zeros to Missing buttons.
So, Recode Missing to Zeros; there we go.
I haven't had to actually go in here, open Recode, and then do more.
Once again, already too much typing.
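The recode-missing-to-zeros idea is only a few lines; this sketch assumes the defect columns are selected first:

    dt = Current Data Table();
    cols = dt << Get Selected Columns;
    For( i = 1, i <= N Items( cols ), i++,
        col = cols[i];
        // a missing defect count really means zero defects here
        For Each Row( If( Is Missing( col[] ), col[] = 0 ) );
    );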
For the data manipulation steps that you do,
adding in some scripting really can make you super effective at data cleanup.
So don't think about scripting just for the analyses
that you run a lot;
think about it in more micro steps to get some efficiency gains.
The next thing is,
I'm going to take us into Graph Builder, and let's bring this up.
And so I spend a ton of time in Graph Builder
because it's one of my favorite platforms.
You really get a feel for your data and it's easy to get people
that maybe aren't as deep into stats to understand what's going on.
So this is probably the main platform I live in.
And as I bring this up, immediately you can see,
"Oh well, since Defect 1 is not in the right condition,
the graph doesn't look great."
But the nice thing is that I don't have to go to the data table.
When I first started,
I hated going back and forth between the analysis and the data table.
Or I'd put them side by side, but then everything gets crunched up.
So the win here was that by learning about the report layer,
and being able to pull out the state of different reports (in this case,
pulling out what is selected in this box),
I can actually select it in the data table.
So now that is selected in the data table,
and I can use my Go To Continuous, and now I'm back in business.
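The general pattern for reaching into the report layer looks like this sketch; which display box you walk to, and what state you can read out of it, depends on the platform:

    dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
    gb = Graph Builder( Variables( X( :weight ), Y( :height ) ) );
    rep = gb << Report;                          // handle on the report layer
    Show( rep[Outline Box( 1 )] << Get Title );  // read state from a display box
    dt << Select Where( :age == 12 );            // then act on the data table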
So I call this staying in the workflow.
I learned about that term from watching an on-demand webinar about formulas,
where they talked about staying in the workflow as far as staying in JMP:
don't go to Excel, do some formulas, and bring them back into JMP;
learn to use formulas in JMP, because its Formula Editor is amazing,
and you're staying in the workflow.
So I'm saying to stay in the workflow
of your analysis window; that's where you want to live.
I don't want to have to go back to the data table.
So I'm going to use a standard toolbar button to put a column switcher on,
and we're going to get all of these...
oh my goodness, all of these columns here.
So we got a column switcher,
and I've also put in another script here
where I can now select from the data table with my column switcher, which is great.
And it opened up another world of using a script that Jordan Hiller
had helped me with when I was just starting down
my scripting path, what we called "nuke 'em."
It's a way of taking data
(this data is not good data; these are not fully completed parts)
and getting rid of it, but without just hiding and excluding.
If I use my little shortcut, Ctrl+Q, that I remapped, it's gone.
That's what I wanted on this slide,
but now I lose all the information on that row.
And I don't want to have to use the Row Editor,
and I don't want to have to use subset with linking.
That's all too much clicking.
So what I have here:
I'm on the right response in the column switcher,
I can select these guys, and I can run my nuke 'em script,
and now those data points are removed.
So very quickly, you can go through with the column switcher and nuke 'em
and remove data that is an outlier
you know shouldn't be in the data, or that is causing problems,
or that is actual bad data that should be out.
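A minimal sketch of a nuke-'em-style script, assuming you pass in the response column currently showing in the column switcher:

    // blank the active response for the selected rows, keeping the rows
    // (and every other column's values) intact
    NukeEm = Function( {colName},
        {Default Local},
        dt = Current Data Table();
        sel = dt << Get Selected Rows;
        col = Column( dt, colName );
        For( i = 1, i <= N Rows( sel ), i++, col[sel[i]] = . );
        dt << Clear Select;
    );
    NukeEm( "Defect 1" );  // hypothetical response name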
And I see a lot of bad data in the form of points
that are out of place in the sequence of time.
So this one's obvious, right?
That's obviously bad data.
It's obvious to me.
So I'm just going to blow it out.
So here's an interesting one.
This was my interesting one at first.
You can see that A has got some really crazy ones here,
and these are all bad data.
So another way you can look at that: I'm going to use...
This toolbar button is actually something you can just select as a standard command;
you can just select this redo function in JMP.
So now I have my new column.
I can take this out and do a box plot,
and I can say, "Okay, cool, here are outliers."
So that's a way to blow things out.
You can see I had a lot of them, but these guys are not outliers.
And really, I'm using "outliers" in place of bad data,
because bad data usually shows up as an outlier.
But these ones were not;
they [inaudible 00:14:40] show up as outliers in the box plot,
but they are not bad data.
So I'm going to nuke out all these guys.
And you can see now,
I don't have anything on the low side that's flagged as an outlier,
but I do know that I still have outliers,
and they're what I'd call outliers in time.
These points are so far away from the other data points
that I know, from my experience and from looking at the [inaudible 00:15:10],
that they are not real data points; they are data that we have jacked up.
So I can go in here, select all these points
that are bad data because of where they are in time,
and get rid of those.
And you will never see that from a standard outlier analysis.
So now I have a very nice looking curve, everything is cleaned up,
and I was able to do that pretty darn fast.
So it's a really powerful tool.
If we go back along here, this is an interesting one.
I can see that I have this outlier right here.
I'm going to nuke it.
But you can see that there is a shift,
and unfortunately I hadn't labeled it as a trial in my data table.
I could use the right-click Name Selection in Column command,
but there are still a lot of steps in there, so I'm just going to select the points.
And I've made myself a binning column:
when I click this, whatever was selected is now binned,
so I can very easily see what's going on.
I can even add in my text box and see the difference in the means.
That's really useful.
I mean, you can bin things as trial versus not trial.
I use good versus bad a lot.
So if my continuous data isn't great because of the measurement system,
but it does an okay job of just saying whether the part is good or bad,
I can bin it with this
and then do an analysis with the pass/fail,
like a logistic analysis.
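A sketch of the binning-column idea; the labels and column name are assumptions:

    dt = Current Data Table();
    sel = dt << Get Selected Rows;
    // everything starts in one bin, then the selected rows get relabeled
    binCol = dt << New Column( "Bin", Character, Nominal );
    For Each Row( binCol[] = "Not Trial" );
    For( i = 1, i <= N Rows( sel ), i++, binCol[sel[i]] = "Trial" );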
So that's great. I also really like the dynamic selection.
So if I were to go back here,
I'm going to take the binning off.
And now I have this Selected column,
which just changes to a one if I select the row.
Now, I can dynamically go through and select different things,
and I can see the mean
[inaudible 00:17:16] just real quickly.
Okay, this grouping right here: its mean is 100, and above it, it's 288.
It's really useful for poking at data.
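The dynamic selected column can be a row-state formula, so selection changes flow straight into any graph or summary that uses it:

    // 1 when the row is selected, 0 otherwise; updates live with the selection
    Current Data Table() << New Column( "Selected", Numeric, Nominal,
        Formula( Selected( Row State() ) )
    );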
Let's say right here, what's going on with this data?
One, I can select it and see what the differences in the means are.
But then two, I can see what the trend would have been like
if this had not happened.
So I can do a little bit of investigation.
And then, I actually use inverse selection a lot,
which is buried in the Rows menu.
So I just have a toolbar button here, and now I can invert it.
Everything basically stays the same,
except that now the bulk of the data is highlighted,
which sometimes makes it easier.
So that's great to use for analysis.
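The inverse-selection button only needs one line behind it:

    Current Data Table() << Invert Row Selection;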
The other thing I have: sometimes you might want to ask,
based on what's selected here, what else is selected?
So I call this my "select other columns" script.
And then we're going to go and ask,
for this little grouping that was different,
what else shared the equipment one level that this grouping used?
And when I click that, you can see that barely any of the rest
of the B product used the equipment one level,
but a lot of item A did, and A is actually higher here.
So if we wanted to possibly not have this higher level,
maybe we need to look at using the same equipment
that the rest of B is using.
A lot of different ways to slice and dice and learn things.
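One way to sketch that "what else shares this level" idea, with Equipment as an assumed column name:

    dt = Current Data Table();
    sel = dt << Get Selected Rows;
    // collect the equipment levels used by the current selection...
    vals = {};
    For( i = 1, i <= N Rows( sel ), i++,
        If( !Contains( vals, :Equipment[sel[i]] ),
            Insert Into( vals, :Equipment[sel[i]] )
        )
    );
    // ...then extend the selection to every row sharing one of those levels
    dt << Select Where( Contains( vals, :Equipment ) );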
The last thing: I have two products here,
but let's say I don't want to look at two products,
so I want to subset.
I'd go in with these subsetting icons,
because once again, I just want to do it in one click,
and I do a lot of subsetting, so it makes sense.
So now, I have this new table.
But what if I want the same graph, already built?
I don't necessarily want to have to rebuild it from scratch,
and there are some other ways to copy and paste scripts over,
but I do this enough that I actually
save the script to the clipboard;
then I can bring this back
and actually run the script from my clipboard.
Hey, now I have a graph,
and it's all built up exactly the same way I had it before.
So this is a really nice way
to keep the efficiency you had from a previous table with a new table.
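A sketch of carrying a graph over to a subset via its script; Get Script is the documented way to capture a platform's current state:

    script = gb << Get Script;          // capture the built-up Graph Builder
    dtSub = Current Data Table() << Subset(
        Selected Rows( 1 ),             // just the selected rows
        Selected Columns( 0 )           // all columns
    );
    Current Data Table( dtSub );        // point the script at the new table
    Eval( script );                     // rebuild the same graph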
Now, you'll see here, I'm going to close this,
and it pops up a window asking,
"What do you want to do with your other windows that are open?"
And then if I were to click what to do with that,
it'll [inaudible 00:20:24] say,
"Hey, you didn't save this; what do you want to do with it?"
And a lot of times I have subset windows open
just because I'm exploring things,
so all the clicking to close things was driving me crazy.
So I actually made myself a little "close everything around that table" script.
If you're in a window, it'll go close the base table,
and it doesn't ask you anything.
So I can do real quick little explorations on little data sets, close them down,
and just stay in the workflow and go fast.
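The close-it-all-down script can be as simple as closing the base table with NoSave, which takes its dependent report windows with it and never prompts; a sketch:

    // close the current table and everything hanging off it, no questions asked
    Close( Current Data Table(), NoSave );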
If I did want to save something, I made this little macro
where it saves the table out under a generic name
to a standard file location.
So I don't have to think about
where I'm going to save it or dive into a bunch of save menus.
If I want to move it later, I can,
but at least I know where all the main things I want to keep are.
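And the generic-save macro is essentially this sketch; the folder path and naming scheme are assumptions:

    dt = Current Data Table();
    // date-stamped generic name into one standard folder
    dt << Save( "C:/JMP Keepers/" || Format( Today(), "yyyy-mm-dd" )
        || " " || (dt << Get Name) || ".jmp" );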
And then if I do change something, say I change something...
Actually, let's say I change something from the graph,
so I'm going to blow out all these guys.
If I wanted to save now, I can't just click Save,
because that would try to save this window.
So I found it really useful to just have a Save Data Table button
that shows up, so I can, once again,
stay in the workflow of the analysis window
and save my base data table.
And once I'm done, I can close and get out of there.
All right.
That's everything I wanted to cover for there.
So let's move on to a real quick example for functional data.
This will be super quick.
For functional data, the one thing I use a bunch is this:
if I have functional data with a timestamp,
you can see that's not super useful if I'm trying to look at all my lots,
because there are big gaps between the times.
I could possibly step through
and see what each shape looks like,
but that's not super fun.
So what I have is this:
I make a counter column, which just uses the cumulative sum function.
I can say what I want to do it by,
and I can add up to four items to subgroup the cumulative sum by.
I'm just going to do pieces,
because that's really the only thing that matters,
and what I get out of it is a counter column
where now everything shows up nicely on one graph.
And this is really good,
but it only works well if the timestamps are pretty comparable.
If the timestamps are all over the place,
then, because it's assuming the time steps are the same,
you have to get a little bit more creative.
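The counter column is one formula; Col Cumulative Sum() restarts within each By group, with Lot standing in for whatever subgrouping column you pick:

    // running piece count within each lot, so lots line up on a common x-axis
    Current Data Table() << New Column( "Counter", Numeric, Continuous,
        Formula( Col Cumulative Sum( 1, :Lot ) )
    );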
Okay.
So back to the presentation.
So we got through all these things, but what I really want to show
as we tail out of here is,
for the ultimate in freedom and efficiency,
you need to use scripts to expand JMP's functionality
to fit your exact needs.
So there are a lot of wants,
and hopefully you're putting them into the wish lists on the community,
but there are a lot of simple ones you can actually take care of yourself.
You can see a nuclear Godzilla up there,
and we all know that a nuclear bomb plus Godzilla makes him king of the monsters.
And it's probably a little-known fact
that JMP plus scripting of functions makes you the king of data analysis.
And I've gotten a lot of value from the Scripting Index,
the two JMP books that are listed here, and the user community,
especially these two guys, who I owe massive amounts of beer
as gratitude for the times they saved my bacon,
and probably thousands of other people's as well.
So let's get into what we're going to do here.
So the first thing is, we'll go back to this table.
If I'm just doing more of an exploratory analysis,
or trying to get an explanatory model versus a predictive model,
I'll use Partition without a validation column.
And this is nice because people that don't have JMP Pro
can use this as well.
And what I do is...
Yeah, we'll just put all this stuff in, that'll be fine.
And we're going to go click Okay.
And now I can actually...
I like to split by LogWorth, so I can actually split by LogWorth,
and it's showing the minimum LogWorth out of this tree.
And so I'll just split until I get below two.
Okay, there's two. Go back, and here's my model.
R squared is 44.9.
Now, whenever counts get low,
I do think that I might be overfitting a little bit,
which is why I like this minimum split size,
so I can prune back.
Let's just say this minimum split size is way too low.
So I'm going to go 15,
and then Okay.
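For reference, a Partition launch like that can be scripted directly; the column names here are assumptions:

    Partition(
        Y( :Defect 1 ),                              // response
        X( :Equipment, :Batch Nom, :Week Ending ),   // candidate factors
        Minimum Size Split( 15 )                     // the prune-back control
    );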
So definitely fewer splits.
R squared is still not too bad, and we can see our main factors
that are contributing to our defects,
these top three.
I really like using Assess Variable Importance,
since it reorders what you're looking at so the main factors
are in the first boxes in the order.
And I love the optimization and desirability.
Once again, you normally have to keep clicking into the red triangle menu to run this.
So I came up with a little macro to control the Profiler.
I can actually come in here and say, "All right, I want to first maximize,"
because it defaults to max,
and I can now remember the settings and we'll call it "max."
Then I can alter the desirability to make it the min,
maximize again, remember settings, and we'll call it "min."
I can copy and paste settings, set to a row,
and I can link profilers, and it's modal.
Or non-modal, I apologize.
So it can just stay up and out of the way when I don't need it,
but yeah, it makes using the profiler,
which is already just super powerful, super efficient as well.
That's one I really like, and I suggest you grab it
when I put these onto my page for this presentation.
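A minimal sketch of driving the Profiler from JSL instead of the red triangle menu; the prediction column name is hypothetical:

    prof = Profiler(
        Y( :Pred Formula Defect 1 )        // hypothetical saved prediction formula
    );
    prof << Desirability Functions( 1 );   // turn desirabilities on
    prof << Maximize Desirability;         // same action as the red triangle command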
Then for the next thing, I've got to go back here.
I'm going to do some neural net stuff,
so I definitely want to make a validation column.
I have these built-in splits that I like,
so it automatically creates it for me.
So now I have my validation column,
and I have a normal random uniform one
in case I want to do any predictor screening.
That helps with looking at cutoff points,
but in this case, we're just looking at neural nets.
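A sketch of a fixed-split validation column for standard JMP; the 75/25 split is an assumed ratio, with 0 meaning training and 1 meaning validation:

    Current Data Table() << New Column( "Validation", Numeric, Nominal,
        // Set Each Value freezes the random draw instead of leaving a live formula
        Set Each Value( If( Random Uniform() < 0.75, 0, 1 ) )
    );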
And where I went from here: I really like Gen Reg
and how it has this model comparison,
and I really like how Bootstrap Forest has a tuning table.
When you're using a neural net, it can be very painful
to feel like you're getting the right model,
because at every step you have to change it, rerun it,
and then look to see what's going on.
And sometimes it just feels like you're spinning your wheels.
So over time, I found some models that I really like,
and so I just built this platform, where I'm going to hit Recall.
Here's everything;
I put down the number of boosts,
and the number of tours is really low just so this runs faster.
And I can go ahead and run this.
Now, for the models that I've put into my tuning table
(ideally, down the road,
I'd like the tuning table to be a little bit more in that first menu,
but it's not there yet),
what I'll get is this nice Pareto
showing my test, my training, my validation, and the different models.
And so I can go through [inaudible 00:28:45] cool.
Which one got me the closest,
without having to run each of these individually?
So I do see that this TanH(10)Linear(10)Boosted(5) looks best;
overall, the average of all the R squares puts it at the highest,
and it looks like everything's pretty close.
So let's just start with this one.
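The guts of a tuning-table loop are just repeated calls to the Neural platform; this sketch hard-codes a few configurations and assumes the column names:

    configs = {{10, 10, 5}, {5, 5, 2}, {3, 3, 1}};  // {TanH, Linear, boosts}
    For( i = 1, i <= N Items( configs ), i++,
        c = configs[i];
        Neural(
            Y( :Defect 1 ),
            X( :Equipment, :Batch Nom ),
            Validation( :Validation ),
            Fit( NTanH( c[1] ), NLinear( c[2] ), NBoost( c[3] ) )
        );
    );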
And the next thing I like to do is actually look at the mins and maxes,
and see: did it actually predict in the range that I was expecting?
So let's see, what did we say?
I said 10, 10, and 5 boosts.
So 10, 10, and boost 5; there we go.
So I'll look at the min and max.
It predicted 5 to 112.
That's good; it didn't predict negative.
That's definitely something I look for in a model of defects or hours,
because you're not supposed to have values below zero
on any of those.
And the defects we had ran 1 to 51.
So yeah, it did okay.
It's predicting on the high side,
so I might go in here and ask:
was there anything else that predicted on the lower side, or closer,
and still had good test values?
So this is a really powerful tool,
because then I can just go into my actual window here,
go down, and this is my model.
And I can save my model out:
I can save this prediction formula out, I can save just this particular neural fit
to my data table, and then use that from here on out.
And it's already got my min/maxes built in here.
Let's save from there.
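Checking the predicted range takes a couple of column statistics; the prediction column name is hypothetical:

    pred = Column( Current Data Table(), "Predicted Defect 1" );
    // did the model stay inside a physically sensible range?
    Show( Col Minimum( pred ), Col Maximum( pred ) );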
I find this to be a very powerful improvement for the neural net platform,
which I already think is pretty powerful.
And then also, if you're just in standard JMP,
the last thing I'll show is that
I started trying to give some additional functionality to standard JMP people.
And so here, you can...
It takes how many initial nodes you have,
the number you want to step the nodes up by,
how many loops you want to go through,
your validation percent,
and whether you want to assess importance; then you click Okay.
And what it does is
run all your models,
and it does the same thing, except...
I had a chance here to work on getting the min/max improved.
So here I can see:
here's my min/max, which is what it was actually predicting,
and here I can see my training and validation.
Ideally, you want them to be as close together and as high as possible,
and then to predict well.
So here I'm looking at TanH(8), which puts me here.
So that's pretty good.
So that's probably the one I would go with.
They're the closest, it doesn't overpredict.
This one actually is predict...
Even though it has a higher training, this one has a higher training,
they're actually predicting negative values
and then this one seems like it's getting over complex.
So that's what I would go with.
It's pretty useful for more standard users to get some more
out of the neural net platform for them.
Finally, let's just go quickly to some dim data stuff.
We have the dim data example of getting specs.
So, dim data example.
The process we follow at our plant
is that we'll get a bunch of data and then calculate
a spec limit from it.
Usually it's either a three or four sigma spec limit,
so a Ppk of 1 or 1.33,
and then we'll present that to the customer.
That could take a long time in the old days, when we would manually run an analysis,
find the best fit, and write it down,
or just use the normal distribution for everything
and then calculate it in Excel.
You have this option in JMP to do process capability,
and you can change it to calculate off of a multiplier.
And that's great, because then you get your specs.
The problem is, you have a lot of them.
Even if you hit the broadcast button, you have to enter that for each one.
So what I did,
definitely with help from a bunch of other people,
because this got above my pay grade in scripting very quickly,
was go in and make this macro where I can say
what I want the sigma number to be,
click Okay, and it goes through
and spits this out for every distribution.
Now I can right-click and make a combined data table.
I have my data table.
Then I can go here, select all the lower and upper spec limits,
use my Subset button, and now
I can submit this to the customer:
here are my upper and lower spec limits for all these things.
I did that in hopefully less than a minute,
and it used to take someone half their day, if not more.
So using scripting to improve what you want to do,
and the functionality and flexibility, is great.
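The core of that multiplier-to-spec macro is computing mean plus or minus k sigma per column and writing it into the Spec Limits column property; a sketch with assumed column names:

    dt = Current Data Table();
    k = 4;                               // sigma multiplier: 3 for Ppk 1, 4 for Ppk 1.33
    cols = {"Defect 1", "Defect 2"};     // hypothetical response list
    For( i = 1, i <= N Items( cols ), i++,
        col = Column( dt, cols[i] );
        m = Col Mean( col );
        s = Col Std Dev( col );
        // Eval/Eval Expr bakes the numbers in, instead of storing references
        Eval( Eval Expr(
            col << Set Property( "Spec Limits",
                {LSL( Expr( m - k * s ) ), USL( Expr( m + k * s ) ), Show Limits( 1 )}
            )
        ) );
    );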
Dim data unstacked table, where is that?
Dim data unstacked table.
Coming into the home stretch,
here we have a bunch of dimensional data organized by parts.
The thing is, some of it is subgrouped
and some of it is [inaudible 00:34:57] data.
By using my subgrouping macro,
I can select all my Ys, say what I want to check,
and it will then tag each one as a subgroup or as an individual.
And that allows me to go in and use my Control Chart Builder.
So I can say these are individuals, these are subgroups,
and I'm going to subgroup by this.
Click Okay, and it takes a little bit to run.
So I have one here, and it will actually put all the mixed
control chart types in one window,
which is really nice, because then I can make a combined table
of all the control limits in one table, which you can't normally do;
you'd have to do a lot more steps
of concatenating individual tables together.
So that's great.
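Scripting Control Chart Builder per column is what makes that mixed-type window possible; a minimal one-chart sketch with assumed column names:

    // giving a Subgroup column makes this an XBar-type chart;
    // leave Subgroup out for individuals
    Control Chart Builder(
        Variables( Subgroup( :Lot ), Y( :Dim 1 ) )
    );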
You can also do the same thing with Process Screening,
where I can put the individuals and IMR here and the XBar stuff here,
and I can output a table
that shows, for mixed subgrouping types (IMR and XBar),
the Ppk,
the out-of-spec rates, and the alarm rates, all in one.
So it's nice to be able to keep everything together
rather than having multiple windows open depending on the subgrouping type.
And finally, the gauge R&R.
A gauge R&R, especially on something like a CMM,
where you can have a lot of codes to do [inaudible 00:36:44] on,
can be a lot of work.
So I made a macro.
The first thing you've got to do to make this work really well
is add in specs.
So I have this little script I made where I can select columns,
and then I can append columns if I need to,
if I forgot one. I'm going to load from a spec table
and click Okay,
and then it will save them to the column properties.
And I can actually use this non-modally,
so I can just keep it off to the side in case I want to change something.
Then I can go in and run my selected-column gauge R&R.
We're not going to go too crazy; I'll just select these guys.
It says, "Hey, you're going to run a gauge R&R on these.
Are you okay with that?"
Click Okay.
We'll say part and operator, and go.
It won't take too long.
And why is this nice?
Because you can see that if I go to connect the means,
it connects really nicely, like you'd expect.
If I were to pull up a traditional gauge R&R,
it gaps, because I don't have data for each hit number;
the hit numbers for different codes are different,
so I'm missing data.
Those hits don't apply to this actual item,
and it makes the charts get all messed up.
But by using my macro,
I can have a local data filter for each item.
And when I select that local data filter, all the things I'm not using go away.
Now the charts look great.
That adds a lot to how those charts look.
All the data down below is the same.
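Adding that per-item local data filter from script is a single message; the platform launch and column names are assumptions:

    vc = Variability Chart( Y( :Measurement ), X( :Operator, :Part ) );
    // local filter so each item's chart drops the hit numbers it never uses
    vc << Local Data Filter( Add Filter( Columns( :Item ) ) );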
Okay, that got us through everything.
So I'm going to move on to some final thoughts.
Okay, final thoughts.
I definitely encourage you to use the toolbar.
Consistent layout, icon use, and naming conventions
are key to your effectiveness.
Get into scripting.
Here are some things I suggest you focus on,
and definitely use the log now that it will record your steps for you.
It saves you a lot of typing.
And really think beyond what JMP currently does,
and see if you can actually add that functionality yourself.
For the developers, I'd like to see you keep moving
toward keeping commands as flat as possible, to get things out of submenus.
And for me, I'm working on getting better at making icons,
learning how to reference and pull data from the analysis window
(which is called the report layer), and always including a Recall button.
So there are some statistical jokes for you, some of my favorites,
and that's what I've got.
Thank you very much for your time. Do we have any questions?