Images in Data Tables and related JSL Tools (2020-US-30MP-585)
Heidi Hardner, Senior Staff Engineer, Seagate Technology
Serkay Olmez, Senior Staff Data Scientist, Seagate Technology
A Keras/TensorFlow deep learning project motivated a set of JSL Add-Ins for working with images inside JMP data tables. In turn, the new tools revealed widespread applicability of embedded images for our more conventional JMP analyses. JMP has had the ability to incorporate images in tables for some time, but we had not fully utilized it. We will share the value we have found in embracing this feature as well as a suite of scripted tools to help unlock that value. The JSL tools include pulling images into a table, reviewing, marking up and measuring images, as well as exporting them to PowerPoint or to a file structure with organization and properties based on other data columns.
Speaker |
Transcript |
Heidi Hardner | Hello, I'm Heidi Hardner, speaking to you from Minneapolis, Minnesota. |
Serkay Olmez | And this is Serkay Olmez, and I'm dialing in from Longmont, Colorado. |
Heidi Hardner | We work at Seagate Technology, where we both have a long history using JMP and JSL, and sharing JMP scripts with our colleagues. We're going to demonstrate some use cases from ??? images in JMP data tables and we'll share some related JSL tools. |
At Seagate, recording heads undergo a large number of factory processes, from a wafer fab to installation in the drive. | |
We therefore frequently look at very wide data where each head is a row and there are many, many columns. | |
Many of the processes produce images, so that can include many different images per head. Or we might have very tall data from many heads passing through a single process, looking very similar to each other in metrology images. | |
Our download data sets today aren't Seagate images, but they'll have similarities to these situations, and I hope drawing these parallels may help you draw parallels to your own use cases. | |
JMP has had images in data tables for several years now, and I was personally slow to appreciate it. | |
I hope you'll see in these use cases, different ways it can be valuable, especially through the flexible interactive connection of images to other data. | |
I also want to highlight the value of storing image paths in the table, especially if you can have shared paths. | |
And lastly, we're both sharing the JSL we'll demo in the supporting materials. So that's a literal takeaway. There's a directory of those at the end of this PowerPoint. | |
Much of the exploration that's useful with images is quite basic and our scripts are ready to use without JSL knowledge, but they also contain a wide range of things you might build on in your own JMP scripts. | |
We're sharing three things that include publicly accessible image paths, true to my second takeaway. | |
The first is a data table from the Met's online catalog. It includes examples with multiple images per object or row in the data table. | |
Browsing a catalog was a huge rabbit hole for me that I highly recommend. The second data set is astronomy images. They're thanks to Russ Durkee at the Shed of Science Observatory. | |
This is row per image metrology data where the images are very similar to each other at first glance. So note the parallels to what I was just saying about our recording head factory data use cases. | |
And speaking of rabbit holes I've been down, the third thing is JSL that implements the Omeka API | |
to access scholarly collections online. This is thanks to Dr. Heather Shirey for help accessing these databases. Her urban art mapping project is another fun thing you should go check out online. | |
But the technical point here is that anyone can use this code to experience querying the database and getting results that mix image paths with other data to make the kind of table that our tools are centered around. | |
So we're going to move right on to a live demo in JMP. Before we look at those types of data sets, what if you just have a pile of images? | |
The first of my tools is Image Table from Folders. What this does is let me pick a folder. I have a folder with some sample images from the | |
Met catalog data. You can see that it's got some subfolders, and when I select this folder I quickly get a table with, | |
in a column, all the paths to the images. It's dug down looking for file extensions it thinks are images, and it's also split out | |
the subfolders as attributes in the data. Fairly often, we have use cases where the folder names amount to some kind of meaningful attribute you'd like to have in the data. | |
And this is a script you could customize further. In my Seagate version, I usually have a head identifier embedded in the file name, so I have parsing for that built automatically into my own personal version of this script. | |
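
The folder scan described above can be sketched outside JMP as follows. This is a minimal Python illustration, not the JSL shared with the talk; the extension list, the function name `image_table_from_folder`, and the `folder_1`, `folder_2`, ... column names are assumptions made for the example.

```python
import os

# Assumed set of extensions treated as images; the talk does not list them.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".tif", ".tiff"}


def image_table_from_folder(root):
    """Walk `root` and return one row (a dict) per image file found.

    Each row holds the full path, the file name, and one attribute per
    subfolder level between `root` and the file (folder_1, folder_2, ...).
    """
    rows = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subfolders in a stable order
        for name in sorted(filenames):
            extension = os.path.splitext(name)[1].lower()
            if extension not in IMAGE_EXTENSIONS:
                continue  # skip anything that does not look like an image
            relative_dir = os.path.relpath(dirpath, root)
            folders = [] if relative_dir == "." else relative_dir.split(os.sep)
            row = {"path": os.path.join(dirpath, name), "file_name": name}
            for level, folder in enumerate(folders, start=1):
                row[f"folder_{level}"] = folder
            rows.append(row)
    return rows
```

Parsing an identifier out of the file name, as mentioned for the Seagate version, would be one more step inside the loop that adds another key to `row`.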
So quickly, let's get over to one of our main example data sets. This is the museum data that I mentioned. This is meatier; you can see that | |
it has a row per object, but it's got some numerical data. It lists the objects, and it's got a bunch of attribute data. So there's | |
a more substantial data set here. But then if we scroll over, you can see that inside the data set are paths to the images, the URLs where the images | |
are hosted, and you can see that there are up to five distinct images here for some of the rows. | |
And so the second of my tools, Image Table Import Images, is what actually brings images into the table, and it can accommodate multiple paths. These could be local paths as well. They don't have to be URLs. | |
Once the demo data are shared, you'll be able to go get... It's already done. I wanted to talk quickly about what it's doing behind the scenes. | |
When JMP opens something that it recognizes as an image, it can be an image object. And there are a bunch of JSL messages that the image object can take, things like Get Size, Get Pixels, Save Image. And you'll see those in the code that we're sharing. | |
Let's go, I want to see the results... the images in the table. This script is doing a few things besides just sticking the images in. | |
Let me open another example to show some other code snippets about this. The images go in an expression-type column, so it's making a column with the expression data type to put the images in. | |
I'm also creating an event handler for that column. This is using a special message for the graph box, Add Image, so the image object gets added right into a graph box. I will show you what that does, but it's a click event handler. | |
And it's also resizing; that's why the rows got taller, to accommodate these being a little bit bigger. | |
That's something that you can do, and my script does that. By default, if you put images in, you might get something really tiny like this, which could be cool | |
for seeing a large array. But it's worth knowing, if you're thinking, I don't know how I could ever look at my images when they come in so small in JMP, that you can widen them manually as well. | |
And the script also embeds itself in the table. That's a useful thing to know how to do, and the logic here is that the | |
data table got really big when we embedded all these images, since the full images are embedded in | |
the data when this happens. So I might want to send this data set to somebody and have them be able to just get the images without knowing anything about my special tool menu. They could run this to reimport the images. | |
The on-click behavior I wanted to show is that when you click on this, it makes a full-size or bigger version of the image here for you to see. | |
So like I said, much of what's great about having images in the data table turns out to be very simple JMP exploratory capabilities. So for example, I can take these columns and I can reorder them. | |
And just move them over by some other things will be more interesting to look at the whole file sizes table. | |
And this way I can just browse, I can sort. I can see here that I had the rows sorted by how many images there were. | |
So if I scroll down, I can just see visually that, by rearranging and getting different columns of interest next to these, I can actually get a lot of images in a grid | |
in the table. Here, I made up my own grouping column. I could rearrange that: just by reordering the columns, I could stick that over next to the images | |
in case I wanted to review that by scrolling through. So a lot of just moving around, sorting, and rearranging is part of what's good about having images in a | |
table. So one of my personal use cases is monitoring some important metrics that come from images and in my case, they actually came from a combination of different images. | |
And I'd be looking at this data, looking for outliers and suspicious modes that would recurrently show up in the data. | |
And in this use case, I would have not 85 rows but tens of thousands of rows I'm looking at, and the images would be a useful part of exploring what's going wrong with the data. | |
Let me make an example plot here, in this case height by width of these museum objects. And let me quickly go back and show one thing you can do. By right-clicking here, I can make a label out of this first column. | |
And so then if I'm looking at the data, I can be examining this and say there's something kind of sticking out. Why is this so much wider than it is tall? Well, that wasn't a mistake. It's really a wide | |
artwork there, so that makes sense. So I can have, you know, one fact pop up. If I have a lot of different things I'm interested in sandwiched in this cluster down here, it seems like there's a lot | |
of these guys that are similar in size, maybe that seems like it's interesting. | |
Again, I can, I can look at them individually with the labels. But if my data is really large I could do other exploratory things. I can go back to the data and I could use F7 to page through | |
for things that have been selected. Of course, with having selected things I could just subset some of those to get, you know, to get | |
just a smaller data set. One of the exploratory things I'd like to show, one of my favorites, is under row selection. | |
I could do Name Selection in Column. So say this is a strange mode and I already know that I'd like to call it oddballs. | |
And I'm going to say, hey, these are some oddballs in the data I want to go examine. And it's going to give me a column at the end here with ones and zeros for whether things are oddballs or not. | |
And part of what's useful about that is, let's just imagine that my data set is really big, so just paging through them is not so good. | |
And let's say it's so big, I don't want to have the images in there. Pulling them in was infeasible. So here I am back with my data set. | |
And I have these oddballs that I'm interested in. And I can go to subset, and instead of just subsetting all of them, say maybe there's thousands here in this group, I can use that column to do stratification. | |
So here I've only selected 11, so let's say I'm going to pick five. I'm going to choose to stratify on the oddballs. | |
And what I'll get is five oddballs and five non-oddballs, randomly chosen. Here I could sort those and I could bring the images back. So now I could say, all right, I just want to see, | |
maybe I just want to see one and two, whatever, just for the small set. So again, hopefully that's showing how, by dragging along those paths, even with really large data where it doesn't seem useful or | |
feasible to put in the images, I could just get a subset. And then I could browse through these and ask what's different about the oddballs. I can sort them out. | |
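
The stratified subset described above, a fixed number of random rows from each value of the oddballs column with the image paths carried along, can be sketched in Python. This is an illustration of the idea rather than what JMP's Subset dialog does internally; the function name, the column names, and the example paths are hypothetical.

```python
import random
from collections import defaultdict


def stratified_subset(rows, strata_key, n_per_stratum, seed=None):
    """Pick up to `n_per_stratum` random rows from each value of `strata_key`."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[strata_key]].append(row)
    subset = []
    for value in sorted(groups):
        members = groups[value]
        count = min(n_per_stratum, len(members))  # small strata are taken whole
        subset.extend(rng.sample(members, count))
    return subset


# 85 rows, 11 of them flagged as oddballs, as in the demo; paths are made up.
rows = [
    {"id": i, "oddball": 1 if i < 11 else 0, "image_path": f"images/{i}.jpg"}
    for i in range(85)
]
picked = stratified_subset(rows, "oddball", 5, seed=0)
```

Because every picked row still carries its `image_path`, the images only need to be loaded for the ten rows in the subset rather than for the whole table.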
Now I want to show one other use case. So | |
this use case was me learning to do some deep learning classification of images, and | |
it really forcibly struck me in this case that images really are literally the data. | |
And so here's an example of this. In my case, of course, I'm looking at recording heads and different views of them. I was trying to classify these different views. | |
Right feel ??? and some or something else that are focused on a different part of the head. And so here I'm simulating that with these cat statues from the art data, and I've categorized them. Here is facing the back, facing the right, facing front, and so on. | |
This whole project of doing machine learning isn't happening in JMP, but there's a need there | |
for this Python ??? TensorFlow categorization project to provide training data and, in the end, to review the results that come back. It ends up being in the form of nested files in folders. And so let me quickly show another of our tools. This image | |
Here we go. In these tools, Image Table | |
to Folders. And here I can pick the images. | |
These are the paths or images in the data table, and I can choose some attribute columns and turn those into | |
nested file folder categories. Here I'm going to tell it to use my file name column as a name. | |
I can also add some prefix or change the size. Resizing is another thing that really came up in this machine learning. The way I've set up my model, it expects images of a certain size. And then when this operates, it's going to send my images | |
into the folder. | |
There's a snafu here with where it went. | |
Let me do it one more time. The images...image table to folders. | |
I think I clicked on the wrong thing. I've got my machine learning images folder here that I want to put these into quickly. | |
Revise that. | |
Clicking on the file names. | |
And it's going to arrange them in the folder. | |
So here I have my test and training data sets and, oops, I picked the wrong thing for my categories. Let's categorize these by the things that I chose in the columns. | |
So the idea there is that I need to put my categorized data into the file folder structure. | |
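
The export step, turning attribute columns into nested folders and copying each image into the folder for its row, can be sketched in Python. This is a rough illustration under stated assumptions, not the JSL tool: it handles local files only, it does no resizing, and the function and parameter names are hypothetical.

```python
import os
import shutil


def export_to_folders(rows, dest_root, path_col, category_cols,
                      name_col=None, prefix=""):
    """Copy each row's image into dest_root/<category 1>/<category 2>/...

    `category_cols` lists the attribute columns that become nested folders.
    `name_col`, if given, supplies the new file name (without extension).
    Returns the list of files written.
    """
    written = []
    for row in rows:
        folder = os.path.join(dest_root, *(str(row[c]) for c in category_cols))
        os.makedirs(folder, exist_ok=True)
        source = row[path_col]
        stem, extension = os.path.splitext(os.path.basename(source))
        if name_col:
            stem = str(row[name_col])
        target = os.path.join(folder, prefix + stem + extension)
        shutil.copyfile(source, target)
        written.append(target)
    return written
```

With `category_cols=["split", "view"]`, a row marked train and front would land in `<dest_root>/train/front/`, which is the nested layout that folder-based image classifiers commonly read.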
And the other thing about that is it's useful to do this in JMP, in the data table. It's useful to come up with these categories and training | |
groups inside JMP. I already showed that we could do the stratify, and here's another example where, because I have the other data columns, like the department, | |
in the data set, I can go back and see, hey, did I split up my test and training set in a balanced way? So when I have, say, Egyptian art, I did a good job on Egyptian art | |
being partly in the test, partly in the training, not so much on the other category. So in JMP I have the other data | |
columns. They're convenient for forming up my test and training sets and making sure that I have balanced groups, and also I can save this table and have a record of what was in the... | |
what was in the training that I did. So there's that. The last thing I wanted to say about this is that this museum data doesn't really capture very well | |
the idea of metrology images looking all very similar. So because these cats are all so unique, you might have to really look closely when you're scrolling through here to see, | |
you know, is this the back of a cat or the front with the face worn off or whatever else, and | |
I'll talk more about the case where metrology images are really very similar to each other, and it's potentially very useful to scroll through them just looking at things by eye, after we hear from Serkay with his PowerPoint tool. | |
Serkay Olmez | Okay, so let me share my screen. |
Alright. | |
So what I will be talking about is a script that enables people to push images listed in a JMP table into PowerPoint. | |
What you want to do is collect the images from the links listed in the table and push them into PowerPoint with a template you choose. You can create this table with the | |
code snippets Heidi already showed. This script assumes that you have a table already and you have the links in there. | |
And I will just run the script and tell you what the script does. | |
So if I run this, it gives you this interface in which you can select columns as attributes. First of all, you will need to select the image path so that JMP will know where to collect those images from and | |
you can also add labels. Those labels will go into PowerPoint as columns and you can also decide what kind of layout you want in the PowerPoint. So I will just show a grid | |
layout first. So there will be four columns, two rows in this case. And you hit export. What it does is, it goes and triggers PowerPoint and PowerPoint will | |
go and collect those images and it will build the slides and this will take 10 to 20 seconds because it's pulling in 20 images from a server. | |
So once it completes, you will have the PowerPoint with a nicely laid out structure. And those columns, remember, come from the columns you selected at the beginning in the UI. So you get this one quickly and you can, of course, change | |
the layout here. For example, let's assume you want to do | |
a different thing. So there's this layout selection here. Let's say you want to do one row and maybe leave some space underneath so that you can put some comments in there. So you can just rerun this | |
and then JMP will tell PowerPoint how to set up the layout. And then once this completes, it will have a row of four images per slide. | |
It looks like this. | |
One other thing that the script does is split your slides into categories. Maybe there's a top attribute in your data and you want to start a new slide | |
when that attribute changes. Let's, for example, take the medium and put it into slide labels, so it'll create slide labels at the top, like titles. | |
And it also splits this little PowerPoint presentation into categories. So it will start a new slide every time this medium attribute changes. If I just do this, | |
it will do the same thing again in 10-20 seconds. Now it will be grouping them into different categories. So you will have a nicely laid out | |
set of slides, | |
which will be split. So, for example, you have a slide for medium equal to brass, you will have a bronze one, and there seems to be only one item for that. I selected ??? rows in the data, and there will be ceramic, etc. So this separates out the slides. | |
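
The slide-splitting rule described above, fill each slide up to the grid size and start a new slide whenever the label attribute changes, can be sketched in Python. This only illustrates the pagination logic; it does not talk to PowerPoint, and the function name and the `medium` key are assumptions made for the example.

```python
from itertools import groupby


def paginate_slides(rows, per_slide, group_key=None):
    """Return a list of (title, rows_on_slide) tuples.

    Rows are taken in table order. When `group_key` is given, a new slide
    starts every time that attribute changes from one row to the next.
    """
    if group_key is None:
        groups = [(None, list(rows))]
    else:
        # groupby only merges consecutive rows, so a value that reappears
        # later in the table starts another slide.
        groups = [
            (value, list(members))
            for value, members in groupby(rows, key=lambda r: r[group_key])
        ]
    slides = []
    for title, members in groups:
        for start in range(0, len(members), per_slide):
            slides.append((title, members[start:start + per_slide]))
    return slides


rows = [{"medium": m} for m in ["brass"] * 3 + ["bronze"] + ["ceramic"] * 5]
slides = paginate_slides(rows, per_slide=4, group_key="medium")
```

In this example a one-row, four-column layout gives `per_slide=4`, so the five ceramic rows spill onto a second ceramic slide while the single bronze row gets a slide of its own. Sorting the table by the label column first keeps each category together.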
The last thing I will show as a demonstration is that sometimes people will want a slightly more complicated layout, a template. For example, they will want to look at one image | |
in a bigger way, so they want to put a big image to the left, maybe a couple more to the right. You can do such complicated templates | |
by selecting a different layout, and those are done by scripts. So I have this one-big-two-small layout. There will be one big image on the left-hand side and two small ones stacked to the right. So if you just do this, | |
it'll go and do the same thing, but now with the different layout. | |
I'll give it 10 to 20 seconds to build the slides. | |
There you go. So a big image on the left hand side and two small ones on the right hand side. | |
And the last thing I will show is about customizing the tool. We will be releasing this tool as a part of this large add-in that does all this image manipulation. So if I just open this one very quickly, | |
this is user customizable, so users can go in and change what kind of columns they want to put into roles automatically. And so | |
users can make it look different for themselves. And one last thing I will mention is about the specialized templates. Those templates, | |
like the one-big-two-small one I just showed, are created by scripts that live externally to the tool. We will be providing two of them. But the idea is that if the user is experienced enough, that person can develop | |
his or her own template and put it into the tool, and the tool will recognize it. Or the developer can develop more and more templates | |
specific to particular customers or particular users. And this will be a part of the add-in we're going to release. This is all I have. Heidi can take it back. | |
Heidi Hardner | Okay, okay. |
Sorry about that. I'm back and | |
now we're looking at a different data set, these Shed of Science astronomy images. | |
And as advertised, these are metrology images, and you're looking at a piece of the night sky here in this very small image. And even at this very small size and scope of images, | |
you can see movement there. And you can see some anomalous things, like here, when something pops up. | |
And this is part of the point I wanted to make when I was looking at classification for my machine learning project, where I have images that nominally look very, very similar to each other. It's very easy to scroll through and see something that's anomalous like this. | |
But I'm actually going to make a more graphic demonstration of that where we're going to blow this up and scroll through again. So let me quickly | |
describe what we're going to see. So here's a view of the white intensity on these images as a function of time. These are time ordered when we scroll through them. | |
And one thing you're not going to see, because it's not very visible by eye, is this big, slow trend here in the intensity. | |
With the labels on, you can see here again the way you use labels to see that, where intensity is very high, it's because there's a streak across the 60-second exposure from something passing by in those images. We can also see, though, the bumps in the data. And that's where | |
the telescope has drifted a little bit, the background moves, and it resets itself. We'll see that when we scroll through. So that's just a little bit of a preview. | |
I'm going to show expanding this much bigger so that even inside the data table, we can see a quite big view of the images, just a little bit bigger. | |
And start back at the beginning in time. What I want you to do is watch this spot, watch that object as we scroll through the data. | |
And hopefully you can see that moving to the left against the background stars. And that's an asteroid, 2171 Kiev. | |
That is what we're seeing; it was the real intention of these metrology images. So that's just a graphic demonstration that, in the right circumstances, just scrolling through images in the table can be useful. | |
The next tool is Image Table Annotation. This is for the opposite case, where you really want to just examine images one by one. | |
What happens when you click on this is that, again, you have a choice of using the path or the embedded images. | |
You have the option to have some sort of identifier, so we put the date here, or it would just give you row numbers. And you can have one other attribute, let's put the telescope here, | |
on the data. And what this does is form a little markup tool, and this tool will let you | |
do some generic classifications. This one's a three, say. I could write something here, some kind of comment about the data. | |
And also I have crosshairs, so I can do measuring. It's going to measure in pixels on the image. I'm going to move the crosshairs, and it will record the positions and the | |
delta there in pixels when I click Record. Now, before we go do that: there are a lot of different ways you can do markup with commercial | |
tools. Part of the motivation for doing this in JMP is, again, that you have, along with the image, all this other data in the table. | |
It's really just represented here by this one column. But again, this is something that is ripe for customization. In my own little version, I'm looking at images | |
and I'm printing here in this view a little map of where the head was on the wafer, or various other pieces of data. Or maybe the image recipe says what the dimensions are, and I can try to verify them. | |
So the ability to have this joining between the image and other items is there; it's possible even for doing that. But then when we click Record, what happens is, if we go over to the right, we'll see it has added columns. | |
(The columns are so tall, let's shrink them a little bit.) | |
You can see here in row 3 what I recorded. No, I don't want to move the white crosshairs, so I'm getting a vertical distance, and the positions of the crosshairs and my comments and classifications are stuck right back in the table in the correct row. | |
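
What gets written back to the row when Record is clicked can be sketched as a small Python function. The column names below are hypothetical stand-ins for whatever columns the annotation tool adds; the sketch only shows the bookkeeping of two crosshair positions, their pixel deltas, and the markup fields.

```python
def record_measurement(row, crosshair_a, crosshair_b,
                       classification=None, comment=""):
    """Store two crosshair positions (in pixels) and their deltas in `row`."""
    (xa, ya), (xb, yb) = crosshair_a, crosshair_b
    row.update({
        "x1": xa, "y1": ya,
        "x2": xb, "y2": yb,
        "delta_x": xb - xa,   # horizontal distance in pixels
        "delta_y": yb - ya,   # vertical distance in pixels
        "classification": classification,
        "comment": comment,
    })
    return row


# Crosshairs that differ only vertically give a purely vertical distance.
row = {"telescope": "example"}
record_measurement(row, (120, 40), (120, 95), classification=3, comment="streak")
```

Writing the result into the same row as the image path is what keeps the measurement joined to the rest of the data for later sorting, plotting, or subsetting.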
Another thing we're seeing here is another column of images. This is where I have put in some images that are JMP plots; JMP plots can be images. | |
And in the data table that we're sharing, I have embedded code, which you'll see here, that makes these images. It involves sending the Get Pixels message to the images that we have, | |
getting a matrix of the data, slicing out a slice along where that asteroid is moving, and averaging up the intensity. You can flip through and see | |
the peak moving in these images. But the real point of this was for me to just mention | |
some of our use cases that we have at Seagate all the time, where we'll have one row of data for a head with values in there, such as a peak frequency that came from an entire spectrum or an optimum value that came from an optimization sweep, etc., and | |
those sweeps could actually become images in the data. And of course there are good reasons why we do that kind of feature extraction on images or complex curves. | |
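
The slice-and-average step described above for the asteroid images is a small example of this kind of feature extraction. A minimal Python sketch follows, assuming the pixel data has already been reduced to one intensity value per pixel and is held as a list of rows; the function names are hypothetical, and the embedded JSL in the shared table may do this differently.

```python
def band_profile(pixels, row_start, row_stop):
    """Average a horizontal band of pixel rows into one intensity profile.

    `pixels` is a list of rows, each a list of intensities. The result has
    one averaged value per image column.
    """
    band = pixels[row_start:row_stop]
    if not band:
        raise ValueError("the selected band contains no rows")
    return [sum(column) / len(band) for column in zip(*band)]


def peak_position(profile):
    """Return the column index where the averaged intensity is highest."""
    return max(range(len(profile)), key=profile.__getitem__)


pixels = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 9, 0],
    [0, 1, 0, 7, 0],
    [0, 0, 0, 0, 0],
]
profile = band_profile(pixels, 1, 3)   # average rows 1 and 2
peak = peak_position(profile)
```

Running this on each image in time order and storing `peak` in a column gives a single number per row that tracks the moving object, while the image itself stays available for checking by eye.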
The simple values like frequency can be used more directly in a variety of JMP analyses, like | |
JMP's statistical platforms. But I hope that we've both demonstrated that there are a lot of useful things you can do if you get the images into the table in relation to a bunch of other data columns. | |
In particular, I hope that, seeing plots as images, you're thinking right back to Serkay's PowerPoint demo; plots are the kind of thing you'd like to arrange in a PowerPoint in certain ways. | |
Anyway, I hope the various bits of JSL we've shared might be useful to you and that you feel inspired to do more with images yourself. Thanks. | |