Brian Corcoran, JMP Director of Research and Development, SAS
Dieter Pisot, JMP Principal Application Developer, SAS
Eric Hill, JMP Distinguished Software Developer, SAS

You know the value of sharing insights as they emerge. JMP Live — the newest member of the JMP product family — reconceptualizes sharing by taking the robust statistics and visualizations in JMP and extending them to the web, privately and securely. If you'd like a more iterative, dynamic and inclusive path to showing your data and making discoveries, join us. We'll answer the following questions: What is JMP Live? How do I use it? How do I manage it? For background information on the product, see this video from Discovery Summit Tucson 2019 and the JMP Live product page.

JMP Live Overview (for Users and Managers) – Eric Hill
- What is JMP Live? Why use JMP Live?
- Interactive publish and replace
- What happens behind the scenes when you publish
- Groups, from a user perspective
- Scripted publishing: stored credentials, API keys and replacing reports

Setup and Maintenance (for JMP Live Administrators) – Dieter Pisot
- Administering users and groups
- Limiting publishing
- Setting up JMP Live: Windows services and .env files
- Upgrading and applying a new license
- Using Keycloak single sign-on

Installing and Setting Up the Server (for IT Administrators) – Brian Corcoran
- Choosing architectural configurations based on expected usage
- Understanding SSL certificates and their importance
- Installing the JMP Live database component
- Installing the JMP Pro and JMP Live components on a separate server
- Connecting JMP Live to the database
- Testing the installed configuration to make sure it is working properly
Dieter Pisot, JMP Principal Systems Engineer, SAS
Stan Koprowski, JMP Senior Systems Engineer, SAS

Data changes, and so do your JMP Live reports. Typical data changes involve either additional observations or modifications to the columns of data, and both necessitate updates to published reports. In the first scenario, an existing report might need to be recalculated to reflect the new observations, or rows, used in the report. In the second, you restructure the underlying data by adding or removing columns of information used in the report. In both situations you must update your report on a regular basis. In this paper we provide practical examples of how to organize JSL scripts that replace an existing JMP Live report with a current version. Prior to the live demonstration, we discuss the key security considerations, including protecting the credentials needed to connect to JMP Live. The code presented is designed to be reused and shared with anyone who needs to publish or replace a JMP Live report on a predefined time interval, such as hourly, daily, weekly or monthly. With some basic JSL knowledge you can easily adapt it to automate updates to any of your other existing JMP Live reports. Not a coder? No worries, we've got your back: we also provide a JMP add-in that schedules the publishing of a new report, or of a replacement report, through a wizard-based approach for those with little JSL knowledge.
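As a hedged illustration of the stored-credentials idea discussed above, the sketch below keeps the API key in a separate text file rather than in the script, so the JSL can be shared or run on a schedule without exposing it. It is not the paper's script, and the publishing messages at the end (New Web Report, Publish, API Key) are an assumption to verify against the JMP Live scripting documentation for your JMP version.

```jsl
// A sketch of the credential-handling pattern, not the paper's exact script.
// The API key lives in a separate file outside the JSL, so the script can be
// scheduled or shared without exposing credentials.
Names Default To Here( 1 );
apiKey = Trim( Load Text File( "$DOCUMENTS/jmplive_api_key.txt" ) );   // stored credential, kept out of the script
dt = Open( "$DOCUMENTS/monitoring_data.jmp" );                         // refreshed data (hypothetical path)
rep = dt << Graph Builder(
	Variables( X( :Date ), Y( :Yield ) ),
	Elements( Line( X, Y ) )
);
webReport = New Web Report();
webReport << Add Report( rep );
webReport << Publish( URL( "https://jmplive.example.com" ), API Key( apiKey ) );   // assumed signature; check your version's docs
```

A scheduled task (Windows Task Scheduler or cron) can then launch JMP with this script on the hourly, daily or weekly interval the paper describes.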
Zhiwu Liang, Principal Scientist, Procter & Gamble
Pablo Moreno Pelaez, Group Scientist, Procter & Gamble

Car detailing is a tough job. Transforming a car from a muddy, rusty, pet-fur-filled box on wheels into a like-new, clean and shiny ride takes a lot of time, specialized products and a skilled detailer. But what does the customer really appreciate in such a detailed car cleaning and restoring job? Are shiny rims most important for satisfaction? Interior smell? A shiny waxed hood? It is critical for a car detailing business to know the answers to these questions to optimize the time spent per car, the products used, and the level of detailing needed at each point of the process. With the objective of maximizing customer satisfaction and optimizing the resources used, we designed a multi-stage customer design of experiments. We identified the key vectors of satisfaction (or failure), defined the levels for those, and approached the actual customer testing in adaptive phases, augmenting the design in each of them. This poster will take you through the thinking, designs, iterations and results of this project. What makes customers come back to their car detailer? Come see the poster and find out!

Speaker Transcript
Zhiwu Liang: Hello, everyone. I'm Zhiwu Liang, statistician at the Brussels Innovation Center of the Procter & Gamble company, working for the R&D department. Hello.
Pablo Moreno Pelaez: Yep. So I'm Pablo Moreno Pelaez, and I'm working right now in Singapore in the R&D department for Procter & Gamble. We wanted to introduce this poster, where we share a case study in which we wanted to figure out what makes a car detailing job great. As you know, Procter & Gamble is the very famous company for car detailing... no, just a joke. We had to anonymize what was actually done, so this is the way we wanted to share this case study, putting it in the context of a car detailing job. What we wanted to figure out here is what the key customer satisfaction factors were, for which we then built a design that we tested with some of those customers, to build the model and to optimize the detailing job for the car: how do we minimize the use of some of our products, and how do we minimize the time we take for some of the tasks in the detailing job? If you go to the next slide, the first thing we took a look at was: what are the different vectors that a customer looks at when they take the car to get detailed, to get it clean and shiny, and go back home with basically a brand-new car? They are looking at clean attributes, at shine attributes, and at the freshness of the car. From a cleaning point of view, we looked at the exterior cleaning, the cleaning of the rims and the cleaning of the interior; the shine of the overall body, the rims and the windows; and of course the overall freshness of the interior. We then wanted to build on this by modifying these attributes in different ways and combining the different finishes that a potential car detailing job would give you. We wanted to estimate, and to build a model to calculate, what the overall satisfaction, the satisfaction with cleaning and the satisfaction with shine would be when modifying those different vectors. That will allow us in the future to use the model to estimate:
Can we reduce the time that we spend on the rims, because it's not important? Can we reduce the time that we spend on the interior, or reduce the amount of product that we use for freshness, if those are not important? So really, to then optimize how we spend the resources on delivering the car detailing jobs. In the next slide you can see a little bit of the phases of the study.
Zhiwu Liang: Yeah, so as Pablo said, as the car detailing company we are very focused on consumer satisfaction. For this particular job, what we have to do is identify the key factors which drive the consumer's overall satisfaction and their clean and shine satisfaction. In order to do that, we separated our study design and data collection experiments into three steps. First, we did the pilot, which was designed with five different scenarios, using five cars, to set up the different levels of each of the factors. At that moment we set all of the five factors Pablo described previously to two levels, one low and one high. Then we recruited 20 consumers to evaluate all of the five cars, each in a different order. The main objectives of this pilot were to check the methodology, to check whether consumers understood the questions we asked and could provide correct answers, and also to define the proper range of each factor. After that, we went to phase one, where we extended our design space to seven factors. Some factors kept the low and high levels as in the pilot; some were extended to low, medium and high, because we thought including more levels in those factors would be more relevant to the consumer. Since we had more factors, from the custom design point of view you generate more experimental runs in the study, so in total we had 90 runs of car settings, and we asked each of the panelists to still evaluate five, but in a different order or a different combination, as set by the custom design. Because each consumer needs to evaluate only five out of the 90 settings, we had to use the balanced incomplete block design technique, with 120 customers each evaluating five cars. From the data collected on these 120 customers, we ran the model to identify the main effects and the interactions. Through that we threw out the unimportant factors and went to phase two using the finally identified six factors, of course adding more levels for some factors, because we saw that low was not low enough in the phase one study and the middle level did not really match consumer satisfaction, so we adjusted some factor levels for the low, middle and high settings in the current design space. The phase two design of experiments was augmented from phase one, so we got different settings for the 90 different cars, then asked 120 consumers to evaluate five each in a different combination. Through that we can identify the best factor settings, the ones which give the optimal solution for consumer satisfaction and for clean and shine satisfaction. As you can see here, we ran the three models using our six factor settings, each of which plays some role in the consumer's overall, cleaning and shine satisfaction. For the overall satisfaction, clearly we can see that rim cleaning, window shine and interior cleaning are the key drivers.
So if consumers see that the rims are clean and the windows shine, normally they were satisfied with our car detailing job. We also identified a significant interaction: exterior clean and interior clean combined together contribute differently to the overall satisfaction, the clean satisfaction and the shine satisfaction models. We identified very significant impact factors for clean and for shine: clearly, all of the clean factors relate to the clean satisfaction, and all of the shine factors relate to the shine satisfaction, but still from different perspectives, with clean focused on the rims and shine focused on the windows. So from validating this, we can have better settings for all the car-related factors, which helps us to develop the new products which achieve the best consumer satisfaction based on all of the factor settings.
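As context for the balanced incomplete block design mentioned in the talk, the standard textbook relations among the design quantities are shown below; the specific figures (about 90 car configurations, 120 panelists, five cars each) are taken from the transcript and treated as approximate.

$$ v\,r = b\,k, \qquad \lambda\,(v-1) = r\,(k-1) $$

Here $v$ is the number of car configurations (about 90), $b$ the number of panelists (120), $k$ the configurations each panelist evaluates (5), $r$ the number of times each configuration is evaluated, and $\lambda$ the number of panelists who see any given pair of configurations. With these figures, $r = bk/v = 120 \cdot 5 / 90 \approx 6.7$, so each car setting ends up being judged roughly seven times.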
Phil Kay, JMP Senior Systems Engineer, SAS

People and organizations make expensive mistakes when they fail to explore their data. Decision makers cause untold damage through ignorance of statistical effects when they limit their analysis to simple summary tables. In this presentation you will hear how one charity wasted billions of dollars in this way. You will learn how you can easily avoid these traps by looking at your data from many angles. An example from media reports on the "best places to live" will show why you need to look beyond headline results, and how simple visual exploration - interactive maps, trends and bubble plots - gives a richer understanding. All of this will be presented entirely through JMP Public, showcasing the latest capabilities of JMP Live.

In September 2017 the New York Times reported that Craven was the happiest area of the UK. Because this is an area that I know very well, I decided to take a look at the data. What I found was much more interesting than the media reports and was a great illustration of the small-sample fallacy.

This story is all about the value of being able to explore data in many different ways, and how you can explore these interactive analyses and access the source data through JMP Public. Hence "see fer yer sen", which translates from the local Yorkshire dialect as "see for yourself".

If you want to find out more about this data exploration, read these two blog posts: The happy place? and Crisis in Craven? An update on the UK happiness survey.

This and more of the interactive reports used in this presentation can be found here in JMP Public.
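As a toy illustration of the small-sample fallacy the talk turns on, the following JSL simulation (made-up data, not the ONS wellbeing survey) draws areas with identical true means and shows that the smallest samples produce the most extreme averages, which is exactly how a small district can top a "happiest place" ranking by chance.

```jsl
// Simulated areas with the same true mean happiness: small samples scatter widely.
Names Default To Here( 1 );
nAreas = 200;
sizes = J( nAreas, 1, Random Integer( 20, 2000 ) );   // respondents per area
means = J( nAreas, 1, 0 );
For( i = 1, i <= nAreas, i++,
	means[i] = Mean( J( sizes[i], 1, Random Normal( 7.5, 2 ) ) )   // same true mean everywhere
);
dt = New Table( "Simulated areas",
	New Column( "Sample Size", Numeric, Continuous, Set Values( sizes ) ),
	New Column( "Mean Happiness", Numeric, Continuous, Set Values( means ) )
);
// The funnel shape: small areas land far from 7.5 purely by chance, large areas hug it.
dt << Graph Builder(
	Variables( X( :Sample Size ), Y( :Mean Happiness ) ),
	Elements( Points( X, Y ) )
);
```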
Hadley Myers, JMP Systems Engineer, SAS
Chris Gotwalt, JMP Director of Statistical Research and Development, SAS

Generating linear models that include random components is essential across many industries, particularly in the pharmaceutical and life science domains. The Mixed Model platform in JMP Pro allows such models to be defined and evaluated, yielding the contributions to the total variance of the individual model components, as well as their respective confidence intervals. Calculating linear combinations of these variance components is straightforward, but the practicalities of the problem (unequal degrees of freedom, non-normal distributions, etc.) prevent the corresponding confidence intervals of these linear combinations from being determined as easily. Previously, JMP Pro users have needed to turn to other analytic software, such as the "Variance Component Analysis" package in R, to fill this gap in functionality. This presentation reports on the creation of an add-in for JMP Pro, Determining Confidence Limits for Linear Combinations of Variance Components in Mixed Models, that uses parametric bootstrapping to obtain the needed confidence limits. The add-in will be demonstrated, along with the details of how the technique was used to overcome the difficulties of this problem, as well as the benefit to users for whom these calculations are a necessity.
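Since the abstract only names the technique, here is a minimal sketch of the parametric-bootstrap idea in JSL. It is my own illustration on a balanced one-way random-effects layout with method-of-moments estimates, not the add-in's code (which works from the Mixed Model platform fit); the group counts, variance values and 2,000 resamples are assumptions for the example.

```jsl
// Percentile bootstrap interval for the SUM of two variance components (illustration only).
Names Default To Here( 1 );
a = 8;   n = 6;                    // groups and replicates per group (assumed)
vGroup = 4;   vError = 1;          // "true" components used only to create example data
simOneWay = Function( {vg, ve},
	// a x n matrix: one group effect per row plus independent within-group noise
	J( a, 1, Random Normal( 0, Sqrt( vg ) ) ) * J( 1, n, 1 ) + J( a, n, Random Normal( 0, Sqrt( ve ) ) )
);
estVC = Function( {y}, {gm, msb, msw, i},
	// method-of-moments (ANOVA) estimates of {group variance, error variance}
	gm = Mean( y );
	msb = 0;  msw = 0;
	For( i = 1, i <= N Rows( y ), i++,
		msb += n * (Mean( y[i, 0] ) - gm) ^ 2;
		msw += Sum( (y[i, 0] - Mean( y[i, 0] )) :* (y[i, 0] - Mean( y[i, 0] )) );
	);
	msb = msb / (N Rows( y ) - 1);
	msw = msw / (N Rows( y ) * (n - 1));
	Eval List( {Max( (msb - msw) / n, 0 ), msw} )
);
obs = estVC( simOneWay( vGroup, vError ) );        // the "observed" fit
nBoot = 2000;
boot = J( nBoot, 1, 0 );
For( b = 1, b <= nBoot, b++,
	est = estVC( simOneWay( obs[1], obs[2] ) );    // simulate from the fitted model, re-estimate
	boot[b] = est[1] + est[2];                     // linear combination of interest: total variance
);
Show( obs[1] + obs[2], Quantile( 0.025, boot ), Quantile( 0.975, boot ) );
```

The add-in applies the same resample-and-refit logic to the components reported by the Mixed Model platform, so arbitrary linear combinations can be handled the same way.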
Laura Lancaster, JMP Principal Research Statistician Developer, SAS Jianfeng Ding, JMP Senior Research Statistician Developer, SAS Annie Zangi, JMP Senior Research Statistician Developer, SAS   JMP has several new quality platforms and features – modernized process capability in Distribution, CUSUM Control Chart and Model Driven Multivariate Control Chart – that make quality analysis easier and more effective than ever. The long-standing Distribution platform has been updated for JMP 15 with a more modern and feature-rich process capability report that now matches the capability reports in Process Capability and Control Chart Builder. We will demonstrate how the new process capability features in Distribution make capability analysis easier with an integrated process improvement approach. The CUSUM Control Chart platform was designed to help users detect small shifts in their process over time, such as gradual drift, where Shewhart charts can be less effective. We will demonstrate how to use the CUSUM Control Chart platform and use average run length to assess the chart performance. The Model Driven Multivariate Control Chart (MDMCC) platform, new in JMP 15, was designed for users who monitor large amounts of highly correlated process variables. We will demonstrate how MDMCC can be used in conjunction with the PCA and PLS platforms to monitor multivariate process variation over time, give advanced warnings of process shifts and suggest probable causes of process changes.
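As a side note on why CUSUM accumulates evidence of small, sustained shifts that individual Shewhart points can miss, here is a minimal JSL illustration of the one-sided tabular CUSUM statistics that such a chart plots. It is not the CUSUM Control Chart platform itself, and the target, sigma, reference value and shift size are made-up numbers.

```jsl
// Tabular CUSUM statistics on simulated data (illustration only).
Names Default To Here( 1 );
target = 10;  sigma = 1;
k = 0.5 * sigma;                                          // reference value, often half the shift of interest
n = 60;
x = J( n, 1, Random Normal( target, sigma ) );
For( i = 31, i <= n, i++, x[i] = x[i] + 0.75 * sigma );   // small sustained shift after observation 30
cPlus = J( n, 1, 0 );
cMinus = J( n, 1, 0 );
For( i = 2, i <= n, i++,
	cPlus[i] = Max( 0, x[i] - (target + k) + cPlus[i - 1] );    // accumulates evidence of an upward shift
	cMinus[i] = Max( 0, (target - k) - x[i] + cMinus[i - 1] );  // accumulates evidence of a downward shift
);
// A shift is signaled when either statistic exceeds the decision interval h (commonly 4 or 5 sigma).
Show( Max( cPlus ), Max( cMinus ) );
```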
The purpose of this poster presentation is to display COVID-19 morbidity and mortality data available online from Our World in Data, whose contributors ask the key question: "How many tests to find one COVID-19 case?" We use SAS JMP to help answer the question. Smoothing test data from Our World in Data yields seven-day moving average, or SMA(7), total tests per thousand in five countries for which coronavirus test data are reported: Belgium, Italy, South Korea, the United Kingdom and the United States. Similarly, SMA(7) total cases per million were derived using the Time Series Smoothing option. Coronavirus tests per case were calculated by dividing smoothed total tests by smoothed total cases and multiplying by a factor of 1,000. These ratios of smoothed tests to smoothed cases were themselves smoothed. Additionally, Box-Jenkins ARIMA(1,1,1) time series models were fitted to smoothed total deaths per million to graphically compare smoothed case-fatality rates with smoothed tests-per-case ratios.

Auto-generated transcript:

Douglas Okamoto: In our poster presentation we display COVID-19 data available from Our World in Data, whose database sponsors ask the question: why is data on testing important? We use JMP to help us answer the question. Seven-day moving averages are calculated from January 21 to July 21 for daily per capita COVID-19 tests and coronavirus cases in seven countries: the United States, Italy, Spain, Germany, Great Britain, Belgium and South Korea. Coronavirus tests per case were calculated by dividing smoothed tests by smoothed cases and multiplying by a factor of 1,000. Daily COVID-19 test data yield the smoothed tests per thousand in Figure 1. Testing in the United States, in blue, trends upward, with two tests per thousand daily on July 21st, 10 times more than South Korea, in red, which trends downward. The x-axis in Figure 1 is normalized to days since moving averages reached one or more tests per thousand. In Figure 2, smoothed coronavirus cases per million in Europe and South Korea trend downward after peaking months earlier than the US, in blue, which averaged 2,200 cases per million on July 21st, with no end in sight. The x-axis is normalized to the number of days since moving averages reached 10 or more cases per million. Combining results from Figures 1 and 2, smoothed COVID-19 tests per case in Figure 3 show South Korean testing, in red, peaking at 685 tests per case in May, 38 times the US performance, in blue, of 22 tests per case in June. Since the x-axis is dated, Figure 3 represents a time series. The reciprocal of tests per case, cases per test, is a measure of test positivity: one in 22, or 4.5%, positivity in the US compares with 0.15% positivity in South Korea and 0.5 to 1.0% in Europe. At a March 30 WHO press briefing, Dr. Michael Ryan suggested a positivity rate less than 10%, or even better less than 3%, as a general benchmark of adequate testing. JMP was used to fit Box-Jenkins time series models to smoothed tests per case in the US from March 13 to April 25; predicted values from April 26 to May 9 were forecast from a fitted autoregressive integrated moving average, or ARIMA(1,1,1), model. In Figure 4, a time series of smoothed tests per case from mid-March to April shows a rise in the number of US tests per case, not a decline as predicted, during the 14-day forecast period.
In summary, 10 or more tests per case were performed in the United States, providing adequate testing. COVID-19 testing in Europe and South Korea was more than adequate, with hundreds of tests per case. Equivalently, the positivity rate, or number of cases per test, was less than 10% in the US, whereas positivity in Europe and South Korea was well under 3%. When our poster was submitted, the US totaled 4 million coronavirus cases, more than the European countries and South Korea combined. The US continues to be plagued by state-by-state disease outbreaks. Thank you.
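For readers who want to reproduce the smoothing and ratio steps described in the poster, here is a minimal JSL sketch. The column names are placeholders, not the Our World in Data variable names, and the table is assumed to be sorted by date within a single country (Lag() does not respect grouping, so smooth each country separately or subset first).

```jsl
// SMA(7) smoothing and the tests-per-case ratio (placeholder column names).
Names Default To Here( 1 );
dt = Current Data Table();
dt << New Column( "SMA7 Tests", Numeric, Continuous,
	Formula( Mean(
		Lag( :TestsPerThousand, 0 ), Lag( :TestsPerThousand, 1 ), Lag( :TestsPerThousand, 2 ),
		Lag( :TestsPerThousand, 3 ), Lag( :TestsPerThousand, 4 ), Lag( :TestsPerThousand, 5 ),
		Lag( :TestsPerThousand, 6 ) ) )
);
dt << New Column( "SMA7 Cases", Numeric, Continuous,
	Formula( Mean(
		Lag( :CasesPerMillion, 0 ), Lag( :CasesPerMillion, 1 ), Lag( :CasesPerMillion, 2 ),
		Lag( :CasesPerMillion, 3 ), Lag( :CasesPerMillion, 4 ), Lag( :CasesPerMillion, 5 ),
		Lag( :CasesPerMillion, 6 ) ) )
);
// Tests per case = smoothed tests (per thousand) / smoothed cases (per million) x 1,000.
dt << New Column( "Tests per Case", Numeric, Continuous,
	Formula( :SMA7 Tests / :SMA7 Cases * 1000 )
);
```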
Carlos Ortega, Project Leader, Avantium
Daria Otyuskaya, Project Leader, Avantium
Hendrik Dathe, Services Director, Avantium

Creativity is at the center of any research and development program. Whether it is a fundamental research topic or the development of new applications, the basis of solid research rests on robust data that you can trust. Within Avantium, we focus on executing tailored catalysis R&D projects, which vary from customer to customer. This requires a flexible solution to judge the large amount of data that is obtained in our up-to-64-reactor high-throughput catalyst testing equipment. We use JMP and JSL scripts to improve the data workflow and its integration. In any given project, the data is generated in different sources, including our proprietary catalyst testing equipment — Flowrence ® — on-line and off-line analytical equipment (e.g., GC, S&N analyzers and SimDis) and manual data records (e.g., MS Excel files). The data from these sources are automatically checked by our JSL scripts, and with the statistical methods available in JMP we are able to calculate key performance parameters, elaborate key performance plots and generate automatic reports that can be shared directly with the clients. The use of scripts guarantees that the data handling process is consistent, as every data set in a given project is treated the same way. This provides seamless integration of results and reports, which are ready to share on a software platform known to our customers.

Auto-generated transcript:

Carlos Ortega: Yeah. Hi, and welcome to our presentation at the JMP Discovery Summit. Of course, we would have liked to give this presentation in person, but under the current circumstances, this is the best way we can still share the way we are using JMP in our day-to-day work and how it helps us rework our data. However, the presentation in this way, with the video, also has an advantage for you as a viewer, because if you want to grab a coffee right now you can just hit pause and continue when the coffee is ready. But looking at the time, I guess the summit is right now well under way, and most likely you have heard already quite some exciting presentations about how JMP can help you make more sense out of your data, with statistical tools to gain deeper insight and dive into more parts of your data. However, what we want to do today (and this is also hidden under the title about data quality assurance) is the scripting engine, everything which has to do with JSL scripting, because this helps us a lot in our day-to-day work to prepare the data, which are then ready to be used for data analysis. And by we, I mean Carlos Ortega, Daria Otyuskaya, and myself, whom I now want to introduce a bit, to get a bit better feeling of who's doing this. But of course, as usual, there are some rules to this, which are the disclaimer about the data we are using. And now, if you're a lawyer, for sure you're going to press pause to study this in detail; for all other people, right now let's dive into the presentation. And of course, nothing better than to start with a short introduction of the people. You see already the location we all have in common, which is Amsterdam in the Netherlands, and we all have in common that we work at Avantium, a company providing sustainable technologies. However, the locations we are coming from are all over the world.
We have on the left side Carlos Ortega, a chemical engineer from Venezuela, who has lived in Holland for about six years and has worked at Avantium for about two years as a project leader in services. Then we have on the right side Daria Otyuskaya, from Russia, also working here for about two years and having spent the last five years in the Benelux area, where she did her PhD in chemical engineering. And myself: I have the advantage that I can travel home by car, as I originate from Germany. I have lived in Holland for about 10 years and joined Avantium about three years ago. But now, let's talk a bit more about Avantium. I just want to briefly lay out the things we are doing. Avantium, as I mentioned before, is a provider of sustainable technologies and has three business units. One is Avantium Renewable Polymers, where we develop a biodegradable polymer called PEF, which is a hundred percent plant-based and recyclable. Second, we have a business unit called Avantium Renewable Chemistries, which offers renewable technologies to produce chemicals like MEG or industrial sugars from non-food biomass. And last but not least, a very exciting technology where we turn CO2 from the air into chemicals via electrochemistry. But not too much talk about these two business units, because Carlos, Daria and myself are all working in Avantium Catalysis, which was founded 20 years ago and is still the foundation of Avantium's technology innovations. We are a service provider, accelerating the research in your company, in catalyst research to be more specific. And we offer there, as you can see on the right-hand side, systems, services and a service called refinery catalyst testing, and we really help companies to develop their R&D, as you see at the bottom. But this is enough about Avantium. Let's talk a bit about how we are working in projects and how JMP can actually help us there to accelerate things and get better data out of it, which Carlos will later on show in a demo for us. As mentioned before, we are a service provider, and as a service provider we get a lot of requests from customers to develop a better catalyst or a better process. And now you might ask yourself, what's a catalyst? A catalyst is a material which participates in a reaction, transforming A to B, but doesn't get consumed in the reaction. The most common example, which you can see in your day-to-day life, is the exhaust gas catalyst installed in your car, which turns exhaust gases from your car into CO2 and water. And these are the things which we get as requests. People come to us and say, "I would like to develop a new material," or things like, "I have this process, and I want to accelerate my research and develop a new process for this." When we have an experiment in our team, we are designing experiments and trying to optimize the testing, and for all of this we use JMP; but this is not what we want to talk about today. Because, as I said before, we are using JMP also to actually merge our data, process them and make them ready, which is the two parts you see at the bottom of the presentation.
We are executing research projects for customers in our proprietary tool called Flowrence, where the trick is that we don't execute tests one after another, but in parallel. Traditionally (I remember myself in my PhD) you execute a test in one reactor after another, after another. But we are applying up to 64 reactors in parallel, which makes the execution more challenging but allows data-driven decisions: it allows actually making more reliable data and making them statistically significant. And then we are reporting this data to our customers, who can then either continue in their tools with their further insights or completely rely on us for executing the work and extracting the knowledge. But yeah, enough about the company, and now let me hand over to Carlos, who will explain how JMP and JMP scripts actually help us to make our life significantly easier. Thank you, Hendrik, for the nice introduction. And thank you also to the organizers for this nice opportunity to participate in the JMP Discovery Summit. As Hendrik was mentioning, we develop and execute research projects for third parties. And if we think about it, we need to go from design of experiments (and that's of course one very powerful feature of JMP), but also we need to manage information, and in this case, as Hendrik was mentioning, we want to focus on JSL scripts that allow us to easily handle information and create seamless integration of our process workflows. I'm a project leader in the R&D department, and so a regular day in my life here would look something like this, in a very simplistic view. You would have clients who are interested and have a research question, and I design experiments and we execute these in our own proprietary technology called Flowrence. So in a simple view, the data generated in the Flowrence unit goes through me, and after some checks and interpretation it goes back to the client. But the reality is somewhat more complex: on one hand, we also have internal customers, that is, for example, our business development team, and on the other side we also have our own staff that actually interacts directly with the unit, so they control how the unit operates and monitor that everything goes according to plan. And the data, as you see here with broken lines, cannot be extracted directly from the unit. The data is actually sent to a data warehouse, and then we need a set of tools that allows us to first retrieve information, merge information that comes from different sources, execute a set of tasks that go from cleaning, processing and visualizing information, and eventually we export that data to the client so that the client can get the information that they actually need and that is most relevant for them. If you allow me to focus for one second on these different tasks, what we observe initially in the retrieve and merge step is that data can actually come from various sources. So in the data warehouse we actually collect data from the Flowrence unit, but we also collect data from the analyzers. For those that are performing tests in a laboratory, you might be familiar with mass spectrometry or gas chromatography, for example; and we also collect data on the unit performance, so we also verify that the unit is behaving as expected. As in any laboratory, we would also have manual inputs.
And these could be, for example, information on the catalysts that we are testing or calibration of the analytical equipment. Those manual inputs are always, of course, stored in a laboratory notebook, but we also include that information in an Excel file. And this is where JMP is actually helping us drive the workflow of information to the next level. What we have developed is a combination of an easy-to-use, vastly known Excel file with powerful features from a JSL script. And not only do we include manual data that is available in laboratory notebooks, but we also include in this Excel file formulas that are then interpreted by the JSL script and executed. That allows us to calculate key performance parameters that are tailored or specifically suited to different clients. If we look in more detail into the JSL script (and in a moment I will go into a demo), you will observe that the JSL script has three main sections. One section prepares the local environment: on one side we would say we want to clear all the symbols and close tables, but probably the most important feature is when we define "Names Default To Here", because that allows us to run parallel scripts without having any interference between variables that are named the same in different scripts. Then we have a section, collapsed in this case so that we can show it, that creates a graphical user interface. The user then does not interact with the script itself but actually works through a simple graphical user interface, with buttons that have descriptive names. And then we have a set of tasks that are already coded in the script, in this case in the form of expressions, which has two main advantages: one, it's easy to later implement in the graphical user interface; and second, when you have an expression, you can use this expression several times in your code. Okay, so moving on to the demo. I mentioned earlier that we have different sources of data. On one side we have data that is in fact stored in our database, and this database will contain different sources of information, like the unit or different analyzers. In this case you see an example Excel table, for illustration only; this data is actually taken from the data warehouse directly with our JSL script, so we don't look at this Excel table as such; we let the software collect the information from the data warehouse. Probably what is most important is that this data, as you see here, can come again from different analyzers, and we structure it so that the first column contains the variable names. In this case we have masked some of the names for reasons of confidentiality, but also you will see that all the observations are arranged in rows, so every single row is an observation. And depending on the type of test and the unit we are using, we can collect up to half a million data points in one single day. That depends of course on the analyzer, but you immediately are faced with the amount of data that you have to handle, and with how a JSL script that helps you process information can help you with this activity. Then we also use another Excel file, and this one is also very important: an input table file. And these files, specifically with the JSL script, are the ones creating the synergy that allows us to process data easily.
What you see in this case, for example, is a reactor loading table, and we see different reactors with different catalysts. This information is not quantitative, but qualitative, and the value is important. Then if we move to a second tab (and these tabs are all predefined across our projects), we see the response factors for the analyzers. Different analyzers will have different response factors, and it's important to log this information and use it in the calculations to be able to get quantitative results. In this case, we observe that the response factors are targeted by condition instead. Then we have a formula tab, and this is probably a key tab for our script. You can input formulas in this Excel file; you make sure that the variable names are enclosed in square brackets, and you can use any formula in Excel. Anyone can use Excel; we're very much used to it. So if you type a formula here that follows the ??? syntax in Excel, it will be executed by our JSL script. Then we also included an additional feature: we thought it was interesting to have conditionals, and for the JSL script to read a conditional, the only requirement is that the conditionals are enclosed in braces. There are two other tabs I would like to show you, which are highly relevant. One is an export tables tab, and the reason that we have this table is because we generate many columns or many variables from my unit, probably 500 variables, but actually the client is only interested in 10, 20 or 50 of them; those are the ones that really add value to their research. So we can input those variables here and send them to the client. And last but not least, I think many of us have been in the situation where we send an email to a wrong address, and that can be something frightening when you're talking about confidential information. So we always double, triple check the email addresses, but is it really necessary? What we are doing here is that we have one Excel file that contains all manual inputs, including the email addresses of our clients. And these email addresses are fixed, so there is no room for error: whenever you run the JSL script, the right email addresses will be read and the email will be created, and this we will see in one minute. So now, going into the JSL script, I would like to highlight the following. The JSL script is initially located in one single file in one single folder, and the JSL script only needs the one Excel file that contains the different tabs that we just saw in the previous slide. Once you open the JSL script, you can click on the run script button, and that will open the graphical user interface that you see on the right. Here we have different options. In this case we want to highlight the option where we retrieve data from a project in a given period (we have selected here only one day this year, in particular), and then we see different buttons that allow us to create updates, for example. Once we have clicked on this button, you will see on the left, in the folder, that two directories were created. The fact that we create these directories automatically helps us to standardize how a folder is structured across our projects. If you look into the raw database data, you will see that two files were created. One contains the raw data that comes directly from the data warehouse.
And the second, the data table, contains all merged information from the Excel file and the different tables that are available in the data warehouse. The exported files folder does not contain anything at this moment, because we have not yet evaluated and assessed whether the data that we created in our unit is actually relevant and valuable for the client. We do this, we are, we ??? and you see here that we have created a plot of reactor temperature versus the local time. Different reactors are plotted, so we have up to 64 in one of our units, and in this case we color the reactors depending on their location in the unit. Another tab we have here, as an example, is about the pressure, and you see that you can also script maximum, target and minimum values and define, for example, alerts to see if a value is drifting away. The last table I want to show is a conversion, and we see here different conversions collapsed by catalyst. Once we click the export button, we will see that our file is attached to an email, and the email already contains the email addresses we want to use. And again, I want to highlight how important it is to send the information to the right person. This data set is now actually located in the exported files folder, which was not there before, so we can always keep track of what information has been exported and sent to the client. With this email, it's then only a matter of filling in the information. In this case it's a very simple test, "so this is your data set," but of course we would give some interpretation or maybe some advice to the client on how to continue the tests. And of course, once you have covered all these steps, you will close the graphical user interface, and that will also close all open tables and the JSL script. Something that I would like to highlight at this point is that this workflow using a JSL script is rather fast. What you saw at this moment is of course a bit accelerated, because it's only a demonstration, but you don't spend time looking for data in different sources and trying to merge them with the right columns. All these processes are integrated into a single script, and that allows us to report to the client on a daily basis amounts of data that otherwise would not be possible, and the client can actually take data-driven decisions at a very fast pace. That's probably the key message that I want to deliver with this script. Now, I would like to wrap up the presentation with some concluding remarks. On one side, we developed a distinctive approach for data handling and processing; when we say distinctive, it's because we have created a synergy between an Excel file, which most people can use because we are very familiar with Microsoft Office, and a JSL script, which doesn't need any effort to run: you click run, you get a graphical user interface and a few buttons to execute tasks. Then we have a standardized workflow, and that's also highly relevant when you work with multiple clients, and also from a practical point of view. For example, if one of my colleagues were to go on holiday, it would be easy for another project leader, or for myself, to take over the project and know that all the folder structures are the same, that all the scripts are the same and that the buttons execute the same actions.
Finally, we can guarantee seamless integration of data, and these fast updates of information, with thousands or even half a million data points per day, can be quickly sent to clients, which allows them to take almost online data-driven decisions. In the end, our purpose is to maximize customer satisfaction through a consistent, reliable and robust process. With this, I would like to thank again the organizers of this Discovery Summit and of course all our colleagues at Avantium who have made this possible, especially those who have worked intensively on the development of these scripts. If you are curious about our company or the work we do in catalysis, please visit one of the links you see here. And with this, I'd like to conclude. Thank you very much for your attention, and we look forward to your questions.
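To make the square-bracket formula convention described in the demo concrete, here is a small reconstruction in JSL of how such a string can be turned into a JMP column formula. It is my own sketch under assumed names (the KPI, the column names and the current data table), not Avantium's script.

```jsl
// Turn an Excel-style formula with [Column Name] references into a JMP column formula.
Names Default To Here( 1 );
dt = Current Data Table();
excelFormula = "[Converted Feed] / [Total Feed] * 100";   // hypothetical KPI definition from the input file
// Replace every [Name] with a JSL column reference :Name("Name"), then parse and run it.
jslText = Regex( excelFormula, "\[([^\]]+)\]", ":Name(\!"\1\!")", GLOBALREPLACE );
newColScript = Eval Insert( "dt << New Column( \!"KPI\!", Numeric, Continuous, Formula( ^jslText^ ) )" );
Eval( Parse( newColScript ) );
```

In a full script this would loop over all rows of the formula tab, and conditionals in braces would be translated into If() expressions the same way.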
Kelci Miclaus, Senior Manager Advanced Analytics R&D, JMP Life Sciences, SAS

Reporting, tracking and analyzing adverse events occurring to patients is critical in the safety assessment of a clinical trial. More and more, pharmaceutical companies and the regulatory agencies to whom they submit new drug applications are using JMP Clinical to help in this assessment. Typical biometric analysis programming teams may create pages and pages of static tables, listings and figures for medical monitors and reviewers. This leads to inefficiencies when the doctors that understand the medical impact of the occurrence of certain events cannot directly interact with adverse event summaries. Yet even simple count and frequency distributions of adverse events are not always so simple to create. In this presentation we focus on key reports in JMP Clinical that compute adverse event counts, frequencies, incidence, incidence rates and time to event occurrence. The out-of-the-box reports in JMP Clinical allow fully dynamic adverse event analysis to look easy, even while performing complex computations that rely heavily on JMP formulas, data filters, custom-scripted column switchers and virtually joined tables.

Auto-generated transcript:

Kelci J. Miclaus: Hello and welcome to JMP Discovery Online. Today I'll be talking about adverse event summaries in clinical trial analysis. I am the senior manager of the advanced analytics group for the JMP Life Sciences division here at SAS, and we work heavily with customers using genomic and clinical data in their research. Before I go through the details around using JMP for adverse event analyses, I want to introduce the JMP Clinical software which our team creates. JMP Clinical is one of a family that now includes five official products as well as add-ins, which can extend JMP to really allow you to have as many types of vertical applications or extensions of JMP as you want. My development team supports JMP Genomics and JMP Clinical, which are vertical applications, customized and built on top of JMP, that are used for genomic research and clinical trial research. Today I'll be talking about how we've created reviews and analyses in JMP Clinical for pharmaceutical companies that are doing clinical trial safety and early efficacy analysis. The original purpose of JMP Clinical, and the instigation of this product, actually came through assistance to the FDA, which is a heavy JMP user, and their CDER group, the Center for Drug Evaluation and Research. Their medical reviewers were commonly using JMP to help review drug submissions, and they love it; they're very accomplished with it. One of the things they found, though, is that certain repetitive actions, especially on very standard clinical data, could be pretty painful. An example here is the idea of something called a shift plot for laboratory measurements, where you compare the on-trial average of a laboratory test versus the baseline, against treatment groups. In order to create this, it took at least eight to 10 steps within the JMP interface: opening up the data, normalizing the data, subsetting it out into baseline versus trial, doing statistics respectively for those groups, merging it back in, then splitting that data by lab test so you could make this type of plot for each lab. And that's not even getting to the number of steps within Graph Builder to build it.
So JMP clearly can do it, but what we wanted to do is solve their pain with this very standard type of clinical data with one-click lab shift plots, for example. In fact, we wanted to create clinical reviews in our infrastructure, which we call the review builder, that are one-click, standardized, reproducible reviews for many of the highly common standard analyses and visualizations that are required or expected in clinical trial research to evaluate drug safety and efficacy. So JMP Clinical has evolved, since that first instigation of creating a custom application for a shift plot, into full-service clinical trial analysis software that covers medical monitoring and clinical data science, medical writing teams, biometrics and biostatistics, as well as data management around the study data involved with clinical trial collection. This goes for both safety and efficacy, but also operational integrity, or operational anomalies that might be found in the collection of clinical data as well. Some of the key features of JMP Clinical that we find especially useful for those that are using the JMP interface for any type of analysis are things like virtual joins. We have the idea of a global review subject filter, which I'll show you during the demonstrations for adverse events, that really allows you to integrate and link the demography information, the demographics about our subjects on a clinical trial, to all of the clinical domain data that's collected. And this architecture, which is enabled by virtual joins within the JMP interface with row state synchronization, allows you to have instantaneous, interactive reviews with very little to no data manipulation across all the types of analyses you might be doing in a clinical trial data analysis. Another new feature we've added to the software, which also leverages some of the power of the JMP data filter as well as the creation of JMP indicator columns, is the ability, while you're interactively reviewing clinical trial data, to find interesting signals (in this example, the screenshot shows subjects that had a serious adverse event while on the clinical trial) and quite immediately create an indicator flag that is stored in metadata with your study in JMP Clinical and is available for all other types of analyses you might do. So you can say, I want to look now at my laboratory results for patients that had a serious adverse event versus those that didn't, to see if there are also anomalies that might be related to an adverse event severity occurrence. Another feature that I'll also be showing with JMP Clinical in the demonstration around adverse event analysis is the JMP Clinical API that we've built into the system. One of the most difficult things about providing, creating and developing a vertical application that has out-of-the-box one-click reports is that you get 90% of the way there and then the customer might say, oh, well, I really wanted to tweak it, or I really wanted to look at it this way, or I need to change the way the data view shows up. So one of the things we've been working hard on in our development team is using the JMP Scripting Language (JSL) to surface an API into the clinical review, to have control over the objects and the displays and the dashboards and the analyses, and even the data sets that go into our clinical reviews. So I'll also be showing some of that in the adverse event analysis.
So let's back up a little bit and go into the meat of adverse events in clinical trials, now that we have an overview of JMP Clinical. There are really two key ways of thinking about this. There's the safety review aspect of a clinical trial, which is typically counts and percentages of the adverse events that might occur. A lot of the medical doctors, monitors or reviewers often use this data to understand medical anomalies; you know, a certain adverse event starts showing up more commonly with one of the treatments, and that could have medical implications. There's also statistical signal detection, the idea of statistically assessing whether adverse events are occurring at an unusual rate in one of the treatment groups versus the other. So here, for example, is a traditional static table that you see in many of the types of research, submissions or communications around a clinical trial adverse event analysis. Basically it's a static table with counts and percents, and if it is more statistically oriented, you'll see things like confidence intervals and p-values as well, around things like odds ratios, relative risks or rate differences. Another way of viewing this can also be visual instead of tabular, so signal detection, looking at say the odds ratio or the risk difference, might use Graph Builder in this case to show the results of a statistical analysis of the incidence of certain adverse events and how they differ between treatment groups, for example. So those are two examples. And in fact, from the work we've done and the customers we've worked with around how they view and have to analyze adverse events, the JMP Clinical system now offers several common adverse event analyses, from simple counts and percentages, to incidence rates or occurrences, to statistical metrics such as risk difference, relative risk and odds ratio, including some exposure-adjusted time-to-event analyses. We can also get a lot more complex with the types of models we fit and really go into mixed or Bayesian models as well in finding certain signals in our adverse event differences. And we also use this data heavily in reviewing just the medical data in either a medical writing narrative or a patient profile. So now I'm going to jump right into JMP Clinical with a review that I've built around many of these common analyses. One of the things you'll notice about JMP Clinical is that it doesn't exactly look like JMP, but it is. It's a combined, integrated solution that has a lot of custom JSL scripting to build our own types of interfaces. Our starter window here lays out studies, reviews and settings, for example. And I already have a review built here that is using our example nicardipine data. This is data that's shipped with the product; it's also available in the JMP sample library. It's a real clinical trial, looking at subarachnoid hemorrhage, with about 900 patients. What this first tab of our review is looking at is just the distribution of demographic features of those patients: how many were males versus females, their race breakdowns, what treatment group they were given, the sites the data was taken from, etc. So this is very common, just as the first step of understanding your clinical data for a clinical trial. You'll notice here we have a report navigator that shows the rest of the types of analyses that are available to us in this built review.
I'm going to walk through each of these tabs, just quickly, to show you all the different flavors of ways we can look at adverse events with the clinical trial data set. Now, the typical way data is collected with clinical trials is an international standard called CDISC format, which typically means that we have a very stacked data set format. Here we can see it, where we have multiple records for each subject indicating the different adverse events that might have occurred over time. This data is going to be paired with the demography data, which is one row per subject, as seen here in this demography table. So we have about 900 patients, and you'll see in this first report we have about 5,000 or 5,500 records of different adverse events that occurred. So this is probably the most commonly used report by many of the medical monitors and medical reviewers that are assessing adverse event signals. What we have here is basically a dashboard that combines a Graph Builder counts plot with an accompanying table, as they are used to seeing these kinds of tables. Now the real value of JMP is its interactivity and that dynamic link directly to your data, so that you can select anywhere in the data and see it in both places. Or more powerfully, you can control your views with column switchers. Now here we can actually switch from looking at the distribution of treatments to sex versus race. You'll notice with race, if we remember, we had quite a few that were white in this study, so this isn't a great plot when we look at it by counts, so we might normalize and show percents instead. And we can also just decide to look at the overall holistic counts of adverse events as well. Another part of using this column switcher is the ability to, you know, categorize what kind of events those were. Was it a serious adverse event? What was the severity of it? What was the outcome: did they recover from it or not? What was causing it? Was it related to study drug? All of these are questions that medical reviewers will often ask to find interesting or anomalous signals with adverse events and their occurrences. Now one of the things you might have already noticed in this dashboard is that I have a column switcher here that's actually controlling both my graph and my table. So when I switched to severity, this table switches as well. This was done with a lot of custom JSL scripting specifically for our purposes, but I'll tell you a secret: in JMP 16, the developer for column switcher is going to allow this type of flexibility, so you can tie multiple platform objects into the same column switcher to drive a complex analysis. I'm going to come back to this occurrence plot, even though it looks simple. Here's another instance of it that's actually looking at overall occurrence, where certain adverse events might have occurred multiple times to the same subject. I'm going to come back to these but kind of quickly go through the rest of the analyses in these reviews before coming back to some of the complexities of the simple Graph Builder and Tabulate distribution reports. The next section in our review here is an adverse event incidence screen. So here we're making that progression from just looking at counts and frequencies, or possibly incidence rates, into a more statistical framework of testing for the difference in incidence of certain adverse events in one treatment group versus another. And here we are representing that with a volcano plot.
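Before moving on to the volcano plot, the core data manipulation behind the counts dashboard just described (counting distinct subjects reporting each event and normalizing by the arm sizes held in the separate demography table) can be sketched in a few lines of pandas. The column names and rows below are hypothetical stand-ins:

```python
import pandas as pd

# Stacked AE records (many rows per subject) and demography (one row per subject)
ae = pd.DataFrame({"USUBJID": ["001", "001", "002", "003", "003", "004"],
                   "AEDECOD": ["HEADACHE", "NAUSEA", "HEADACHE",
                               "PHLEBITIS", "HEADACHE", "NAUSEA"]})
dm = pd.DataFrame({"USUBJID": ["001", "002", "003", "004", "005"],
                   "ARM":     ["Drug", "Placebo", "Drug", "Placebo", "Drug"]})

# Count distinct subjects reporting each event, split by treatment arm
merged = ae.merge(dm, on="USUBJID", how="left")
counts = (merged.groupby(["AEDECOD", "ARM"])["USUBJID"]
                .nunique()
                .unstack(fill_value=0))

# Percent = subjects with the event / subjects in the arm; the denominator comes
# from the demography table, which is what the review's formulas track dynamically
arm_sizes = dm["ARM"].value_counts()
percents = counts.div(arm_sizes, axis=1).mul(100).round(1)
print(counts, "\n", percents)
```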
So we can see actually that phlebitis, hypotension and isothenuria occur much more often in our treatment group, those that were treated with nicardipine, versus those on placebo. So we can actually select those and drill into a very common view for adverse events, which is our relative risk for a cell plot as well, which is lots of lot of times still easier to read when you're only looking at those interesting signals that have possibly clinical or statistical significant differences. Sometimes clinical trials take a long time. Sometimes they're on them for a few weeks, like this study was only a few weeks, but sometimes they're on them for years. So sometimes it's interesting to think of adverse event incidents differences as the trial progresses. We have this capability as well within the incidence screen report where you can actually chunk up the study day, study days into sections to see how the incidents of adverse events change over time. And a good way to demonstrate that might be with an exploding volcano plot here that shows how those signals change across the progression of the study. So another powerful idea with this, especially as you have longer clinical trials or more complex clinical trials, is instead of looking at just direct incidence among subjects you can consider their time to event or their exposure adjusted rate at which those adverse events are occurring. And that's what we offer within our time to event analyses, which once again, shown in a volcano plot looking here using a Kaplan Meier test at differences in the time to event of certain events that occur on a clinical trial. One of the nice things here is that you can select these events and drill down into the JMP survival platform to get the full details for each of the adverse events that had perhaps different time to event outcomes between the treatment groups. Another flavor of time to event is often called an incidence density ratio, which is the idea of exposure adjusted incidence density. Basically the difference here is instead of using some of the more traditional proportional hazards or Kaplan Meier analyses, this is more like a a poisson style distribution that's adjusted for how long they've actually been exposed to a drug. And once again here we can look at those top signals and drill down to the analogous report within JMP using a generalized linear model for that specific type of model with an adverse event signal detection. And we actually even offer some really complex Bayesian analyses. So one of the things with with this type of data is typically adverse events exist within certain body systems or classes...organ classes. And so there is a lot of posts...or prior knowledge that we can impose into these models. And so some of our customers, their biometrics teams decide to use pretty sophisticated models when looking at their adverse events. So, so far we've walked from what I would say consider pretty simplistic distribution views of the data into distributions and just count plots of adverse events into very complex statistical analyses. I'm going to come back now, back to what is that considered simple count and frequency information and I want to spend some time here showing the power of JMP interactivity that we have. As you recall one of the differences here is that this table is a stacked table that has all of the occurrences of our adverse events for each subject, and our demography table, which we know we have 900 subjects, is separate. 
So what we wanted was not a static graph, like we have here, or what we would have in a typical report in a PDF form, but we wanted to be able to interactively explore our data and look at subgroups of our data and see how those percentages would change. Now, the difficulty is that the percent calculation needs to come from the subject count in a different table. So we've actually done this by formula...like creating column formulas to dynamically control recalculation of percents upon selection, either within categorizing events or, more powerfully, using our review subject filter tool. So here, for example, we're looking at all subjects by treatment, perhaps serious versus not serious adverse events, but we can use this global data filter, which affects each of the subject-level reports in our review, and instantaneously change our demography groups and change our percentages to be interactive to this type of subgroup exploration. So here, now, we can actually subgroup down to white females and see what their adverse event percentages and counts are, or perhaps you want to go more granular and understand, for each site, how the data is changing across different sites. So what we really have here, instead of a submission package or a clinical analysis where the biometrics team hands 70 different plots and tables to the medical reviewer to go through and sift through, is the power to create hundreds of different tables and different subsets and different graphics, all in one interface. In fact, you can really filter down into those interesting categories. So if they were looking, say, at serious adverse events and they wanted to know which serious adverse events were related to drug treatment, very quickly now we got down to a very small subset, from our 900 patients to about nine patients that experienced serious adverse events that were considered related to the treatment. So as a medical reviewer, this is a place where I then might want to understand all of the clinical details about these patients. And very quickly, I can use one of our action buttons from the report to drill down to what's called a kind of complete patient profile. So here we see all of the information now, instead of at a summary level, at an individual subject level: everything that occurred to this patient over time, including when they had serious adverse events occur and the laboratory or vital measurements that were taken alongside of that. One of the other main uses of our JMP Clinical system, along with this medical review and medical monitoring, is medical writing teams. So another way of looking at this, instead of visually in a graphic or even in a table (these are patient profile tables), is that you can actually go up here and generate an automated narrative. So here we're going to actually launch our adverse event narrative generation. Again, one of the benefits and values of our JMP Clinical being a vertical application relying on standard data is that we get to know all the data and the way it is formatted up front, just by being pointed to the study. So what we can do here is actually run this narrative that is going to write us the actual story of each of those adverse events that occurred. And this is going to open up a Word doc that has all of the details for this subject, their demography, their medical history, and then each of the adverse events and the outcomes or other issues around those adverse events.
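As a rough illustration of what narrative generation involves (turning structured adverse event records into sentences), here is a tiny Python sketch. The field names and wording are hypothetical; the actual JMP Clinical feature assembles far richer narratives, including labs and medical history, and writes them to a Word document.

```python
# Minimal sketch of turning structured adverse-event records into narrative text.
subject = {"id": "001-1024", "age": 67, "sex": "F", "arm": "Nicardipine"}
events = [
    {"term": "Hypotension", "start_day": 3, "severity": "Severe",
     "serious": True, "outcome": "Recovered", "related": "Possibly related"},
]

lines = [f"Subject {subject['id']} was a {subject['age']}-year-old "
         f"{'female' if subject['sex'] == 'F' else 'male'} randomized to {subject['arm']}."]
for ev in events:
    lines.append(
        f"On study day {ev['start_day']}, the subject experienced {ev['severity'].lower()} "
        f"{ev['term'].lower()}, reported as {'serious' if ev['serious'] else 'non-serious'} and "
        f"{ev['related'].lower()} to study drug; the outcome was {ev['outcome'].lower()}.")
print(" ".join(lines))
```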
And we can do this for one patient at a time or we can actually even do this for all 900 patients at a time and include more complex details like laboratory measurements, vitals, either a baseline or before. And so, medical reviewers find this incredibly valuable be able to standardly take data sources and not make errors in a data transfer from a numeric table to an actual narrative. So I think just with that you can really see some of the power of these distribution views, these count plots that allow you to drill into very granular levels of the data. This ability to use subject filters to look either within the entire population of your patients on a clinical trial or within relevant subgroups that you may have found. Now one thing about the way our global filter works through our virtual joins is this is only information that's typically showing the information about the demography. One of the other custom tools that we've scripted into this system is that ability to say, select all subjects with a serious adverse event. And we can either derive a population flag and then use that in further analyses or we can even throw that subject's filter set to our global filter and now we're only looking at serious...at a subject who had a serious adverse event, which was about...almost 300 patients on the clinical trial had a serious adverse event. Now, even this report, you'll see is actually filtered. So the second report is a different type of aspect of a distribution of adverse events that was new in our latest version which is incidence rates. And here, the idea is instead of normalizing or dividing to calculate a percent by the number of subjects who had an event. If you are going with ongoing trials or long trials or study trials across different countries that have different timing startup times, you might want to actually look at the rate at which adverse events occur. And so that's what this is calculating. So in this case, we're actually subset down to any subjects that had a serious adverse event. And we can see the rate of occurrence in patient years. So for example, this very first one, see, has about a rate of 86 occurrences in every 10 patient years on placebo versus 71 occurrences In nicardipine. So this was actually one which this was to treat subarachnoid hemorrhage, intracranial pressure increasing likely would happen if you're not being treated with an active drug. These percents are also completely dynamic, these these incidence rates. So once again, these are all being done by JMP formulas that feed into the table automatically that respect different populations as they're selected by this global filter. So we can look just within say the USA and see the rates and how they change, including the normalized patient years based on the patients that are from just the USA, for example. So even though these reports look pretty simple, the complexity of JSL coding that goes beyond building this into a dashboard is basically what our team does all day. We try to do this so that you have a dashboard that helps you explore the data as you know, easily without all of these manipulations that could get very complex. Now the last thing I wanted to show is the idea of this custom report or customized report. So this is a great place to show it too, because we're looking here at adverse events incidence rates. And so we're looking by each event. And we have the count, or you can also change that to that incidence rate of how often it occurs by patient year. 
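The patient-year figures quoted above come down to a simple calculation: the number of occurrences divided by the total exposure time contributed by the subjects in the currently filtered population. A small sketch with made-up numbers follows; the large-sample Poisson interval shown is one common choice and may differ from what JMP Clinical reports.

```python
import numpy as np
from scipy import stats

# Hypothetical exposure-adjusted incidence rate for one event term in one arm
n_events = 52          # occurrences of the event in the arm
exposure_years = 6.1   # total follow-up summed over the arm's subjects (patient-years)

rate = n_events / exposure_years        # events per patient-year
rate_per_10py = 10 * rate               # scaled the way the report quotes it

# Simple large-sample CI, treating the count as Poisson with fixed exposure
z = stats.norm.ppf(0.975)
se_log_rate = 1 / np.sqrt(n_events)
ci = np.exp(np.log(rate) + np.array([-1, 1]) * z * se_log_rate) * 10
print(f"{rate_per_10py:.1f} events per 10 patient-years (95% CI {ci.round(1)})")
```

Because the denominator is the summed exposure of whichever subjects pass the global filter, the rate recalculates automatically as the population changes, which is exactly what the column formulas in the report are doing.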
And then an alternative view might be really wanting to see these occurrences of adverse events across time. And so I want to show that really quick with our clinical API. So the data table here is fully available to you. One of the things I need to do first off is just create a numeric date variable, which we have a little widget for doing that in the data table, and I'm going to turn that into a numeric date. Now you'll notice now this has a new column at the end of the numeric start date time of the adverse event. You'll also notice here is where all that power comes from the formulas. These are all actually formulas that are dynamically regenerated based on populations for creating these views. So now that we have a numeric date with this data, now we might want to augment this analysis to include a new type of plot. And I have a script to do that. One of the things I'm going to do right off the bat is just create a couple extra columns in our data set for month and year. And then this next bit of JSL is our clinical API calls. And I'm not going to go into the details of this except for that it's a way of hooking ourselves into the clinical review and gaining access to the sections. So when I run this code, it's actually going to insert a new section into my clinical review. And here now, I have a new view of looking at the adverse events as they occurred across year by month for all of the subjects in my clinical trial. So one of the powers, again, even with this custom view is that this table by being still virtually joined to our main group can still fully respond to that virtual join global subject filter. And so just with a little bit of custom API JSL code, we can take these very standard out-of-the-box reports and customize them with our own types of analyses as well. So I know that was quite a lot of an overview of both JMP Clinical but, as well as the types of clinical adverse event analyses that the system can do and that are common for those working in the drug industry or pharma industry for clinical trials, but I hope you found this section valuable and interesting even if you don't work in the pharma area. One of the best examples of what JMP Clinical is is just an extreme extension and the power of JSL to create an incredibly custom applications. So maybe you aren't working with adverse events, but you see some things here that can inspire you to create custom dashboards or custom add ins for your own types of analyses within JMP. Thank you.  
Wenjun Bao, Chief Scientist, Sr. Manager, JMP Life Sciences, SAS Institute Inc Fang Hong, Dr., National Center for Toxicological Research, FDA Zhichao Liu, Dr., National Center for Toxicological Research, FDA Weida Tong, Dr., National Center for Toxicological Research, FDA Russ Wolfinger, Director of Scientific Discovery and Genomics, JMP Life Sciences, SAS   Monitoring the post-marketing safety of drug and therapeutic biologic products is very important to the protection of public health. To help facilitate the safety monitoring process, the FDA has established several database systems including the FDA Online Label Repository (FOLP). FOLP collects the most recent drug listing information companies have submitted to the FDA. However, navigating through hundreds of drug labels and extracting meaningful information is a challenge; an easy-to-use software solution could help.   The most frequent single cause of safety-related drug withdrawals from the market during the past 50 years has been drug-induced liver injury (DILI). In this presentation we analyze 462 drug labels with DILI indicators using JMP Text Explorer. Terms and phrases from the Warnings and Precautions section of the drug labels are matched to DILI keywords and MedDRA terms. The XGBoost add-in for JMP Pro is utilized to predict DILI indicators through cross validation of XGBoost predictive models by the term matrix. The results demonstrate that a similar approach can be readily used to analyze other drug safety concerns.   Auto-generated transcript...   Speaker Transcript Wenjun Bao It's my pleasure to talk about this: obtaining high-quality information from the FDA drug labeling system, here at JMP Discovery. Today I'm going to cover four parts. First, I'll give some background information about drug post-marketing monitoring and the efforts from the FDA, other regulatory agencies and industry. Then I'm going to use a drug label data set to analyze the text using the Text Explorer in JMP, and also use the XGBoost add-in for JMP to analyze this DILI information, and then give the conclusion. An XGBoost tutorial by Dr. Russ Wolfinger is also presented at this JMP Discovery Summit, so please go to his tutorial if you're interested in XGBoost. So drug development, according to the FDA description of the process, can be divided into five stages. The first two stages, discovery research and preclinical, are mainly animal studies and chemical screening, and the later three stages involve humans. And JMP has three products, JMP Genomics, JMP Clinical, and JMP/JMP Pro, that cover every stage of drug discovery. JMP Genomics is the omics system that can be used for omics and clinical biomarker selection, and JMP Clinical is specific to clinical trials and post-marketing monitoring of drug safety and efficacy. And JMP Pro can be used for drug data cleaning, mining, target identification, formulation development, DOE, QbD, bioassay, etc. So it can be used at every stage of drug development. Now, in drug development there is a most frequent single cause, called DILI, that can actually stop a clinical trial; the drug can be rejected for approval by the FDA or other regulatory agencies, or be recalled once the drug is on the market. So this is the most frequent single cause, called DILI, and the information can be found in FDA guidance and also other scientific publications. So what is DILI?
This is drug-induced liver injury, called DILI. The FDA, more than 10 years ago in 2009, published a guidance for DILI, how to evaluate and follow up. The FDA offered multiple years of DILI training for clinical investigators, and that information can still be found online today. And there have been conferences, also organized by the FDA, just last year. And of course for DILI, for how you define whether a subject or patient could be a DILI case, there is Hy's Law, which is included in the FDA guidance. So here's an example of the DILI evaluation for a clinical trial in JMP Clinical by Hy's Law. Hy's Law is a combination condition for several liver enzymes; when they elevate to a certain level, then you would consider it a possible Hy's Law case, so you potentially have liver damage. So here we use color to identify the possible Hy's Law cases: the red ones are yes, the blue ones are no. And also the round and triangle markers are from different treatment groups. We also use a JMP bubble plot to show the enzyme elevations over time during the clinical trial period. So this is typical. This is 15 days. You have a subject starting out pretty normal, then going to a crazy high level of the liver enzymes, indicating they are potentially a possible DILI case. So, the FDA has two major databases that deal with post-marketing monitoring of drug safety. One is the drug label database, which we will get our data from. Another one is the FDA Adverse Event Reporting System. Then, from the NIH and NCBI, they have very actively built LiverTox, which has lots of information dealing with DILI. And the FDA has another database called the Liver Toxicity Knowledge Base, which was led by Dr. Tong, who is our co-author on this presentation. They have a lot of knowledge about DILI and built this specific database for public information. So, drug labels. Everybody has probably seen these when you get a prescription drug: you get that wordy paper that comes with your drug, with teeny tiny font words that are sometimes too small to read, but it does contain a lot of useful scientific information about the drug. Two portions are related to my presentation today: the sections called warnings and precautions. Basically, all the information about the drug's adverse events and anything that needs to be warned about is in these two sections. And this drug actually has over 2,000 words describing the warnings and precautions. Fortunately, not every drug has that many side effects or adverse events. Some drugs, like this one, Metformin, actually have a small section for the warnings and precautions. So the older version of the drug label has warnings and precautions in separate sections, and the new version has them put together. So this one is the new version with those two sections together. But this one has far fewer side effects. So JMP and JMP Clinical have been used by the FDA to perform safety analysis, and we actually help to analyze every adverse event listed in the drug labels. So this is the data I am going to present today. We are using the warnings and precautions section of the 462 drug labels that were extracted by the FDA researchers, which I got from them. And a DILI indicator was assigned to each drug: 1 is yes and zero is no.
So from this distribution, you can see there are about 164 drugs with potential DILI cases and 298 without. The original format for the drug label data is XML, which can be imported by JMP, multiple files at once. The DILI keywords were a multi-year effort by the FDA to come up with this keyword list. Experts read hundreds of drug labels and then decided what could potentially indicate DILI cases. So they came up with about 44 words or terms to be used as keywords indicating that a drug could be a DILI case. And you may have also heard about MedDRA, which is the Medical Dictionary for Regulatory Activities. It has different levels of standardized terms, and the most popular one is the preferred term, which I'm going to be using today. So in the warnings and precautions section, you can see that if we pull everything together, you have over 12,000 terms. And you can see that "patients" and "may" are dominant, which should not be related to the medical cases and the medical information here. So we can remove those, and you can see that no other words are so dominant in this word cloud, but it still has many medically unrelated words, like "use" and "reported," that we could remove from our analysis list. So in the Text Explorer, we can put them into the stop words, and we would also normally use the different Text Explorer techniques of stemming, tokenizing, regex, recoding and deleting manually to clean up the list. But it had 12,000 terms, so it could be very time consuming. Since we have the list we are interested in, we want to take advantage of already knowing which terms in this system we care about. So what we're going to do, and I'm going to show you in the demo, is use only the DILI keywords plus the preferred terms from MedDRA to generate the interesting terms and phrases to do the prediction. So here is the example using only the DILI keywords. You can see everything over here in the list. You have a count shown at the side for each of the terms, how many times they are repeated in the warnings and precautions section, and you can also see a more colorful, more graphic word cloud to recognize a pattern. And then we add the medical terms, the medically related terms. So it comes down from the 12,000 terms to 1,190 terms, including the DILI keywords and medical preferred terms. So we think this would be a good term list to start with for our analysis. So what we do in the JMP Text Explorer is save the document term matrix. That means if you see a 1, this document contains this term; if it is 0, this document does not contain this word. Then, for the XGBoost, we make k-fold columns: three k-fold columns, each with five folds. So for machine learning we use the XGBoost tree model, which is an add-in for JMP Pro, using the DILI indicator as the target variable and, as predictors, the DILI keywords and also the MedDRA preferred terms that have shown up more than 20 times. Then we use cross-validated XGBoost with 300 iterations.
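A rough open-source analogue of "save the document term matrix restricted to a known term list" is a vectorizer with a fixed vocabulary. The tiny texts and keyword list below are invented for illustration; the real study used 462 labels, 44 DILI keywords, and the MedDRA preferred terms that appeared often enough.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Made-up warnings/precautions snippets and a tiny keyword list
docs = [
    "Hepatotoxicity and jaundice have been reported in patients with hepatic failure.",
    "May cause dizziness and nausea; no hepatic events were observed.",
    "Cases of hepatitis and elevated transaminases have been reported.",
]
keywords = ["hepatotoxicity", "jaundice", "hepatic failure", "hepatitis", "transaminases"]

# binary=True yields the 0/1 document-term matrix; ngram_range covers two-word phrases
vec = CountVectorizer(vocabulary=keywords, binary=True, ngram_range=(1, 2))
dtm = pd.DataFrame(vec.fit_transform(docs).toarray(),
                   columns=vec.get_feature_names_out())
print(dtm)
```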
Now we got statistical performance metrics, we get term importance to DILI, and we get, we can use the prediction profiler for interactions and also we can generate and the save the prediction formula for new drug prediction. So I'm going to the demo. So this is a sample table we got in the in JMP. So you have a three columns. Basically you have the index, which is a drug ID. Then you have the warnign and precaution, it could have contain much more words that it's appeared, so basically have all the information for each drug. Now you have a DILI indicator. So we do the Text Explorer first. We have analysis, you go to the Text Explorer, you can use this input, which is a warning and precaution text and you would you...normally you can do different things over here, you can minimize characters, normally people go to 2 or do other things. Or you could use the stemming or you could use the regex and to do all kind of formula and in our limitation can be limited. For example, you can use a customize regex to get the all the numbers removed. That's if only number, you can remove those, but since we're going to use a list, we'll not touch any of those, we can just go here simply say, okay, So it come up the whole list of this, everything. So now I'm going to say, I only care about oh, for this one, you can do...you can show the word cloud. And we want to say I want to center it and also I want to the color. So you see this one, you see the patient is so dominant, then you can say, okay this definitely...not the... should not be in the in analysis. So I just select and right click add stop word. So you will see those being removed and no longer showed in your list and no longer show in the word cloud. So now I want to show you something I think that would speed up the clean up, because there's so many other words that could be in the system that I don't need. So I actually select and put everything into the stop word. So I removed everything, except I don't know why the "action" cannot be removed. And but it's fine if there's only one. So what I do is I go here. I said manage phrase, I want to import my keywords. Keyword just have a... very simple. The title, one column data just have all the name list. So I import that, I paste that into local. This will be my local library. And I said, Okay. So now I got only the keyword I have. OK, so now this one will be...I want to do the analysis later. And I want to use all of them to be included in my analysis because they are the keywords. So I go here, the red triangle, everything in the Text Explorer, hidden functions, hidden in this red triangle. So I say save matrix. So I want to have one and I want 44 up in my analysis. I say okay. So you will see, everything will get saved to my... to the column, the matrix. So now I want to what I want to add, I want to have the phrase, one more time. I also want to import those preferred terms. into the my database, my local data. Then also, I want to actually, I want to locally to so I say, okay. So now I have the mix, both of the the preferred terms from the MedDRA and also my keywords. So you can see now the phrases have changed. So that I can add them to my list. The same thing to my safe term matrix list and get the, the, all the numbers...all the terms I want to be included. And the one thing I want to point out here is for these terms and they are...we need to change the one model format. This is model type is continuing. I want to change them to nominal. I will tell you why I do that later. 
So now I have, I can go to the XGBoost, which is in the add in. We can make...k fold the columns that make sure I can do the cross validation. I can use just use index and by default is number of k fold column to create is three and the number of folds (k) is within each column is five, we just go with the default. Say, okay, it will generate three columns really quickly. And at the end, you are seeing fold A, B, C, three of them. So we got that, then we have... Another thing I wanted to do is in the... So we can We can create another phrase which has everything...that have have everything in...this phrase have everything, including the keywords and PT, but I want to create one that only have the only have only have the the preferred term, but not have the keyword, so I can add those keywords into the local exception and say, Okay. So those words will be only have preferred terms, but not have the keywords. So this way I can create another list, save another list of the documentation words than this one I want to have. So have 1000, but this term has just 20. So what they will do is they were saved terms either meet... have at least show up more than 20 times or they reach to 1000, which one of them, they will show up in the my list. So now I have table complete, which has the keywords and also have the MedDRA terms which have more than 20, show more than 20 times, now also have ??? column that ready for the analysis for the XGBoost. So now what I can do is go to the XGBoost. I can go for the analysis now. So what I'm going to do show you is I can use this DILI indicator, then the X response is all my terms that I just had for the keyword and the preferred words. Now, I use the three validation then click OK to run. It will take about five minutes to run. So I already got a result I want to show you. So you have... This is what look like. The tuning design. And we'll check this. You have the actual will find a good condition for you to to to do so. You can also, if you have as much as experience like Ross Wolfinger has, he will go in here, manually change some conditions, then you probably get the best result. But for the many people like myself don't have many experienced in XGBoost, I would rather use this tuning design than just have machine to select for me first, then I can go in, we can adjust a little bit, it depend on what I need to do here. So this is a result we got. You can use the...you can see here is different statistic metrics for performance metrics for this models and the default is showed only have accuracy and you can use sorting them by to click the column. You can sorting them and also it has much more other popular performance metrics like MCC, AUC, RMSE, correlation. They all show up if you click them. They will show up here. So whatever you need, whatever measurement you want to do, you can always find here. So now I'm going to use, say I trust the validation accuracy, more than anything else for this case. So I want to do is I want to see just top model, say five models. So what here is I choose five models. Then I go here, say I want to remove all the show models. So you will see the five models over here and then you can see some model, even though the, like this 19 is green, it doesn't the finish to the halfway. So something wrong, something is not appropriate for this model. I definitely don't want to use that one, so others I can choose. Say I want to choose this 19, I want to remove that one. So I can say I want to remove the hidden one. 
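For readers without the JMP Pro add-in, the same general workflow (cross-validate an XGBoost classifier, compare models on several metrics, then look at term importance) can be sketched with the open-source xgboost package. Everything below, including the synthetic stand-in for the document-term matrix, is illustrative rather than a reproduction of the add-in's tuning design.

```python
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate

# Stand-in data: rows = drug labels, columns = 0/1 keyword indicators, y = DILI flag
X, y = make_classification(n_samples=462, n_features=60, n_informative=10, random_state=1)

model = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1,
                          eval_metric="logloss", random_state=1)

# 5-fold cross-validation, tracking a few of the metrics the add-in reports
cv = cross_validate(model, X, y, cv=5,
                    scoring=["accuracy", "roc_auc", "matthews_corrcoef"])
print({k: round(v.mean(), 3) for k, v in cv.items() if k.startswith("test_")})

# Fit on all rows and rank terms by importance (analogous to the importance list
# used to pick out terms such as hepatitis and jaundice)
model.fit(X, y)
top = np.argsort(model.feature_importances_)[::-1][:5]
print("Most important feature indices:", top)
```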
So basically, just whatever you need to do. If you compare these metrics, they're actually not much different. So I want to rely on this graphic to help me choose the best-performing one. Then you choose the good one. You can go here and say, I like model 12, so I can go here and say I want to do the profiler. This is a very powerful tool, I think quite unique to JMP; not many tools have this function. It gives you an opportunity to look at individual parameters in an interactive way and see how they change the result. For example, those two most frequently show up in the DILI cases. And you can see the slope is quite steep, and that means if you change them, they will affect the final result predictions quite a bit. So you can see that when hepatitis and jaundice are both zero, you actually have a very low probability of getting DILI as one, so a low chance of a possible DILI case. But if you change this one to 1, you can see the chance you get is higher, and if you move those, even higher. So you have a way to analyze which are the key parameters or predictors that affect your result. And some of them, even though they are keywords, are pretty flat. That means if you change them, it will not affect the result that much. And also here we give the list you can use to see which are the most important features for the prediction. So you can see over here that jaundice and others are quite important. And for future results, once you get the data in, these are all the results that we have. And you can say, well, how about new data coming in? Yes, here you can say, I want to save the prediction formula, and you can see it's actively working on that. And then at the end of the table, you will see the prediction. So remember, the first and second drugs were pretty much predicted to be DILI cases, and the third, fourth and fifth were close to zero. So we go back to the DILI indicator and we find out the first five were actually right. So in case you don't have this indicator when new data comes in, you don't have to read the whole label. You run the model, you see the prediction, and you pretty much know whether it is a DILI case or not. So my demo ends here, and now I'm going to give the conclusion. We are using the Text Explorer to extract the DILI keywords and MedDRA terms using Stop Words and Phrase Management, without manual selection, deletion and recoding. We use visualization, and we created a document term matrix for prediction. We also use machine learning with XGBoost modeling; we can quickly run XGBoost to find the best model and use the prediction profiler. And we can save the prediction formula to predict new cases. Thank you. I'll stop here.
Hadley Myers, JMP Systems Engineer, SAS Chris Gotwalt, JMP Director of Statistical Research and Development, SAS   The need to determine confidence intervals for linear combinations of random mixed-model variance components, especially critical in Pharmaceutical and Life Science applications, was addressed with the creation of a JMP Add-In, demonstrated at the JMP Discovery Summit Europe 2020 and available at the JMP User Community. The add-in used parametric bootstrapping of the sample variance components to generate a table of simulated values and calculated "bias-corrected" (BC) percentile intervals on those values. BC percentile intervals are better in accounting for asymmetry in simulated distributions than standard percentile intervals, and a simulation study using a sample data set at the time showed closer-to-true α-values with the former. This work reports on the release of Version 2 of the Add-In, which calculates both sets of confidence intervals (standard and BC percentiles), as well as a third set, the "bias-corrected and accelerated" confidence interval, which has the advantage of adjusting for underlying higher-order effects. Users will therefore have the flexibility to decide for themselves the appropriate method for their data. The new version of the Add-In will be demonstrated, and an overview of the advantages/disadvantages of each method will be addressed. (view in My Videos)   Auto-generated transcript...   Speaker Transcript Hello, my name is Chris Gotwalt ... has been developed for variance components models, we we think ... statistical process control program, one has to understand ... ascertain how much measurement error is attributable to testing ... there might be five or 10 units or parts tested per operator, ... different measuring tools is small enough that differences in ... measurement to measurement, repeatability variation, or ... measurement systems analyses, as well as a confidence interval on ... interval estimates in the report and obtain a valid 95% interval ... calculate confidence intervals, because we believed it would be ... and the sum of the variance components. Unfortunately, the ... r&r study. So because variance components explicitly violate ... you were to use the one click bootstrap on variance components ... less. So when we were designing fit mixed, and the REML ... independent. So back to the drawing board. So it turns out ... in JMP. One approach is called the parametric bootstrap that ... comparison of the two kind of families of bootstrap. So the ... they're, they're not assuming any underlying model. And it's ... the rows in the data table are independent from one another. ... values, it has the advantage that we don't have to make this ... bootstrap simulation. The downside to this is that you ... do a quick introduction to what the bootstrap...the parametric ... to identify or wanted to estimate the crossing time of a ... 162.8. Now, we want to use a parametric bootstrap to to go ... has the ability to save the simulation formula back to the ... that uses the estimates in the report as inputs into a random ... And we take our estimates and pull them out into a separate ... And then what we have can be seen as a random sample from the ... formula column for the crossing time. And that is automatically ... those...on the crossing time, or any quantity of interest. When ... simulation, create a formula column of whatever function of ... derive quantity of interest and obtain confidence intervals ... the add in so that you're able to do this quite easily for ... we'll start by showing you how to run the add in yourself once ... first version was presented at the JMP 2020 Discovery Summit ... overview, but we'll show you the references where you can dive in ... perfectly fine as well. So I'm going to go ahead and start with ... makes use of the fit mixed platform, right, created from ... the add in will only work with JMP Pro. So someone might, ... want some measure like reproducibility. So that would ... as we said, to calculate the estimate for these, there's no ... columns here. The reality is much, much, much more ... of the estimate without considering the worst case ... production that the actual variance is higher than they have ... don't risk being out of spec in production. So to run the add in ... From here, I can select the linear combination of confidence ... simulations, you get a better estimate of the confidence ... 2500. I'm going to leave it as 1000 here just for demonstration ... operator or the batch variable, and then press perform analysis. ... purpose of this demonstration, I think I will stop it early. ... calculated confidence limits, the bootstrap quantiles, which are ... these two tabs. But if you'd like to see how those compare, ... so what does enough mean, enough for your confidence limits to ... stopped it before a thousand. So that's how the add in works. And ... distributed around the original estimate, they are in fact ... relaunch this analysis. So you'll see that when the ... European Discovery, required bounded variance confidence ... that, if that happens for some of the bootstrap samples or for ... early, again, I'll just let it run a little bit. Yeah, so I, as ... the samples are allowed, in some cases, to be below zero. So in ... simulation column here, this column of simulated ... see them both at the same time. It's a bit... it's a bit tricky, ... right components, it's a good idea to run the add in directly ... that column is then deleted. So one thing to to mention, before ... accounts for the skewness of the bootstrap distributions, right, ... that. And then the accelerated takes that even further. So here ... thing to mention is that the alpha in this represents the ... value for which it's been calculated? And what can we do to ... up to investigate the four different kinds of the variance ... method, the bias corrected method and the BCa. We also ... study. So for all 16 combinations of these three ... combinations of confidence intervals, and kept track of how ... coverage as we're varying these three variables, and we see here ... techniques. And the second best is the bias corrected and ... the best one. Now, if you turn no bounds on, which means that ... variance components with a pretty close to 95% coverage. ... intervals are performing similarly at about 93%. But ... to what a master's thesis paper's research would have, would ... potentially more work to be done. There's other interval ... things like generalized confidence intervals. General ... intervals might also do the trick for us as well. Hadley's ... so that you can now do parametric bootstrap simulations ... 16. When you bring that up, you can enter the linear combination ...
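To make the interval definitions in the abstract concrete, here is a minimal numpy/scipy sketch of the standard percentile and bias-corrected (BC) percentile intervals computed from a table of simulated values. The replicates, the point estimate, and the gamma distribution used to generate them are invented for illustration; the add-in works from parametric bootstrap replicates of the REML variance component estimates, and the BCa interval additionally requires an acceleration constant that is omitted here.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

# Stand-in for the simulation table: bootstrap replicates of a variance-component
# combination, plus the original point estimate (all values made up)
boot = rng.gamma(shape=3.0, scale=0.5, size=2500)   # skewed, like variance estimates
theta_hat = 1.4
alpha = 0.05

# Standard percentile interval
pct = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Bias-corrected (BC) percentile interval: shift the percentile points using z0,
# the normal quantile of the fraction of replicates below the original estimate
z0 = norm.ppf(np.mean(boot < theta_hat))
lo = norm.cdf(2 * z0 + norm.ppf(alpha / 2))
hi = norm.cdf(2 * z0 + norm.ppf(1 - alpha / 2))
bc = np.percentile(boot, [100 * lo, 100 * hi])

# BCa would further replace (z0 + z) with (z0 + z) / (1 - a*(z0 + z)) using an
# acceleration constant a, typically estimated by a jackknife (omitted here).
print("Percentile CI:", pct.round(3), " BC CI:", bc.round(3))
```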
Kamal Kannan Krishnan, Graduate Student, University of Connecticut Ayush Kumar, Graduate Student, University of Connecticut Namita Singh, Graduate Student, University of Connecticut Jimmy Joseph, Graduate Student, University of Connecticut   Today all service industries, including telecom, face a major challenge with customer churn, as customers switch to alternate providers for various reasons such as competitors offering lower costs, combo services and marketing promotions. With the power of existing data and the history of churned customers, if a company can predict in advance which customers are likely to churn voluntarily, it can proactively take action to retain them by offering discounts, combo offers, etc., as the cost of retaining an existing customer is less than that of acquiring a new one. The company can also internally study any possible operational issues and upgrade its technology and service offerings. Such actions will prevent the loss of revenue and will improve its ranking among industry peers in terms of the number of active customers. Analysis is done on the available data set to identify the important variables needed to predict customer churn, and individual models are built. Different combinations of models are ensembled to average out and eliminate the shortcomings of individual models. The cost of misclassified predictions (false positives and false negatives) is estimated by putting a dollar value on them based on revenue-per-user information and the cost of the discount provided to retain the customer.   Auto-generated transcript...   Speaker Transcript Namita Hello everyone, I'm Namita, and I'm here with my teammates Ayush, Jimmy and Kamal from the University of Connecticut to present our analysis on predicting telecom churn using JMP. The data we have chosen is from the industry that keeps us all connected, that is, the telecom and internet service industry. So let's begin with a brief on the background. The US telecom industry continues to witness intense competition and low customer stickiness due to multiple reasons like lower costs, combo promotional offers, and service quality. So, to align with the main objective of preventing churn, telecom companies often use customer attrition analysis as a key source of business insight. This is due to the fact that the cost of retaining an existing customer is far less than that of acquiring a new one. Moving on to the objective, the main goal here is to predict in advance the potential customers who may attrite, and then, based on analysis of that data, recommend customized product strategies to the business. We have followed the standard SEMMA approach here. Now let's get an overview of the data set. It consists of a total of 7,043 rows of customers belonging to different demographics (single, with dependents, and senior) and subscribing to different product offerings like internet service, phone lines, streaming TV, streaming movies and online security. There are about 20 independent variables; of these, 17 are categorical and three are continuous. The dependent target variable for classification is customer churn. And the churn rate for the baseline model is around 26.5%. The goal now is to preprocess this data and model it for further analysis. That's it from my end; over to you, Ayush. Ayush Kumar Thanks, Namita. I'm Ayush. In this section, I'll be talking about the data exploration and preprocessing. In data exploration, we discovered interesting relationships; for instance, the variables tenure and monthly charges were both positively correlated with total charges.
We analyzed these three variables using a scatterplot matrix in JMP, which validated the relationship. Moreover, by using the Explore Missing Values functionality, we observed that the total charges column had 11 missing values. The missing values were taken care of, as the total charges column was excluded due to multicollinearity. After observing the histograms of the variables using the outlier exploration functionality, we concluded that the data set had no outliers. The variable called Customer ID had 7,043 unique values, which would not add any significance to the target variable, so Customer ID was excluded. We were also able to find interesting patterns among the variables. Variables such as streaming TV and streaming movies convey the same information about streaming behavior. These variables were grouped into a single column, streaming to, by using a formula in JMP. The same course of action was taken for the variables online backup and online security. We ran logistic regression and decision tree in JMP to find the important variables. From the effects summary, it was observed that tenure, contract type, monthly charges, streaming to, multiple line service, and payment method showed significant LogWorth and were very important variables in determining the target. The effects on ??? also helped us narrow down the variable count to 12 statistically significant variables, which formed the basis for further modeling. We used the value of ??? functionality and moved the Yes level of our target variable upwards. Finally, the data was split into training, validation and test sets in a 60/20/20 ratio using the formula random method. Over to you now, Kamal. Kamal Krishnan Sorry, I am Kamal. I will explain more about the different models built in JMP using the data set. In total we built eight different types of models. For each type of model, we tried various input configurations and settings to improve the results, mainly sensitivity, as our target was to reduce the number of false negatives in the classification. JMP is very user friendly for redoing the models by changing the configurations. It was easy to store the results whenever a new iteration of the model was done in JMP and then compare outputs in order to select the optimized model from each type. JMP even allowed us to change the cutoff value from the default 0.5 to other values and observe the prediction results. This slide shows the results of the selected model from each of the eight different types of models. First, as our target variable, churn, is categorical, we built a logistic regression. Then we built decision tree, KNN, and ensemble models like bootstrap forest and boosted tree. Then we built machine learning models like neural networks. JMP allowed us to set the random seed in models like neural networks and KNN; this helped us to get the same outputs we needed. Then we built a naive Bayes model. JMP allowed us to study the impact of various variables through the prediction profiler. We can point and click to change the values in the range and see how it impacts the target variable. By changing values in the prediction profiler for naive Bayes, we observed that an increase in tenure helps in reducing the churn rate. On the contrary, an increase in monthly charges increases the churn rate. Finally, we ensembled different combinations of models to average out and eliminate the shortcomings of individual models. We found that the ensemble of neural network and naive Bayes has higher sensitivity among ???. This ends the model description. Over to you, Jimmy. JJoseph Thank you, Kamal.
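As an aside, the split-and-model workflow described above (a 60/20/20 partition, a naive Bayes classifier, and a classification cutoff lowered below 0.5 to favor sensitivity) can be sketched with scikit-learn. The synthetic data below stands in for the prepared churn table; nothing here reproduces the JMP models themselves.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import recall_score, accuracy_score

# Stand-in for the prepared churn table (12 encoded predictors, ~26.5% churners)
X, y = make_classification(n_samples=7043, n_features=12, weights=[0.735], random_state=0)

# 60/20/20 train / validation / test split
X_tr, X_tmp, y_tr, y_tmp = train_test_split(X, y, train_size=0.60, random_state=0, stratify=y)
X_val, X_te, y_val, y_te = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0,
                                            stratify=y_tmp)

nb = BernoulliNB().fit(X_tr, y_tr)

# Lowering the cutoff below 0.5 trades accuracy for sensitivity (fewer missed churners)
for cutoff in (0.5, 0.3):
    pred = (nb.predict_proba(X_val)[:, 1] >= cutoff).astype(int)
    print(f"cutoff={cutoff}: sensitivity={recall_score(y_val, pred):.2f}, "
          f"accuracy={accuracy_score(y_val, pred):.2f}")
```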
In this section we will be comparing the models and taking a deeper dive into each model's details. The major parameters used to compare the models are the cost of misclassification in dollars, the sensitivity versus accuracy chart, the lift ratio, and the area under the curve values. The cost of misclassification data is depicted in the top right corner of the slide. The costs of false positives and false negatives were determined using average monthly charges. The cost of a false negative, where the model predicted no churn for a customer who was actually leaving, was calculated at $85, and the cost of a false positive at $14, after discounting 20% to accommodate additional benefits. The cost comparison chart clearly indicates that naive Bayes has the lowest cost. Going on to the total accuracy rates chart, accuracy is between 74% and 81%, with not much variation among most of the models. And lift, a measure of the probability of finding a success record compared to the baseline model, varies between 1.99 and 3.11. The AUC of the ROC curve is another measure used to determine the strength of the models. As the chart indicates, all the models did equally well in this category. The sensitivity and accuracy chart measures the models' success in predicting customer churn accurately. The chart indicates two facts: how many churning customers the model can correctly predict, and how often the prediction is accurate. This measure is used as the major parameter to decide the best performing model, and naive Bayes did well in this category. Based on the various metrics and considering the cost of failed predictions, naive Bayes came out as the best and most parsimonious model to predict customer churn for the given data set. It has the lowest misclassification ratio, high sensitivity, and reasonably good total accuracy. If you discount some of its inherent drawbacks, such as the lack of a supporting statistical model, the model is completely data driven and easily explainable. Moving on to the conclusions drawn, the significant variables in the data set are the contract type and the tenure of the enrolled customer. From modeling, we observed that customer churn is high for 1) those without dependents in the demography; 2) those who pay a high price for their phone services, with low customer satisfaction on high-end services; 3) customers who stick to the original single-line service and can easily switch over to competitors. So based on those findings, the recommendations are 1) targeted customer promotions focused on income generation; 2) pushing long-term contracts with additional incentives; 3) building product line combos focused on customer needs. In conclusion, we used JMP to do the analysis and build predictive models on a limited data set. It is very effective and powerful for doing this kind of analysis. Please reach out to us if you have any further questions. Thank you.
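A minimal sketch of the dollar-weighted comparison described at the start of this section: score each candidate model's validation predictions by the cost of its confusion matrix, using the $85 false-negative and $14 false-positive penalties quoted above. The toy labels and predictions are invented.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Dollar penalties from the talk: a missed churner (false negative) costs ~$85 of
# monthly revenue; an unnecessary retention discount (false positive) costs ~$14.
COST_FN, COST_FP = 85, 14

def misclassification_cost(y_true, y_pred):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return fp * COST_FP + fn * COST_FN

# Toy example: two candidate models' predictions on the same validation labels
y_true  = np.array([1, 0, 0, 1, 1, 0, 0, 1, 0, 0])
model_a = np.array([1, 0, 0, 0, 1, 0, 1, 1, 0, 0])   # one FN, one FP
model_b = np.array([1, 0, 1, 1, 1, 0, 1, 1, 0, 0])   # no FN, two FP
print("Model A cost:", misclassification_cost(y_true, model_a))   # 85 + 14 = 99
print("Model B cost:", misclassification_cost(y_true, model_b))   # 2 * 14 = 28
```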
Steve Hampton, Process Control Manager, PCC Structurals Jordan Hiller, JMP Senior Systems Engineer, JMP   Many manufacturing processes produce streams of sensor data that reflect the health of the process. In our business case, thermocouple curves are key process variables in a manufacturing plant. The process produces a series of sensor measurements over time, forming a functional curve for each manufacturing run. These curves have complex shapes, and blunt univariate summary statistics do not capture key shifts in the process. Traditional SPC methods can only use point measures, missing much of the richness and nuance present in the sensor streams. Forcing functional sensor streams into traditional SPC methods leaves valuable data on the table, reducing the business value of collecting this data in the first place. This discrepancy was the motivator for us to explore new techniques for SPC with sensor stream data. In this presentation, we discuss two tools in JMP — the Functional Data Explorer and the Model Driven Multivariate Control Chart — and how together they can be used to apply SPC methods to the complex functional curves that are produced by sensors over time. Using the business case data, we explore different approaches and suggest best practices, areas for future work and software development.     Auto-generated transcript...   Speaker Transcript Jordan Hiller Hi everybody. I'm Jordan Hiller, senior systems engineer at JMP, and I'm presenting with Steve Hampton, process control manager at PCC Structurals. Today we're talking about statistical process control for process variables that have a functional form.   And that's a nice picture right there on the title   slide. We're talking about statistical process control, when it's not a single number, a point measure, but instead, the thing that we're trying to control has the shape of a functional curve.   Steve's going to talk through the business case, why we're interested in that in a few minutes. I'm just going to say a few words about methodology.   We reviewed the literature in this area for the last 20 years or so. There are many, many papers on this topic. However, there doesn't really appear to be a clear consensus about the best way to approach this statistical   process control   when your variables take the form of a curve. So we were inspired by some recent developments in JMP, specifically the model driven multivariate control chart introduced in JMP 15 and the functional data explorer introduced in JMP 14.   Multivariate control charts are not really a new technique they've been around for a long time. They just got a facelift in JMP recently.   And they use either principal components or partial least squares to reduce data, to model and reduce many, many process variables so that you can look at them with a single chart. We're going to focus on the on the PCA case, we're not really going to talk about partial   the   partial least squares here.   Functional Data Explorer is the method we use in JMP in order to work with data in the shape of a curve, functional   data. And it uses a form of principal components analysis, an extension of principal components analysis for functional data.   So it was a very natural kind of idea to say what if we take our functional curves, reduce and model that using the functional data explorer.   
The result of that is functional principal components, and just as you would take regular principal components and push them through a model driven multivariate control chart, what if we could do that with functional principal components? Would that be feasible and would it be useful? So with that, I'll turn things over to Steve, and he will introduce the business case that we're going to discuss today. Steve Hampton All right. Thank you very much, Jordan. Since I do not have video, I decided to let you guys know what I look like. There's me with my wife Megan and my son Ethan at last year's pumpkin patch. So I wanted to step into the case study with a little background on what I do, so you have an idea of where this information is coming from. I work in investment casting for Precision Castparts...Investment Casting Division. Investment casting involves making a wax replicate of what you want to sell, putting it into a pattern assembly, dipping it multiple times in proprietary concrete until you get enough strength to be able to dewax that mold. And we fire it to have enough strength to be able to pour metal into it. Then we knock off our concrete, we take off the excess metal used for the casting process, we do our nondestructive testing, and we ship the part. The drive for looking at improved process control methods is the fact that Steps 7, 8, and 9 take up 75% of the standing costs because of process variability in Steps 1-6. So if we can tighten up 1-6, most of the ??? and cost go there, which is much cheaper, much shorter, and then there is a large value add for the company and for our customers in making 7, 8, and 9 much smaller. So PCC Structurals. My plant, the Titanium Plant, makes mostly aerospace components. On the left there you can see a fan ??? that is glowing green from some ??? developer. And then we have our land-based products; right there is a 155 howitzer stabilizer leg. And just to give an idea of where it goes, because every single airplane up in the sky basically has a part we make, or multiple parts: this is an engine section ???, it's about six feet in diameter, it's a one-piece casting that goes into the very front of the core of a gas turbine engine. This one in particular is for the Trent XWB that powers the Airbus A350 jets. So let's get into JMP. The big driver here is, as you can imagine, with something that is as complex as an investment casting process for a large part, there are tons of data coming our way. And more and more, it's becoming functional as we increase the number of sensors we have and we increase the number of machines that we use. So in this case study, we are looking at data that comes with a timestamp. We have 145 batches. We have our variable of interest, which is X1. We have our counter, which is a way that I've normalized that timestamp, so it's easier to overlay the runs in Graph Builder and it also has a little bit of added niceness in the FDE platform. We have our period, which allows us to have a historic period and a current period that lines up with the model driven multivariate control chart platform, so that we can have our FDE only be looking at the historic data, so it's not changing as we add more current data. So this is kind of how it would look if you were using this in practice, and then the test type is my own validation attempts. And what you'll see here is I've mainly gone in and tagged things as bad, marginal, or good.
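As a sketch of the overlay view described above, the JSL below plots the sensor variable against the normalized counter with one line per batch. The column names (X1, counter, batch) follow the talk; the active data table is a hypothetical stand-in for the actual sensor-stream table.

    // Overlay one curve per manufacturing run: X1 versus the normalized counter
    dt = Current Data Table();  // assumes the stacked sensor-stream table is active
    dt << Graph Builder(
        Variables(
            X( :counter ),     // normalized time stamp, comparable across runs
            Y( :X1 ),          // thermocouple process variable
            Overlay( :batch )  // one colored line per batch
        ),
        Elements( Line( X, Y ) )
    );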
So red is bad, marginal is purple, and green is good, and you can see how they overlay. Off the bat, you can see that we have some curvy ??? curves away from the mean. These are obviously what we will call out of control or bad. This would be what manufacturing calls a disaster because, like, that would be discrepant product. So we want to be able to identify those earlier, so that we can go look at what's going on in the process and fix it. This is what it looks like breaking it out, so you can see that the bad ones have some major deviation, sometimes from the mean curve, and a lot of character towards the end. The marginal ones are not quite as deviant from the mean curves but have more bouncing towards the tail, and the good ones are pretty tight. You can see there's still some bouncing. So this is where the marginal and the good are really based upon my judgment, and I would probably fail an attribute Gage R&R based on just visually looking at this. So we have a total of 33 bad curves, 45 marginal, and 67 good. And manually, you can just see about 10 of them are out. So you would have an option, if you didn't want to use a point estimate, which I'll show a little bit later doesn't work that great, of maybe controlling them by points using the counter. And how you would do that would be to split the bad table by counter, put it into an individual moving range control chart through Control Chart Builder, and then you would get out, like, 3,500 control charts in this case, for which you can use the awesome ability to make combined data tables to turn the limit summaries from each one into its own data table that you can then link back to your main data table, and you get a pretty cool looking analysis that looks like this, where you have control limits based upon the counters and historic data and you can overlay your curves. So if you had an algorithm that would tag whenever a curve went outside the control limits, you know, that would be an option for trying to have control chart functionality with functional data. But you can see, especially 38 here, which I highlighted, that you can have some major deviation and stay within the control limits. So that's where this FDE platform really can shine, in that it can identify an FPC that corresponds with some of these major deviations, and so we can tag the curves based upon those FPCs. And we'll see that a little later on. So, using the FDE platform, it's really straightforward. Here for this demonstration, we're going to focus on a step function with 100 knots. And you can see how the FPCs capture the variability. So the main FPC is saying, you know, at the beginning of the curve, that's what's driving the most variability, this deviation from the mean. And the setup is X1 as our output, counter as our input, batch number as the ID, and then I added test type, so we can use that as some of our validation in the FPC table and the model driven multivariate control chart, and the period so that only our historic data is what's driving the FDE fit. And so just looking at the fit is actually a pretty important part of making sure you get correct control charting later on. I'm using this P-spline step function, 100 knots, model. You can see, actually, if I use a B-spline, cubic with 20 knots, it actually looks pretty close to my P-spline.
But from the BIC you can actually see that I should be going to more knots, so if I do that, now we start to see the overfitting, really focusing on the isolated peaks, and it will cause you to have an FDE model that doesn't look right and cause you to not be as sensitive in your model driven multivariate control chart.
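The excerpt cuts off here, but for readers who want to try the FDE-to-control-chart pipeline the presenters describe, a minimal sketch follows. The platform names are real JMP 15 platforms, but the saved score table and the Process role contents shown are assumptions; in practice, fit the model interactively, save the functional principal component scores, and use Save Script to get the exact syntax.

    // Launch the Functional Data Explorer on the stacked sensor data
    // (roles follow the talk: X1 = output, counter = input, batch = function ID)
    dt = Current Data Table();
    fde = dt << Functional Data Explorer(
        Y( :X1 ),
        X( :counter ),
        ID( :batch )
    );
    // Fit the step-function / P-spline model interactively and save the FPC scores
    // to a table, then chart those scores. The table and column names below are
    // hypothetical; adjust them to whatever the save step actually produces.
    fpcTable = Data Table( "FPC Scores" );
    fpcTable << Model Driven Multivariate Control Chart(
        Process( :FPC 1, :FPC 2, :FPC 3 )
    );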
Steve Figard, Director of Cancer Research Lab, Bob Jones University Evan Becker, student, Bob Jones University Luke Brown, student, Bob Jones University Emily Swager, student, Bob Jones University Rachel Westphal, student, Bob Jones University   Colorectal cancer is both the third most common and the third leading cause of deaths associated with cancer.  Previous studies in this lab have demonstrated the in vitro cytotoxic effect of almonds on human GI cancers.  An almond extract was prepared and processed using a pseudo-digestion procedure in order to mimic the effects the extract would have in a physiological system.  This extract demonstrated a dose response in vitro cytotoxicity to human gastric adenocarcinoma and was cytotoxic to a human colorectal adenocarcinoma, but had no effect on a healthy human colon epithelial cell line.  The extract was processed through filters with molecular weight cutoffs of 100,000 Da and 5,000 Da to estimate the size of any anticancer molecules, and it was found that the responsible molecules were less than 5,000 Da in molecular weight.  In addition, a polyphenol extract of the almonds was prepared and shown to have similar effects as the whole almond extract. It was concluded that the anticancer agents are likely polyphenols. Finally, four flavonoids commonly found in almonds were compared to the polyphenol extract, and they showed similar cytotoxicity.   Auto-generated transcript...   Speaker Transcript Evan Becker Hello, this is the 2019 BJU cancer research team. My name is Evan Becker, and along with me today are Luke Brown, Emily Swager, and Rachel Westphal, and we are all under the direction of Dr. Stephen Figard in the Department of Biology at Bob Jones University. And our paper, our presentation, is on the molecular weight characterization of an almond component cytotoxic to gastrointestinal cancer cell lines. All right, for the introduction. Colorectal cancer is currently of great concern in the medical community, as it is the third leading cause of cancer deaths for both men and women nationwide. Previous research from the BJU cancer lab has shown promise by demonstrating that almonds have a cytotoxic effect on LoVo colorectal cancer in vitro. A pseudo-digestion procedure for an almond extract was also used as a way to mimic how the extract would work in a physiological system. The same almond extract was shown to induce a dose response in the human gastric adenocarcinoma cancer cell line AGS. Also, this extract causes no negative effects in the human colon epithelial cell line CCD, indicating that the cytotoxic effects of almonds do not affect normal healthy cells. Passing our almond extract through molecular weight cutoff filters of 100,000 Daltons and 5,000 Daltons respectively, we were able to determine that the molecules present in the almonds inducing the cytotoxic effect must be smaller than 5,000 Daltons. As a result, polyphenols were determined to be a possible cause of the cytotoxic effect, and a polyphenol extraction was conducted on the almonds, with this treatment showing very similar cytotoxic effects on the cancer cell lines. I think we're ready. Rachel Westphal So the cell lines that we used included AGS, which is stomach cancer; LoVo, which is colon cancer; and CCD, which was our normal cell line. We used 5-fluorouracil as our positive control for cell death. And we used an in vitro pseudo-digestion of the almonds to mimic physiological digestion.
The WST cell proliferation assay: we utilized that to determine absorbance with a plate reader and used that to calculate percent viability. And then JMP was used for statistical analysis; we used ANOVA, the Tukey-Kramer HSD test, and the Wilcoxon nonparametric comparison. P values less than 0.05 were considered statistically significant. We can go to the next slide. Yeah. Luke Brown All right. The first set of tests I'd like to introduce you to are tests regarding the AGS human cancer cell line, and this is a human gastric adenocarcinoma. Now JMP was an important tool for us because it helped us to, first of all, determine the standard deviation in a few pilot studies we conducted. This then allowed us to use the software to run a power analysis to determine our sample size. Now looking at some of the data we got here, first of all, you'll see that what Figure 1 and Figure 2 have in common is PBS, phosphate buffered saline, and 5-FU, or 5-fluorouracil, which is a well-documented and established treatment for cancers such as gastric and colorectal cancers. Now I'd like to direct your attention to Figure 1 here. Previous studies in our lab had already established that it seems almond extract does have some sort of cytotoxic effect that's selective to these cancer lines. What we hoped to establish in Figure 1 here is, first of all, the dosage effect. And second of all, we wanted to narrow down what molecular weight we'd be looking at to establish what compound or compounds are responsible for this effect. As you see, moving up from 12% almond extract all the way to 100% almond extract, we do see a dosage response. And we were able to establish this using the Tukey-Kramer HSD test inside JMP, to be able to establish that these groups are statistically significantly different. Now you see that both the 100,000 molecular weight cutoff filter and the less than 5,000 molecular weight cutoff filter are statistically the same as the 100% almond extract. This led us to believe that whatever compound or compounds are responsible for this effect are going to be relatively small, at less than 5,000 Daltons. Given this, we moved on to Figure 2, and we looked at some research showing that flavonoids have been shown to have a similar effect in walnuts, so we tested four flavonoids: cyanidin, delphinidin, malvidin and petunidin. As you can see, both the extract and all these flavonoids were actually more effective than the 5-FU in this treatment. So given this information, I'm going to pass the next slide to Emily here to give you some more information about another set of data. Emily So to make sure that we weren't just looking at results particular to AGS, we also ran another cancer cell line called LoVo. LoVo is a colon cancer cell line. As you can see, we didn't do all of the extensive dosage treatments on this particular one, because we'd kind of shown that with AGS. We were just particularly looking at whether this is particular to a certain line. So for LoVo, you can see that we don't have as low a value for 5-FU.
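For readers who want to script the comparisons described in the methods above, a minimal Fit Y by X (Oneway) sketch is shown below. The table and column names are hypothetical, and the Tukey-Kramer and Wilcoxon option names are assumptions from memory; the reliable way to get the exact syntax is to run the analysis interactively and save its script.

    // Hypothetical viability table: one row per well, with Treatment and Percent Viability columns
    dt = Data Table( "almond viability" );
    // One-way ANOVA of viability by treatment, with multiple-comparison and
    // nonparametric options (option names are assumptions; confirm via Save Script)
    dt << Oneway(
        Y( :Percent Viability ),
        X( :Treatment ),
        Means( 1 ),                // show means / ANOVA
        All Pairs Tukey HSD( 1 ),  // assumed name for the Tukey-Kramer HSD comparison
        Wilcoxon Test( 1 )         // assumed name for the Wilcoxon nonparametric test
    );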
Nascif Neto, Principal Software Developer, SAS Institute (JMP Division) Lisa Grossman, Associate Test Engineer, SAS Institute (JMP division)   The JMP Hover Label extensions introduced in JMP 15 go beyond traditional details-on-demand functionality to enable exciting new possibilities. Until now, hover labels exposed a limited set of information derived from the current graph and the underlying visual element, with limited customization available through the use of label column properties. This presentation shows how the new extensions let users implement not only full hover label content customization but also new exploratory patterns and integration workflows. We will explore the high-level commands that support the effortless visual augmentation of hover labels by means of dynamic data visualization thumbnails, providing the starting point for exploratory workflows known as data drilling or drill down. We will then look into the underlying low-level infrastructure that allows power users to control and refine these new workflows using JMP Scripting Language extension points. We will see examples of "drill out" integrations with external systems as well as how to build an add-in that displays multiple images in a single hover label.   Auto-generated transcript...   Speaker Transcript Nascif Abousalh-Neto Hello and welcome. This is our JMP Discovery presentation, from details on demand to wandering workflows: getting to know JMP hover label extensions. Before we start on the gory details, we always like to talk about the purpose of a new feature introduced in JMP. So in this case, we're talking about hover label extensions. And why do we even have hover labels in the first place? Well, I always like to go back to the visual information seeking mantra from Ben Shneiderman, in which he tried to synthesize: overview first, zoom and filter, and then details on demand. Well, hover labels are all about details on demand. So let's say I'm looking at this bar chart on this new data set, and in JMP, up to JMP 14, as you hover over a particular bar in your bar chart, it's going to pop up a window with a little bit of textual data about what you're seeing here. Right. So you have labeled information, calculated values, just text, very simple. Gives you your details on demand. But what if you could decorate this with visualizations as well? So for example, if you're looking at that aggregated value, you might want to see the distribution of the values that went into that particular calculation. Or you might want to see a breakdown of the values behind that aggregated value. This is what we're going to let you do with this new feature. But on top of that, it's the famous, wait, there is more. This new visualization basically allows you to go on and start a visual exploratory workflow. If you click on it, you can open it up in its own window, which can also have its own visualization, which you can also click to get even more detail. And so you go down that technique called the drill down, and eventually you might get to a point where you're decorating a particular observation with information you're getting from maybe even Wikipedia, in that case. Not going to go into a lot of details. We're going to learn a lot about all that pretty soon. But first, I also wanted to talk a little bit about the design decisions behind the implementation of this feature.
Because we wanted to have something that was very easy to use, that didn't require programming or, you know, lots of time reading the manual, and we knew that would satisfy 80% of the use cases. But for those 20% of really advanced use cases, or for those customers that know their JSL and just want to push the envelope on what JMP can do, we also wanted to make available something that you could do through programming, built on top of the context of ??? on those visual elements. So we decided to go with an architectural pattern called plumbing and porcelain, and that's something we got from the Git source code control application. Basically, you have a layer that is very rich and, because it's very rich, very complex, which gives you access to all that information and allows you to customize things that are going to happen as far as generating the visualization or what happens when you click on that visualization. And on top of that, we built a layer that is more limited, it's purpose driven, but it's very, very easy to use and requires no coding at all. So that's the porcelain layer. And that's the one that Lisa is going to be talking about now. Up to you, Lisa. I'm going to stop sharing and Lisa is going to take over. Lisa Grossman Okay, so we are going to take a high-level look at some of the features and what kind of customization ??? So, let us first go through some of the basics. So by default, when you hover over a data point or an element in your graph, you see information displayed for the X and Y roles used in the graph, as well as any drop-down roles such as overlay, and if you choose to manually label a column in the data table, that will also appear as a hover label. So here we have an example of a labeled expression column that contains an image. And so we can see that image is then populated in the hover label in the back. And to add a graphlet to your hover label, you have the option of selecting some predefined graphlet presets, which you can access via the right mouse menu under Hover Label. Now these presets have dynamic graph role assignments and derive their roles from variables used in your graph. And presets are also preconfigured to be recursive, and that will support drilling down. And for preset graphlets that have categorical columns, you can specify which columns to filter by, by using the Next in Hierarchy column property that's in your data table. And so now I'm going to demo real quick how to make a graphlet preset. So I'm going to bring up our penguins data table that we're going to be using. And I'm going to open up Graph Builder. And I'm going to make a bar chart here. And then right clicking, under Hover Label you can see that there is a list of different presets to choose from, but we're going to use Histogram for this example. So now that we have set our preset, if you hover over a bar, you can see that there's a histogram preset that pops up in your hover label. And it's also... it is also filtered based on our bar here, which is the island Biscoe. And the great thing about graphlets is if I hover over this other bar, I can see another graphlet. And so now you can easily compare these two graphlets to see the distribution of bill lengths for both the islands Dream and Biscoe.
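To make the demo concrete, here is a rough JSL sketch of the two pieces involved: the baseline bar chart Lisa builds, and a standalone filtered histogram of the kind the Histogram preset generates for one island. The penguin column names are assumptions, and the preset itself is switched on through the right-click Hover Label menu rather than scripted here.

    // Baseline graph: mean bill length by island (column names are assumptions)
    dt = Data Table( "penguins" );
    dt << Graph Builder(
        Variables( X( :island ), Y( :bill length ) ),
        Elements( Bar( X, Y, Summary Statistic( "Mean" ) ) )
    );
    // For one hovered bar, the Histogram preset effectively shows a filtered
    // distribution like this one:
    dt << Graph Builder(
        Variables( X( :bill length ) ),
        Elements( Histogram( X ) ),
        Local Data Filter( Add Filter( Columns( :island ), Where( :island == "Biscoe" ) ) )
    );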
And then you can take it a step further and click on the thumbnail of the graphlet, and it will launch a Graph Builder instance in its own window, and it's totally interactive, so you can open up the control panel of Graph Builder and customize this graph further. And then as you can see, there's a local data filter already applied to this graph, and it is filtered by Biscoe, which is the thumbnail I launched. So that is how the graphlets are filtered. And then one last thing is that if I hover over these histogram bars, you can see that the histogram graphlet continues on, so that shows how these graphlet presets are preconfigured to be recursive. So closing these and returning back to our PowerPoint. So I only showed the example of the histogram preset, but there are a number that you can go and play with. So these graphlet presets help us answer the question of what is behind an aggregated visual element. So the scatter plot preset shows you the exact values, whereas the histogram, box plot or heat map presets will show you a distribution of your values. And if you wanted to break down your graph and look at it with another category, then you might be interested in using a bar, pie, tree map, or line preset. And if you'd like to examine the raw data of the table, then you can use the tabulate preset. But if you'd like to further customize your graphlet, you do have the option to do so with paste graphlet. And paste graphlet you can achieve with three easy steps. You would first build a graph that you want to use as a graphlet. And we do want to note here that it does not have to be one built from Graph Builder. Then, from the little red triangle menu, you can save the script of the graph to your clipboard. And then, returning to your base graph or top graph, you can right click, and under Hover Label there will be a Paste Graphlet option. And that's really all there is to it. And we want to also note that paste graphlet will have static role assignments and will not be recursive, since you are creating these graphlets to drill down one level at a time. But if you'd like to create a visualization with multiple drill downs, then you have the option to do so by nesting paste graphlet operations together, starting from the bottom layer and going up to your top or base layer. And this is what we would consider our Russian doll example, and I can demo how you can achieve that. So we'll pull up our penguins data table again. And we'll start with Graph Builder, and we're going to start building our very top layer for this. So let's go ahead and build that bar chart. And then let's go on to build our second layer. So let's do a pie with species. And then for our very last layer, let's do a scatter plot. OK, so now I have all three layers of what we will use to nest, and so I will go and save the script of the scatter plot to my clipboard. And then on the pie, I right click and Paste Graphlet. And so now when you hover, you can see that the scatter plot is in there and it is filtered by the species in this pie. So I'm going to close this just for clarity, and now we can go ahead and do the same thing to the pie, save the script, because it already has the scatter plot embedded. So save that to our clipboard, go over to our bar, do the same thing to Paste Graphlet. And now we have... we have a workflow that is...
that you can click and hover over, and you can see all three layers that pop up when you're hovering over this bar. So that's how you would do your nested paste graphlets. And so we do want to point out that there are some JMP analytical platforms that already have preintegrated graphlets available. These platforms include the Functional Data Explorer, Process Screening, Principal Components, Multivariate Control Charts, and Process Capability. And we want to go ahead and quickly show you an example using Principal Components. Lost my mouse. There we go. So I launch our table again and open up Principal Components. And let's run this analysis. And if I open up the outlier analysis and hover over one of these points, boom, I can see that these graphlets are already embedded into this platform. So we highly suggest that you go and take a look at these platforms and play around with them and see what you like. And so that was a brief overview of some quick customizations you can do with hover label graphlets, and I'm going to pass this presentation back to Nascif so he can move you through the plumbing that goes on behind all of these features. Nascif Abousalh-Neto Thank you, Lisa. Okay, let's go back to my screen here. And we... I think I'll just go very quickly over her slides, and we're back to plumbing, and, oh my god, what is that? This is the ugly stuff that's under the sink. But that's where you have all the tubing and you can make things really rock, and let me show that by giving a quick demo as well. So here Lisa was showing you the histogram... the hover label presets that you have available, but you can also click here and launch the Hover Label Editor, and this is the guy where you have access to your JSL extension points, which is how those visualizations are created. Basically, what happens is that when you hover over, JMP is going to evaluate the JSL block, capture that as a thumbnail, and put that thumbnail inside your hover label. That's pretty much, in a nutshell, how it goes. And the presets that you also have available here in the hover label, right, they basically are called generators. So if I click here on my preset and I go all the way down, you can see that it's generating the Graph Builder using the histogram element. That's how it does its trick. Click is a script that is going to react when you click on that thumbnail, but by default (and usually people stick with the default), if you don't have anything here, it's just going to launch this in its own window, instead of capturing and scaling down a little image. In here on the left you can see two other extension points we haven't really talked much about yet. But we will very soon. So I don't want to get ahead of myself. So let's talk about those extension points. We created not just one but three extension points in JMP 15. And they're going to allow you to add different functionality to different areas of your hover label. So textlets, right. Let's say, for example, you wanted to give a presentation after you do your analysis, but you want to use the result of that analysis and present it to an executive in your company, or maybe to an end customer that wants a little bit more detail in a way that they can read, and you would like to make that more distinct. So textlet allows you to do that.
But since you're interfacing with data, you also want that to be not a fixed block of text, but something that's dynamic, that's based on the data you're hovering over. So to define a textlet, you go back to that Hover Label Editor and you can define JSL variables or not. But if you want it to be dynamic, typically what you do is define a variable that's going to have the content that you want to display. And then you're going to decorate that value using HTML notation. So here is how you can select the font; you can select background colors, foreground colors, you can make it italic, and basically make it as pretty or as rich a text as you need to. Then the next hover label extension is the one we call gridlet. And if you remember the original, or the current, JMP hover label, it's basically a grid of name-value pairs. To the left, you have the names; that would be the equivalent of your column name. And to the right, you have the values, which might be just a column cell for a particular row if it's a marked plot. But if it's aggregated, like a bar chart, this is going to be a mean or a median, something like that. The default content here, like Lisa said before, is derived both from whatever labeled columns you have in your data table and also whatever role assignments you have in your graph. So if it's a bar chart, you have your x, you have your y. You might have an overlay variable, and everything that at some point contributes to the creation of that visual element. Well, with gridlets you can now have pretty much total control of that little display. You can remove entries. It's very common that sometimes people don't want to see the very first row, which has the labels or the number of rows. Some people find that redundant. They can take it out. You can add something that is completely under your control. Basically, it's going to evaluate a JSL script to figure out what you want to display there. One use case I found was when someone wanted an aggregated value for a column that was not in the visualization. Some people call those things hidden columns or hidden calculations. Now you can do that, right, and have an aggregation over the same rows as the rest of what is being displayed in that visualization. You can rename. We usually add the summary statistic to the left of anything that comes from a calculated y column. If you don't like that, now you can remove it or replace it with something else. And then you can do details like changing the numeric precision, or making text bold or italic or red, or... even, for example, you can make it red and bold if the value is above a particular threshold. So you can have something that, as I move over here, if the value is over the average of my data, I make it red and bold so I can call attention to that. And that will be automatic for you. And finally, graphlets. We believe that's going to be the most useful and used one. It certainly draws more attention, because you have a whole image inside your tooltip, and we've been seeing examples with data visualizations, but it's an image. So it can be a picture as well. It can be something you're downloading from the internet on the fly by making a web call. That's how I got the image of this little penguin. It's coming straight from Wikipedia. As you hover over, we download it, scale it, and put it here.
Or you can, for example (that's a very recent use case), take someone who had a database of pictures in the laboratory; they had pictures of the samples they were analyzing and they didn't want to put them in the data table because the data table would be too large. Well, now you can just take a column, turn that column into a file name, read from the file name, and boom, display that inside your tooltip. So when you're doing your analysis, you know exactly what you're looking at. And just like graph...gridlets, we're talking about clickable content. So again, for example, as I showed, when I click on this little thumbnail here, I can open a web page. So you can imagine that even as a way to integrate back with your company. Let's say you have web services that are supported in your company, and you want to, at some point, maybe click on an image to make a call to register or capture some data, going through a web call to that web service. Now that's something you can do. So, as I like to call it, we talk about drill in and drill down; that would be a drill out. That's basically JMP talking to the outside world using data content from your exploration. So let's look at those things in a little bit more detail. So those visualizations that we see here inside the hover label, they are basically...and this applies to any visualization...actually a combination of a graph definition and a data subset. So in Graph Builder, for example, you'll say, I want a bar chart with island on my x axis, and on my y axis I want to show the average of the body mass of the penguins on that island. Fine. How do you translate that to a graphlet, right? Well, basically, when you select the preset (or when you write your own code, if you want to do it that way), the preset is going to use a graph template. So basically, some of the things are going to be predefined, like the bar element, although if you're writing your own, you could even say, I want to change my visualization depending on my context. That's totally possible. And you're going to fill that template with graph roles and values and data table metadata. So, for example, let's say I have a preset for doing that categorical drill down. I know it's going to be a bar chart; I don't know what the bar chart is going to be of, what's going to be on my y or my x axis. That's going to come from the current state of my baseline graph; for example, I'm looking at island. So I know I want to do a bar chart of another category. So that's when the Next in Hierarchy property, the next column, comes into play. I'm making that decision on the fly, based on the information that the user is giving me and the graph that's being used. For example, if you look here at the histogram, it was a bar chart of island by body mass. This is a histogram of body mass as well. If I come here to the graph and change this column and then I go back and hover, this guy is going to reflect my new choice. That's this idea of getting my context and having a dynamic graph. The other part of the definition of a visualization is the data subset. And we have a very similar pattern, right. We have...LDF is the local data filter. So that's a feature that we already had in JMP, of course, right. And basically, I have a template that is filled out from my graph roles here. So if it was a bar chart, that means my x variable is going to be a grouping variable of island.
I know I want to have a local data filter on island, and that I want to select this particular value so that it matches the value I was hovering over. This happens both when you're creating the hover label and when you're launching the hover label, but when you create a hover label, this is invisible. We basically create a hidden window to capture that window, so you'll never see that guy. But when you launch it, the local data filter is there and, as Lisa has shown, you can interact with it and even make changes to it, so that you can progress your visual exploration on your own terms. So I've been talking about context a lot. This is actually something that, if you want to develop your own graphlets, you need to be familiar with. We call that the hover label execution context. You're going to have information about that in our documentation, and it's basically, if you remember JSL, a Local block, with lots of local variables that we define for you, and those variables capture all kinds of information that might be useful for someone defining a graphlet or a gridlet or a textlet. It's available for all of those extension points. So typically, they're going to be variables that start with a special prefix, to prevent collisions with your data table column names, so it's kind of like reserved names in a way. But basically, you'll see here that this is code that comes from one of our presets. By the way, that code is available to you through the Hover Label Editor, so you can study it and see how it goes. Here we're trying to find a new column to use in our new graph; it's that idea of being dynamic and reactive to the context. And this function is going to look into the data table for that metadata: a list of measurement columns. So if the baseline is looking at body mass, body mass is going to be here in this value, and also a list of my groupings. So if it was a bar chart of island by body mass, we're going to have island here. So those are lists of column names. And then we also have the numeric values; anything that's calculated is going to be available to you. Maybe, like I said, you want to make a logical decision based on the value being above or below a threshold, so that you can color a particular line red or make it bold, right. You're going to use values that we provide to you. We also provide things that allow you to go back to the data table and fetch data by yourself, like the row index of the first row in the list of rows that your visual element is covering; that's available to you as well. And then there's even more data, like for example the where clause that corresponds to the local data filter that you're executing in the context of. And the drill depth, let's say, that allows you to keep track of how many times you have clicked on a thumbnail and opened a new visualization, and so on. So for example, when we're talking about recursive visualizations, every recursion needs an exit condition, right. So here, for example, is how you calculate the exit condition of one of your presets. If I don't have anything more to show, I return Empty, which means no visualization. Or if I only have one value to show, right, and my drill depth is greater than one, meaning I was drilling down until I got to a point where there's only one value to show, in some visualizations that doesn't make sense. So I can return Empty as well.
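Nascif's exit-condition example can be paraphrased roughly as the pattern below. This is only a sketch: the context variable name (_drillDepth) and the column names are illustrative stand-ins, not the documented names; open one of the shipped presets in the Hover Label Editor to see the real variables and code.

    // Rough paraphrase of a graphlet "Graph" script with a recursion exit condition.
    // _drillDepth is an illustrative stand-in for the real hover-label context variable.
    If( _drillDepth > 1,
        Empty(),        // nothing sensible left to break down: return no graphlet
        Graph Builder(  // otherwise show one more level of detail for the hovered rows
            Variables( X( :species ), Y( :bill length ) ),
            Elements( Bar( X, Y, Summary Statistic( "Mean" ) ) )
        )
    )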
That's just an example of the kinds of decisions that you can make in your code using the hover label execution context. Now, I just wanted to give you a visual representation of how all those things come together, again using the preset example. When you're selecting a preset, you're basically selecting the graph template, which is going to have roles that are going to be fulfilled from the graph roles that are in your hover label execution context. And so that's your graph definition. And that graph definition is going to be combined with the subset of observations resulting from the local data filter that was also created for you behind the scenes, based on the visual element you're hovering over. So when you put those things together, you have a hover label with a graphlet inside. And if you click on that graphlet, it launches that same definition here, and it makes the local data filter visible as well. Like Lisa was saying, this is a fully featured live visualization, not just an image; you can make changes to this guy to continue your exploration. So now you should think in terms of, okay, now I have a feature that creates visualizations for me and allows me to create one visualization from another. I'm basically creating a visual workflow. And it's kind of like I have a Google Assistant or an Alexa in JMP, in the sense that JMP is making me go faster by creating visualizations on my behalf. And they might be right, or they might not be; it's just an exploration, right. If you're happy with them, you just keep going. If you're not happy with them, you have two choices, and maybe it's easier if I just show it to you. So like I was saying, I come here, I select a preset. Let's say I'm going to get a categorical one, a bar chart. So that gives me a breakdown on the next level. Right. And if I'm happy with that, that's great. Maybe I can launch this guy. Maybe I can launch another one for this feature. Ah, the pie charts, they're more colorful. I think they look better in that particular case. But see, now I can even do things like comparing those two bar charts side by side. But let's say I keep doing that on a busy chart and I keep creating visualizations; I might end up with lots of windows, right. So that's why we created some modifiers to...(you're not supposed to do that, my friend.) You can just click. That's the default action; it will just open another window. If you alt-click, it launches in the previous, last window. And if you control-click, it launches in place. What do I mean by that? So, I opened this window and I launched this graphlet and then I launched this graphlet. So let's say this is Dream and Biscoe and Dream and Biscoe. Now I want to look at Torgersen as well. Right. And I want to open it. But if I just click, it opens in its own window. If I alt-click, (Oh, because that's the last one. I hope. I'm sorry. So let me close this one.) Now if I go back here and I alt-click on this guy. See, it replaced the content of the last window I had open. So this way I can still compare visualizations, which I think is a very important scenario. It's a very important usage of this kind of visual workflow. Right. But I can kind of keep things under control. And I don't just have to keep opening window after window. And the ultimate, the real top window management feature, is if I do a control-click, because it replaces the current window.
And then it's really a real drill down. I'm just going down and down in the same window, and now it's like, okay, but what if I want to come back? If you want to come back, just undo. So you can explore with no fear; you're not going to lose anything. Even better though, even the windows you launch have the baseline graph built in at the bottom of the undo stack. So I can come here and do an undo, and I go back to the visualizations that were here before. So I can drill down, come back, branch; you can do all kinds of stuff. And let's remember, that was just with one preset. Let's do something kind of crazy here. We've been looking at very simple visualizations, but this whole idea actually works for pretty much any platform in JMP. So let's say I want to do a Fit Y by X. And I want to figure out how...now I'm starting to do real analytics...how those guys fit within the selection of the species. Right. So I have this nice graph here. So I'm going to do that paste graphlet trick and save it to the clipboard. And I'm going to paste the graphlet now. So as you can see, we can use that same idea of creating a context and apply that to my analysis as well. And again, I can click on those guys here and it's going to launch the platform. As long as the platform supports local data filters, (I should have given this ???), this approach works as well. So it's for visualizations, but since in JMP we have this spectrum where the analytics also have a visual component, it works with our analytics as well. And I also wanted to show here, on that drill down, this is my ??? script. So I have the drill down with presets all the way, and I just wanted to go to the bottom one, the one that I decorated with this little cute penguin. But what I wanted to show you is actually back in the Hover Label Editor. Basically, what I'm doing here is reading a small JSL library that I created. I'm going to talk about that soon, right, and now I can use this logic to go and fetch visualizations. In this case I'm fetching it from Wikipedia using a web call. And that visualization comes in and is displayed on my visualization. It's a modal dialog. But also my click script is a little bit different. It's not just launching the guy; it's making a call to this web functionality after getting a URL, using that same library as well. So what exactly is it going to do? When I click on the guy, it opens a web page with a URL derived from data from my visualization, and this can be pretty much anything JSL can do. I just want to give you an example of how this also enables integration with other systems, even outside of JMP. Maybe I want to start a new process. I don't know. All kinds of possibilities. Ah, I apologize. So there are two customized...advanced customization examples, I should say, that illustrate how you can use graphlets as an extensible framework. They're both on the JMP Community, you can click here if you get the slides, but one is called the label viewer. I am sorry. And basically what it does is that when you hover over a particular aggregated graph, it finds all the images in the data table associated with those rows and creates one image. And that's something customers have asked about for a while. I don't want to see just one guy; I want to see if you have more of them, all of them, if possible, right. So when you actually use this extension, and you click on...actually no, I don't have it installed so...
And the wiki reader, which was the other one, is the one I just showed to you. But what I was saying is that when you click and launch this particular...on this particular image, it launches a small application that allows you to page through the different images in your data table, and you have a filter that you can control and all that. This is one that was completely done in JSL on top of this framework. So just to close up, what did we learn today? I hope that you found that it's now very easy to add visualizations; you can visualize your visualizations, if you will. It's very easy to add those data visualization extensions using the porcelain features. You actually have not just richer detail in your thumbnails, but you have a new exploratory visual workflow, which you can customize to meet your needs by using either paste graphlet, if you want something easy to do, or even JSL, using the Hover Label Editor. We're both very curious to see how you guys are going to use that in the field. So if you come up with some interesting examples, please call us back. Send us a screenshot in the JMP Community and let us know. That's all we have today. Thank you very much. And when we give this presentation, we're going to be here for Q&A. So, thank you.
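As a closing illustration of the "drill out" idea from the Wikipedia example above, here is a minimal click-script sketch that builds a URL from a hovered value and opens it in the browser; the species variable is a hypothetical stand-in for the value you would pull from the hover label execution context.

    // Minimal "drill out" sketch: open an external page built from hover-label data.
    species = "Gentoo penguin";  // hypothetical stand-in for the hovered category value
    url = "https://en.wikipedia.org/wiki/" || Substitute( species, " ", "_" );
    Web( url );  // Web() opens the URL in the default browser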