Choose Language Hide Translation Bar
gail_massari
Community Manager Community Manager
JMP cleanup from Excel data that has some rows that need to be split into separate columns

Recently, I needed to analyze data from a website that stored data in columns and rows. Every column cell in the main row related to the upper left cell (Title). However, each Title had two cells below it (the first cell in the subsequent two rows). One contained a description of the Title; the other contained content tags related to the Title.

Unfortunately, the web page was not set up to allow me to import as .html or .txt using JMP File>Internet Open. So, I cut and pasted into Excel. A JMP systems engineer colleague, Olivia Lippincott @olippincott, said that for Excel users she has met, the Excel format (Figure 1) is common and she often helps them with cleanup.

Multiple rows needed to be split into separate columns. This is how we did it.

First, we opened the Excel file into JMP.

Videos Imported from Excel.JPGFigure 1: Table of Excel data imported into JMPSecond, we created a new column called Title Type, with Column Properties Data Type = Numeric and Modeling Type = Nominal. We populated the cells with a repeated sequence of 1,2,3 by typing 1, 2, and 3 in the first three cells in Title Type column, highlighting them, and then selecting FILL>REPEAT SEQUENCE TO END OF TABLE. This gave us an ID column to use to identify the Title field pattern.

Title Type Added.JPGFigure 2: Add Title Type to identify the pattern 1-3 for the three Title TypesThird, we split the table (TABLE>SPLIT) to break Title Type 2 and Title Type 3 into two different columns. I saved the script to the data table, too, for reference.

Capture.JPGFigure 3: Splitting Title Column into 3 columns by Title TypeFinal Videos Table Split Dialog.JPGFigure 4: JSL script to split Title Column into 3 columns by Title Type

Fourth, we moved some of the columns around, renamed title Title Type 2 to Description and Title Type 3 to Tags. Finally, we got rid of the leading Tags: text on the Tags column by recoding Tags: to a blank space using the Replace String option under the Recode red triangle.Replace String.JPGFigure 5: Accessing Column Recode using Replace String

 

Regplace Tags Recode.JPGFigure 6: Using Recode to replace consistent Tags: text found in each cell in a column

 

Final Table Ready for Analysis.JPGFigure 7. Final Videos Table ready for JMP analysis

The whole cleanup process using JMP took about 1 minute. Perhaps these steps will help you during your own data cleanup and preparation.