Subscribe Bookmark
martindemel

Staff

Joined:

Jul 10, 2014

JMP and Excel - A never ending story? - Part 1

Note: Though JMP can read spreadsheet files both on Mac and Windows, with different tools than Excel, I refer in my posts to a windows machine and Excel. Some features will be related to newer versions (e.g. Excel Import Wizard was introduced in JMP 11, Importing Excel Colors in JMP 13).

In my previous blog post JMP and Excel - A Never ending story? - Part 0  I promised to let you learn about common pitfalls you may face when working with Excel data in JMP. Before I actually do this you need to understand some basic differences between JMP and Excel as well how to get data into JMP. Let's start with the structure of a data set:

  • In Excel you can enter all kinds of data into one cell, no matter what structure the rest of the data has. In JMP a cell is related to its column, and therefore to one specific variable.
  • In addition one spreadsheet in Excel can have multiple worksheets, in JMP you would have several or one combined table.

You can imagine that Excel files in a trivial structure are best to import. Trivial means first row contains the column headers and as of the second row the data is listed. This kind of data you could import easily:

  • Copy the data in Excel and paste it into a JMP data table. If it does not need to be at a certain position you might prefer to use the menu option “EditàPaste” or “Edità Paste with Column Names”
  • Or you use the JMP ribbon in Excel:

JMPExcel_ribbon.PNG

  • Or you drag the file from the explorer into any JMP window and press “Import”
  • Or you use the “Fileà Open” option or within the JMP Starter in the “View” menu.

All will be quite straightforward options if the data structure is trivial (unless there are special data formats used, which might cause troubles, but this I’ll address in a later post).

However not always you have variable names in the first row and the data beyond. Some structures in Excel files are good to gain an overview of the data in a table. At the same time those structure is often very cumbersome for the purpose of statistical analysis. So, how on earth do I get these kind of non-trivial spreadsheet data into JMP? Time for a first example:

JMPExcel_example1_0.png

Here you have data quite typical for Excel spreadsheets: hierarchical headers on top and at the side, some merged cells and the data in the spanned matrix. As you know JMP the data table could not be used for statistical analysis if you’d copy it just as it is. Therefore let’s open it using JMP’s Excel Import Wizard:

JMPExcel_example1_1.png

There are different options, and the most important one for this type of data you’ll find on both this page and the next of the Import Wizard. First you have to define the “Number of rows with column headers” with 3, as you have the year, the quarters and the months each in a separate row. This will combine the header information in one single header per column:
JMPExcel_example1_2.png

Now going to the next page we want to handle the information at the left, the states, counties, and cities. Checking “Treat multiple column header lines as hierarchies” will stack the header information into 3 new columns, providing a better structure for analysis. Now you are ready to import the data and analyze it.

JMPExcel_example1_3.png

If you want to see this in action I can recommend this short video (using JMP 12). One note in case you just started using the Import Wizard. It will save the last setting, which may lead to surprising views on the data, when importin new, but different data. Using the button “Restore Default Settings” you can reset your settings in the wizard.

So far you have read about several ways to import trivial structured data, as well how to handle hierarchical and/or merged data sets using the Excel Import Wizard in JMP. Stay tuned for the next post talking about ways to import multiple worksheets at once. I also will show a way to automatically read multiple Excel sheets with a JSL Script. And then we will talk about potential challenges based on the format of the data and others.

I’m looking forward to your comments and shared experiences.