Expression Analysis with JMP Genomics, Part 1: Experimental Design
Jun 25, 2019 11:04 AM
| Last Modified: Jul 11, 2019 3:21 PM
To perform expression analysis in JMP Genomics, an Experimental Design File (EDF) is required. The EDF contains sample annotation information and/or phenotype information for a related expression data set. There are a few different ways to create this file, but the best method for creating an EDF is dependent upon how your data is organized. For this guide, we create an EDF for a series of data files, which we then use to import each of the files into JMP. The data deals with gene expression levels in response to estrogen treatment during a time-course experiment.
A completed EDF typically consists of rows representing each sample or experimental group, and columns containing the characteristics and experimental conditions of each row. For this experiment, there are 18 experimental groups (three time intervals, with three trials per interval, with both an experimental and control group for each). Therefore, the completed EDF has 18 rows.
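The row count can be checked by enumerating the design. The following is a minimal Python sketch, not part of JMP Genomics; the label spellings (for example "estrogen_treated") are assumptions chosen for illustration.

```python
from itertools import product

# Levels described in the article; the exact label spellings are assumptions.
times = ["12hr", "24hr", "48hr"]
characteristics = ["estrogen_treated", "untreated_control"]
groups = [1, 2, 3]  # three trials per combination of conditions

# One row per (time, characteristic, trial) combination, numbered from 1.
design = [
    {"Array": i, "Time": t, "Characteristics": c, "Group": g}
    for i, (t, c, g) in enumerate(product(times, characteristics, groups), start=1)
]

print(len(design))  # 3 * 2 * 3 = 18
```

Because Time is the outermost factor, each time point occupies six consecutive rows.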
Required columns are as follows:
ColumnName: For this example, each ColumnName entry represents a unique experimental trial, such as “untreated_control_12hr_1”, which means the row represents a control group, measured at 12 hours, and is the first of the three trials for these conditions.
Note that each entry in this column must be a unique identifier. Each entry must match the column name in the expression data set. It is used to connect/link the two data sets together.
Array (or Chip): A unique number for each array/group. For this example, 1-18.
Experimental Conditions: Column(s) that indicate the experimental conditions for each row/array. In this example, there are two such columns. One would have the Time designation for each array: 12hr, 24hr, or 48hr. The other (Characteristics) would label each array as an estrogen treated or control group.
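The two requirements on ColumnName (entries are unique, and each one matches a column name in the expression data set) can be checked before import. The sketch below is a hypothetical helper written in Python, not a JMP Genomics function; `check_column_names` and the sample names are illustrative only.

```python
from collections import Counter

def check_column_names(edf_names, expression_columns):
    """Return (duplicates, missing) for the ColumnName entries of an EDF.

    duplicates: entries that appear more than once in the EDF.
    missing: entries with no matching column in the expression data set.
    """
    counts = Counter(edf_names)
    duplicates = sorted(name for name, n in counts.items() if n > 1)
    available = set(expression_columns)
    missing = sorted(name for name in counts if name not in available)
    return duplicates, missing

# Illustrative input: one repeated entry that is also absent from the data set.
edf_names = [
    "untreated_control_12hr_1",
    "untreated_control_12hr_2",
    "untreated_control_12hr_2",
]
expression_columns = ["ProbeID", "untreated_control_12hr_1"]
duplicates, missing = check_column_names(edf_names, expression_columns)
```

An EDF is ready to link to the expression data only when both returned lists are empty.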
Optional columns include:
File: Needed if the EDF will be used to import a list of files from a folder. If used, it contains the file name and extension of the file to be imported.
Intensity: Required to import a list of files from a folder. This column typically contains the column header or label for the column that contains the proper expression values.
Any other pertinent information to be included in the EDF.
An example of a completed Experimental Design File (EDF).
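As a rough illustration of this layout, the Python sketch below writes two rows with the required and optional columns to tab-delimited text. The file names (`sample_01.txt`, `sample_02.txt`) are hypothetical placeholders, and the tab delimiter is an assumption; the actual file names depend on your raw data.

```python
import csv
import io

# Column headers follow the required and optional columns listed above.
fieldnames = ["ColumnName", "Array", "Time", "Characteristics", "File", "Intensity"]

# Two example rows; the File values are hypothetical placeholders.
rows = [
    {"ColumnName": "untreated_control_12hr_1", "Array": 1, "Time": "12hr",
     "Characteristics": "untreated_control", "File": "sample_01.txt", "Intensity": "Var2"},
    {"ColumnName": "untreated_control_12hr_2", "Array": 2, "Time": "12hr",
     "Characteristics": "untreated_control", "File": "sample_02.txt", "Intensity": "Var2"},
]

# Write to an in-memory buffer; a real file opened with newline="" works the same way.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
edf_text = buffer.getvalue()
```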
Many ways to create an EDF:
An EDF can be imported from a delimited text file or from Excel. JMP also provides several tools for creating an EDF directly. These tools can be accessed from the Genomics Starter by selecting Import > Experimental Design File:
For this example, choose Create Design File Template. This tool makes an EDF template by reading raw data files from a selected folder.
To import the raw data, first unzip the file E2_Expression_Data.zip to a single empty folder. Then choose that folder in the Folder of Raw Data Files box.
Select .txt as the file extension in the File Filter Expression drop-down menu.
Specify the Number of Channels in Each File. The output file contains a unique row for each channel within each file. Here, there is only one channel in each file.
In the New Variable Names for Experimental Design box, type the names of each of the experimental group variables on a separate line. In this example, the variables are Time (12, 24, or 48 hours), Characteristics (estrogen treated or control), and Group (repetitions 1-3 for each combination of conditions).
Specify e2_expression_edf as the File Name, and designate the Output Folder where the EDF template will be stored.
Once the template (below) is created, it must be filled in with experimental condition information.
To fill in the EDF template, we use the Create ColumnName tool. This tool concatenates multiple columns that specify experimental conditions to create a unique set of entries for the required ColumnName field in an EDF.
To begin, we first must fill in the experimental condition columns (Time, Characteristics, Group). For the Time variable, enter 12hr, 24hr, and 48hr into six consecutive rows each. This can be done by entering the value into the first cell, highlighting the remaining cells to fill, and selecting Fill from the right-click menu as shown below.
Next, fill in the Characteristics and Group columns to match the experimental design as shown below.
Once the experimental conditions are filled in, make sure the window containing the table with the columns you wish to combine is open and in focus in JMP. Then select Create ColumnName from the Genomics Starter (Import > Experimental Design File).
When Create ColumnName is selected, a dialog box titled Select Columns opens. Here, select the columns to combine that will be used to populate the ColumnName heading in the EDF.
Select Characteristics, Time, and Group in that order and click OK to concatenate the columns.
The result is a completed ColumnName column as shown below.
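The concatenation step can be pictured with a short Python sketch. This is not the Create ColumnName tool itself; the underscore separator is an assumption based on the example entry "untreated_control_12hr_1", and `make_column_name` is a hypothetical helper.

```python
def make_column_name(row, columns=("Characteristics", "Time", "Group"), sep="_"):
    """Join the selected condition columns, in order, into one identifier."""
    return sep.join(str(row[c]) for c in columns)

# A few illustrative rows of a filled-in template.
rows = [
    {"Time": "12hr", "Characteristics": "untreated_control", "Group": 1},
    {"Time": "12hr", "Characteristics": "untreated_control", "Group": 2},
    {"Time": "24hr", "Characteristics": "estrogen_treated", "Group": 1},
]

for row in rows:
    row["ColumnName"] = make_column_name(row)

names = [row["ColumnName"] for row in rows]
```

The selection order matters: choosing Characteristics, Time, and Group in that order yields names of the form characteristic_time_group, and including Group is what keeps replicate trials distinct.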
The last step is to fill in the Intensity column. This column tells JMP which variable in each text file contains the expression data to import. For this example, the text files do not have column names (the data begins on line 1), and the expression data is in column 2, so enter Var2 in the Intensity column and use the Fill tool to fill all 18 rows. The completed EDF looks like this:
Before performing expression analysis in JMP Genomics, an Experimental Design File must be created. The EDF contains experimental conditions, sample annotation, and/or phenotype data for a related expression data set. There are several ways to create this file. The best method of those outlined above is entirely dependent on the organization of your data. The tools above are designed to make creating your EDF easy and effective for any kind of expression analysis you plan to do. Using this completed EDF, we now can import each of these .txt files into one large data set that can be used to perform an expression analysis in JMP.
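To make the meaning of the Intensity value Var2 concrete, the Python sketch below reads the second positional column from a headerless text file. It only illustrates the idea and is not how JMP performs the import. The tab delimiter, the identifier in the first column, and the sample values are all assumptions.

```python
import csv
import io

def read_intensity(handle, intensity="Var2", delimiter="\t"):
    """Return one positional column (Var1, Var2, ...) of a headerless file as floats."""
    index = int(intensity[len("Var"):]) - 1  # "Var2" -> zero-based index 1
    return [
        float(fields[index])
        for fields in csv.reader(handle, delimiter=delimiter)
        if fields  # skip blank lines
    ]

# In-memory stand-in for one raw data file; identifiers and values are made up.
sample_file = io.StringIO("probe_a\t7.25\nprobe_b\t8.5\n")
values = read_intensity(sample_file)
```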