turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Getting an average of a subset of data

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 21, 2017 12:13 PM
(577 views)

Hi,

I have intensities of 4 treated and 5 untreated samples taken between at a range of wavelengths (400 - 600 nm) at three different time points (0, 1 , 3 hours).

In the long run, I'm trying to normalize my data so the average intestity at 400nm of the untreated samples is 1 at each time point. In order to do that, I need to figure out how to get the average untreated, 400nm intensity at 0, 1, 3 hours.

This is my current code:

Col Mean( :Intensity, :Time == 1 & :Wavelength( nm ) == 400 & :Treat == "No" )

From graphs I've made (and an Excel pivot sheet), I know the value should be about 0.72 for 1 hour, but I'm getting 1.3 for most rows. I'm also getting 0.68 for the rows that satisfy my 3 "and" statements, but I know that Excel pivot sheets sometimes do funny normalizing things, so I'm not sure if I'm concerned about the differences there.

Any suggestions for making my code consistant would be greatly appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 21, 2017 8:55 PM
(1055 views)

Solution

Hi,

If I understand your question correctly, you are asking to compute mean statistic for a particular column by one or more grouping columns.

In the attached example data table, I saved two scripts to illustrate how you can do this using two different approaches. The one entitled, "Mean by Trt and Time Period and Wavelength" creates a pivot table with a local data filter so you can look at the mean Intensities for various time periods, treatments and wavelengths. The local data filter can be conditionally applied (by using "Conditional" under the red triangle option) so that when selecting a category it will limit the categories of the next column filter only to those that are in the selected category.

The second script entitled, "Tables Summary" is an example of using the Tables-->Summary with the treatment, time period and wavelength as grouping columns.

And for the others that want to see how to perform the pivot table with a script.

```
Tabulate(
Show Control Panel( 0 ),
Set Format( Mean( :Intensity( 12, 1 ) ) ),
Add Table(
Row Table(
Grouping Columns( :Time Period, :Treatment, :Wavelength ),
Analysis Columns( :Intensity ),
Statistics( Mean )
)
),
Local Data Filter(
Add Filter(
columns( :Wavelength, :Time Period, :Treatment ),
Where( :Time Period == "0" ),
Display( :Treatment, Size( 309, 232 ), List Display )
)
)
)
```

And finally, the Tables-->Summary with the corresponding JSL:

```
Data Table( "Getting an average of a subset of data.jmp" ) <<
Summary(
Group( :Time Period, :Treatment, :Wavelength ),
Mean( :Intensity ),
Freq( "None" ),
Weight( "None" )
)
```

Is this more toward what you are expecting?

best,

Stan

7 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 21, 2017 12:43 PM
(570 views)

I am not sure you are interpreting what your code is actually doing. The second element within the Col Mean() function is a "By" statement. That is, the way you have it coded, it will calculate one mean for Intensity for all of the rews that meet the following

:Time == 1 & :Wavelength( nm ) == 400 & :Treat == "No"

and for all rows that do not meet that criteria, it will calculate the mean of Intensity for all of those rows. So when the formula is applied, it can only generate one of two different values. Is this what you are expecting?

Jim

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 21, 2017 1:29 PM
(566 views)

I'm just trying to get the average value of the intensity using all the

values where Time is 1 h, Wavelength is 400, and there has been no

treatment( I@1,400,N). I'll do the same for 0 and 3 hours (I@ 0, 400,N and

I@3, 400,N)

I'm eventually going to make a formula uses if statements to check the time

for the row, then divides the Intensity for that row by the time

appropriate "I@" value.

Roughly:

If t = 0, I/I@0,400,n

Else if t= 1, I/I@1,400,n

Else if t = 3, I/I@3,400,n

Else Error

I hope that explains better what I'm trying to do.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 21, 2017 2:26 PM
(558 views)

I am sure that I do not follow in total detail, the different conditions that you need to put in place, to make this work, however, I have put together an example that I believe will illustrate the method that will work to generate the answer you need.

```
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA\Big Class.jmp" );
dt << New Column( "Example",
formula(
If( :sex == "F" & :age == 12 | :sex == "M" & :age == 13,
Col Mean( If( :sex == "F" & :age == 12 | :sex == "M" & :age == 13, :height, . ) ),
.
)
)
);
```

Take a look at the formula. It used 2 different decisions to get it's response. First, only those rows where the specific combinations of sex and age are met, should the means be calculated. And if those specifics are not met, the value should be a missing value. The second decision is performed when the first decision is correct, and then, we want to calculate the column mean, but only include the values where the specific values of sex and age combinations are true.

Jim

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 21, 2017 8:55 PM
(1056 views)

Hi,

If I understand your question correctly, you are asking to compute mean statistic for a particular column by one or more grouping columns.

In the attached example data table, I saved two scripts to illustrate how you can do this using two different approaches. The one entitled, "Mean by Trt and Time Period and Wavelength" creates a pivot table with a local data filter so you can look at the mean Intensities for various time periods, treatments and wavelengths. The local data filter can be conditionally applied (by using "Conditional" under the red triangle option) so that when selecting a category it will limit the categories of the next column filter only to those that are in the selected category.

The second script entitled, "Tables Summary" is an example of using the Tables-->Summary with the treatment, time period and wavelength as grouping columns.

And for the others that want to see how to perform the pivot table with a script.

```
Tabulate(
Show Control Panel( 0 ),
Set Format( Mean( :Intensity( 12, 1 ) ) ),
Add Table(
Row Table(
Grouping Columns( :Time Period, :Treatment, :Wavelength ),
Analysis Columns( :Intensity ),
Statistics( Mean )
)
),
Local Data Filter(
Add Filter(
columns( :Wavelength, :Time Period, :Treatment ),
Where( :Time Period == "0" ),
Display( :Treatment, Size( 309, 232 ), List Display )
)
)
)
```

And finally, the Tables-->Summary with the corresponding JSL:

```
Data Table( "Getting an average of a subset of data.jmp" ) <<
Summary(
Group( :Time Period, :Treatment, :Wavelength ),
Mean( :Intensity ),
Freq( "None" ),
Weight( "None" )
)
```

Is this more toward what you are expecting?

best,

Stan

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 22, 2017 1:17 PM
(516 views)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 22, 2017 1:47 PM
(512 views)

That sort of worked, but it's not easy to apply on particular value (sort of like using the $$ in Excel) to all rows which I deem to be of the same type.

I think it's just a difference in the software, and I will have to get used to the differences.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Sep 22, 2017 1:50 PM
(509 views)

I ended up using your second option, but I still have to manually enter that normalization factor for my calculation... My final code looks like this:

If( :Time == 0, :Intensity / 1.11175075, If( :Time == 1, :Intensity / 0.772546829268293, If( :Time == 3, :Intensity / 1.2335052, "Error" ) )