May 16, 2020 1:37 PM

Hi JMP Community,

What is the proper syntax in JSL for calculating the column mean (or other statistics) for a variable-defined subset of rows? In a data table formula, I can just use "Col Mean (:Data Column, :Grouping Column)", but in a JSL script this does not seem to be correct usage.

Of note, I'm thinking of using brute force with a "For Each Row" structure, but that seems quite inefficient for evaluating 200+ columns with 1000 rows.

Thank you for your help.

TS

Thierry R. Sornasse

Accepted Solutions

I believe the simplest and most efficient way to do this is to use

Tables==>Summary

This creates a summary data table very quickly, from which you can read the statistics in your JSL script. If you need the statistics back in the original data table, a simple Update with matching columns will do that. Handling 200 columns and 1000 rows is not a problem at all. Below is a simple example that illustrates how to do this.

```
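Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/semiconductor capability.jmp" );
// Collect the names of all continuous columns
colNamesList = dt << Get Column Names( continuous );
// Build a hidden summary table with the mean of each column per SITE
dtStats = dt << Summary(
	invisible,
	Group( :SITE ),
	Mean( colNamesList ),
	Freq( "None" ),
	Weight( "None" ),
	Link to original data table( 0 )
);
// Merge the group means back into the original table, matching on SITE
dt << Update(
	With( dtStats ),
	Match Columns( :SITE = :SITE )
);
Close( dtStats, NoSave );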
```
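
If the group means are only needed as values inside the script, rather than as new columns in the table, one possible alternative (not part of the original reply) is the Summarize() function, which returns the results directly into JSL variables. The sketch below assumes the same sample table and uses :NPN1 as an example data column; the variable names siteLevels, siteMeans, and maxMean are made up for illustration.

```
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/semiconductor capability.jmp" );
// Summarize() operates on the current data table, which Open() has just set
Summarize(
	siteLevels = By( :SITE ),   // list of the group levels
	siteMeans = Mean( :NPN1 )   // column vector with one mean per level
);
Show( siteLevels, siteMeans );
// Largest of the per-group means
maxMean = Max( siteMeans );
Show( maxMean );
```

As far as I know, By() also accepts more than one grouping column, so the same pattern should extend to a group/subgroup combination, but check the Scripting Index for your JMP version.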

Jim

Re: Calculating Col Mean () for a subset of rows in JSL Script?

Hi JMP Community,

Here is what I have come up with using the "For Each Row" structure:

```
Names Default To Here( 1 );
dt = Current Data Table();
ALL_MEAN = Col Mean( :DATA );
MAX_MEAN = -1000;
For Each Row(
	If( MAX_MEAN < Col Mean( :DATA, :GROUP, :SGROUP ),
		MAX_MEAN = Col Mean( :DATA, :GROUP, :SGROUP )
	)
);
Show( ALL_MEAN );
Show( MAX_MEAN );
```

Any idea on making this more efficient?

Thanks,

TS

Thierry R. Sornasse


