turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Discussions
- :
- Formula to find Mode of Daily values

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 1:19 AM
(1456 views)

Hi,

I have a datatable which has multiple samples tracking different parameters for a collection of different systems within those samples (1 sample will have 1 system, but another sample may also have the same system). Each sample has a different state as it approaches it's end. These states are what I'm trying to predict. I have created a decision tree algorithm using some different samples in a seperate datatable which I then have added to this datatable. What this decision tree is trying to do is predict the state of the system at different points in time. I will then measure these predicted states against the known states of the system at these points in time (the algorithm was created with different samples so this is "fresh" data)

Now as you can imagine during the course of a day this formula can be sending hundreds or thousands of different predictions and not all of them are correct so I want a column formula that can look at all of these predictions, by sample number and by day, and output what the most commonly predicted state of the equipment (Mode)

In other words I'd like a column that calculates the mode Alert rating for each day of each sample of the data and output it's predicted Alert rating to each of these rows

In the example datatable I've attached I would like a formula that can automate the column titled "Alert Prediction for the day" and I don't mind it using the LogTime column or the time since start up column to create it's daily predictions.

Thanks for your help

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 11:04 AM
(2789 views)

Solution

Bearing in mind that's it's rather specific to your case, this should do it. You can use 'Help > Scripting Index' and the editor tooltips to help understand how it works.

```
NamesDefaultToHere(1);
// Starting table
dt = DataTable("Example Datatable for Summarising Alert Ratings.jmp");
// Add required columns
dt << NewColumn("Date", Character, Formula(Short Date( :Logtime )));
dt << NewColumn("Group", Character, Formula((:System || " ") || :Date));
// Find the mode of :Decision Tree Prediction for each level in :Group, and fill in 'modeVals'
vals = Column(dt, "Decision Tree Prediction") << getValues;
gVals = Column(dt, "Group") << getValues;
levs = AssociativeArray(gVals) << getKeys;
modes = J(NItems(levs), 1, .);
modeVals = J(NRow(dt), 1, .);
for(i=1, i<=NItems(levs), i++,
rows = Loc(AsList(gVals), levels[i]);
modes[i] = Mode(vals[rows]);
modeVals[rows] = modes[i];
);
// Build the final column in dt using 'modeVals'
vl = Column(dt, "Decision Tree Prediction") << getProperty("Value Labels");
nc = dt << NewColumn("Modes for Decision Tree Prediction", Numeric, Nominal, Values(modeVals));
nc << valueLabels(Arg(vl, 1));
```

10 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 4:50 AM
(1437 views)

Jim

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 5:07 AM
(1436 views)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 5:12 AM
(1433 views)

If both columns are numeric, you could do it this way:

```
NamesDefaultToHere(1);
ClearLog();
// Given a set of values ('vals') and group membership ('gVals'),
// returns the distinct levels in 'gVals' and the mode of each group
modeWithBy =
Function({vals, gVals}, {Default Local},
levs = AssociativeArray(gVals) << getKeys;
modes = J(NItems(levs), 1, .);
for(i=1, i<=NItems(levs), i++, modes[i] = Mode(vals[Loc(AsList(gVals), levs[i])]));
EvalList({levs, modes});
);
l = [1, 2, 1, 1, 2, 2, 3];
g = [1, 1, 1, 1, 2, 2, 2];
Print(modeWithBy(l, g));
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 5:43 AM
(1428 views)

Also I'd like these values to be merged back into the datatable so you can see what the predicted alert is on a day by day basis

Thanks for your help

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 5:44 AM
(1427 views)

@ian_jmp wrote:If both columns are numeric, you could do it this way:

`NamesDefaultToHere(1); ClearLog(); // Given a set of values ('vals') and group membership ('gVals'), // returns the distinct levels in 'gVals' and the mode of each group modeWithBy = Function({vals, gVals}, {Default Local}, levs = AssociativeArray(gVals) << getKeys; modes = J(NItems(levs), 1, .); for(i=1, i<=NItems(levs), i++, modes[i] = Mode(vals[Loc(AsList(gVals), levs[i])])); EvalList({levs, modes}); ); l = [1, 2, 1, 1, 2, 2, 3]; g = [1, 1, 1, 1, 2, 2, 2]; Print(modeWithBy(l, g));`

The group "Sample" can be made to be continuous, but it is definitely a nominal data set. The group System is not able to be made continuous without using value labels and also is a nominal data set. How would this then related to organise it to find the mode over all of the groups (System and Sample) followed by organising the data into a day by day basis for each combination of them

Also I'd like these values to be merged back into the datatable so you can see what the predicted alert is on a day by day basis

Thanks for your help

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 6:54 AM
(1419 views)

If I understand correctly, you want to define a group for each value of 'System' combined with the 'date part' of 'Logtime' (maybe use the 'ShortDate()' function to get this into a new column).

If that's the case, you could either generalise the code above to work with more than one 'by' variable, else (perhaps simpler) just concatenate two character columns together and use this to define the groups. Finally, as you point out, you would need to adapt the code to work when 'gVals' is character. Maybe:

```
NamesDefaultToHere(1);
ClearLog();
// Given a set of values ('vals') and group membership ('gVals'),
// returns the distinct levels in 'gVals' and the mode of each group
modeWithBy =
Function({vals, gVals}, {Default Local},
levs = AssociativeArray(gVals) << getKeys;
modes = {};
for(i=1, i<=NItems(levs), i++, InsertInto(modes, Mode(vals[Loc(AsList(gVals), levs[i])])));
EvalList({levs, modes});
);
l = [1, 2, 1, 1, 2, 2, 3];
g = {"1", "1", "1", "1", "2", "2", "2"};
Print(modeWithBy(l, g));
```

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 7:21 AM
(1416 views)

So effectively the logic looks like this:

for each Sample -> Split them up into days (could use the datetime column or the column I have already created which has for each row the number of days since the start of this particular sample. Yes thanks, using shortcode might be a way to make the grouping of days easier)

Then for each sample look at the column which is predicting the Alert state and then for every day of that sample look at every prediction that is made and find the mode value for it (which is ordinal). Then take that output mode value and inset it back into the datatable on each row of a new column (hence why I originally wanted it as a formula) for that day of that sample. I think it's clearer when you read this looking at the datatable

Does that help make a bit more sense?

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 7:22 AM
(1415 views)

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 20, 2017 11:04 AM
(2790 views)

Bearing in mind that's it's rather specific to your case, this should do it. You can use 'Help > Scripting Index' and the editor tooltips to help understand how it works.

```
NamesDefaultToHere(1);
// Starting table
dt = DataTable("Example Datatable for Summarising Alert Ratings.jmp");
// Add required columns
dt << NewColumn("Date", Character, Formula(Short Date( :Logtime )));
dt << NewColumn("Group", Character, Formula((:System || " ") || :Date));
// Find the mode of :Decision Tree Prediction for each level in :Group, and fill in 'modeVals'
vals = Column(dt, "Decision Tree Prediction") << getValues;
gVals = Column(dt, "Group") << getValues;
levs = AssociativeArray(gVals) << getKeys;
modes = J(NItems(levs), 1, .);
modeVals = J(NRow(dt), 1, .);
for(i=1, i<=NItems(levs), i++,
rows = Loc(AsList(gVals), levels[i]);
modes[i] = Mode(vals[rows]);
modeVals[rows] = modes[i];
);
// Build the final column in dt using 'modeVals'
vl = Column(dt, "Decision Tree Prediction") << getProperty("Value Labels");
nc = dt << NewColumn("Modes for Decision Tree Prediction", Numeric, Nominal, Values(modeVals));
nc << valueLabels(Arg(vl, 1));
```