turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Update Column at Specified Rows

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 23, 2016 11:02 AM
(1041 views)

Hello,

I try to use the "For each row" and "Begin Data Update" script commands to update the calculation on specific row related to a column.

According to the table below, I intend to have two addition columns, Defective Rate (%) and Yield, and they both use the same equation of cell qty/daily total.

However, Defective Rate (%) column will only calculate the rows where Category = Reject & MRB, and Yield column will only calculate the rows where Category=Binned.

I can get this work out using IF command like

dt=Current data table();

dt<< New Column("Defective Rate(%)", Numeric, "Continuous", Formula(Name Expr(If(:Category == "Reject", "MRB" ,:Cell Qty/:Daily Total))));

However, I try to learn how to use the For each row combined with Begin Data Update to reach the same result.

dt<< New Column("Yield", Numeric, "Continuous");

<< Begin data update;

For each Row( :Yield_ = Selected(Row State())=Contains(:Category,"Binned"), :Cell Qty/:Daily Total)

Can anyone help out?

Thanks,

Category | WorkWeek | Date_PROD. | Cell Qty | Daily Total | Defective Rate(%) | Yield |

Reject | 25 | 2016-06-20 | 751 | 15553 | ||

MRB | 25 | 2016-06-20 | 21 | 15553 | 0.00135022 | |

Binned | 25 | 2016-06-20 | 14781 | 15553 | 0.95036327 | 0.95036327 |

Reject | 25 | 2016-06-21 | 1251 | 31586 | ||

MRB | 25 | 2016-06-21 | 83 | 31586 | 0.00262775 | |

Binned | 25 | 2016-06-21 | 30252 | 31586 | 0.9577661 | 0.9577661 |

Reject | 25 | 2016-06-22 | 1276 | 27964 | ||

MRB | 25 | 2016-06-22 | 42 | 27964 | 0.00150193 | |

Binned | 25 | 2016-06-22 | 26646 | 27964 | 0.95286797 | 0.95286797 |

Reject | 25 | 2016-06-23 | 126 | 1786 | ||

MRB | 25 | 2016-06-23 | 5 | 1786 | 0.00279955 | |

Binned | 25 | 2016-06-23 | 1655 | 1786 | 0.92665174 | 0.92665174 |

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Solution

Hi bernie426,

According to my personal experience, IF command complete its operations really quickly and will return you with instant results because they only deal with a yes or no situation. On the other hand, you will see some noticable delay before getting your results fully generated through looping since looping commands have to deal with iterations and they will keep operating until they reach the last iteration defined by the programmer.

These statements only apply if you are dealing with super huge data set that involves significantly large amount of rows and columns, which makes IF command a more superior option than looping commands in terms of time consumption. However for smaller sets of data, it is safe to use either one of the commands since the time delay for results to be populated is insignificant.

Best Regards,

Val

3 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 11, 2016 8:46 AM
(899 views)

Hi,

Based on the expected output in your question, I think there is a slight mistake in your formula for "Defective Rate(%)" column if you only wanted "Rejected" rows from "Category" column to be ignored.

I never tried using For each row or Begin data update before, so I am not sure how to fix your current codes using those commands. However I have 2 alternative approaches on how you can achieve the same output.

1) Using IF command (faster approach):

dt=Current data table();

a = (dt<< New Column("DR", "Continuous", Formula(If(:Category == "Reject", "", Char(:Cell Qty / :Daily Total)))))<<get values;

dt << New Column("Defective Rate(%)", "Continuous", set values(a));

b = (dt<< New Column("Y", "Continuous", Formula(If(:Category == "Binned", Char(:Cell Qty / :Daily Total), ""))))<<get values;

dt << New Column("Yield", "Continuous", set values(b));

dt<<delete columns({"DR","Y"});

Since some columns are meant to be returned as empty columns (""), the created columns will automatically set themselves to character columns even though you have already defined them as numeric when you create the columns. Hence I've created extra dummy columns to serve the calculation purposes and set these values into new columns ("Defective Rate(%)" and " Yield") that you intend to create. In this way, you will be getting numeric columns.

2) Using Get rows where() and FOR loops:

dt=Current data table();

dt << New Column("Defective Rate(%)", "Continuous", numeric);

r1 = dt << get rows where(:Category!="Reject");

for (i=1, i<=nrows(r1), i++,

Column(dt, "Defective Rate(%)")[r1*] = :Cell Qty[r1 ] / :Daily Total[r1]*

);

dt << New Column("Yield", "Continuous", numeric);

r2 = dt << get rows where(:Category=="Binned");

for (i=1, i<=nrows(r2), i++,

Column(dt, "Yield")[r2*] = :Cell Qty[r2 ] / :Daily Total[r2]*

);

I believe this approach will be slower if your data set is huge since looping takes longer processing time.

Final Output:

Hope this helps!

Best Regards,

Val

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jul 12, 2016 10:55 AM
(899 views)

Hi Val,

Thanks for the suggested solutions, they both work out well.

Actually, I have came up with another formula approach using IF command like

dt=Current data table();

dt<< New Column("Defective Rate(%)", Numeric, "Continuous", Formula(Name Expr(If((contains(:Category, "MRB")| contains(:Category, "Reject")),:Cell Qty/:Daily Total))));

dt<< New Column("Yield", Numeric, "Continuous", Formula(Name Expr(If(:Category == "Binned",:Cell Qty/:Daily Total))));

Do you know what is the speed difference b/w using IF command and looping command?

Many Thanks,

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Hi bernie426,

According to my personal experience, IF command complete its operations really quickly and will return you with instant results because they only deal with a yes or no situation. On the other hand, you will see some noticable delay before getting your results fully generated through looping since looping commands have to deal with iterations and they will keep operating until they reach the last iteration defined by the programmer.

These statements only apply if you are dealing with super huge data set that involves significantly large amount of rows and columns, which makes IF command a more superior option than looping commands in terms of time consumption. However for smaller sets of data, it is safe to use either one of the commands since the time delay for results to be populated is insignificant.

Best Regards,

Val