Very slow loop behaviour

Nov 21, 2019 1:59 AM

I have a question regarding time spent in loops.

I am analyzing large amounts of production data, typically 1 million rows.

For each row, production captures the status (column Statusnummer).

If the status is "1", then I would like to calculate the values for actual ("ist") and set value ("soll").

So I need to process 1 million rows, and I do this many times in my code for different values.

Sometimes the loops are fast (60 seconds); sometimes they are very (!) slow (more than one day, with the processor still working at full load!).

All the columns are set to numeric and continuous, but I cannot figure out why some loops are very slow.

I even checked whether the cells contain missing values (see sample code); the code still remains slow.

Any idea what I am doing wrong?

Marten

Sample of the slow code:

```
// Visit every row; only rows with status 1 and non-missing inputs are updated
for(i=1,i<=NRows(dt),i++,(
	if(dt:Statusnummer[i] == 1,(
		if(!isMissing(dt:Drehzahl[i]) & !isMissing(dt:geschw[i]) ,(
			ist = dt:Drehzahl[i];       // actual value
			soll = dt:SollDrehzahl[i];  // set value
			dt:Soll_drehzahl[i] = soll;
			dt:Ist_drehzahl[i] = ist;
			dt:Maschinenleistung[i] = 100*ist/soll;
		));
	));
));
```

1 ACCEPTED SOLUTION

Perhaps you have another report open that is updating for each change you make to the table. You could do this before and after your loop:

```
dt << Begin Data Update;
// the loop
dt << End Data Update;
```

Craige
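For readers who want to see the suggestion applied to the loop from the question, here is a minimal sketch. It assumes the production table is the current data table (the `Current Data Table()` line is an assumption, not part of the original post) and reuses the column names from the question. It has not been timed against the original data, so treat it as an illustration rather than a guaranteed fix.

```
Names Default To Here( 1 );
dt = Current Data Table(); // assumption: the production table is the active table

dt << Begin Data Update; // hold back display updates while the rows are written
For( i = 1, i <= N Rows( dt ), i++,
	If( dt:Statusnummer[i] == 1,
		If( !Is Missing( dt:Drehzahl[i] ) & !Is Missing( dt:geschw[i] ),
			ist = dt:Drehzahl[i];
			soll = dt:SollDrehzahl[i];
			dt:Soll_drehzahl[i] = soll;
			dt:Ist_drehzahl[i] = ist;
			dt:Maschinenleistung[i] = 100 * ist / soll;
		)
	)
);
dt << End Data Update; // release the hold so open displays refresh once
```

If the loop can stop with an error, the `End Data Update` message is never sent, so it may be worth making sure that message is sent on every exit path.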

9 REPLIES

Re: Very slow loop behaviour

Have you tried building those columns in the data table using the Formula Editor instead of looping with JSL? That may be more efficient.

-Jeff
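As an illustration of the formula approach, the sketch below creates a formula column and then converts it to static values so that it is not re-evaluated later. The column name `Maschinenleistung_formula` is hypothetical (chosen so that an existing `Maschinenleistung` column is left alone), and `Current Data Table()` is an assumption about how the table is referenced.

```
Names Default To Here( 1 );
dt = Current Data Table(); // assumption: the production table is the active table

// Hypothetical column name; rows that fail the condition are left missing
col = dt << New Column( "Maschinenleistung_formula", Numeric, Continuous,
	Formula(
		If( :Statusnummer == 1 & !Is Missing( :Drehzahl ) & !Is Missing( :geschw ),
			100 * :Drehzahl / :SollDrehzahl
		)
	)
);
dt << Run Formulas;    // make sure pending formula evaluations have finished
col << Delete Formula; // keep the computed values as plain data
```

Deleting the formula after evaluation keeps the result out of later re-evaluation, which may suit a data cleaning section where live formulas are unwanted.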

Re: Very slow loop behaviour

Dear Jeff, thanks for the answer. Formulas at that place in the script do not help. (This is my sorting and data cleaning section, where I want to avoid formulas.)

I am running into the same problem in different places, and some formulas were affected as well.

It seems that the process freezes almost completely (I counted 1 row/second!) although the CPU continues at full load!

The result, however, is correct once finished.

Yours, Marten

Re: Very slow loop behaviour

Hi @MWalther

I would try to do this as Jeff suggested, using column formulas. Don't forget you can always remove the formulas, or just suppress their evaluation, once you no longer want them to be constantly re-evaluated.

If you insist on using the for loop method, I would first see if the last few semicolons are necessary.

Furthermore, I would try to simplify the nested if statement into one longer list of conditions.

Finally, use the debugger on a subsample and see what in the loop takes the most time. I get the feeling it is doing something unintended that is taking most of the time.

Let us know if anything improves.

Re: Very slow loop behaviour

Hi Ron, thanks. The nested conditions are intended to make it faster, because the first condition fails in many rows, so that the loop can (and does) start again from the top with the next row. So functionally everything works. However, I rechecked where it gets slow. In the debugger, the data table access to write back the values took a long time:

dt:Soll_drehzahl[i] = soll;
dt:Ist_drehzahl[i] = ist;
dt:Maschinenleistung[i] = 100*ist/soll;

So I commented them out (//) ==> result: very fast.

I removed the comment marks (//) and... still very fast (!) (same code, same structure...).

So it works now, but I still do not know what caused the problem. Maybe something in my system is corrupt.

Thanks for the suggestions.

Re: Very slow loop behaviour

This might speed things up - restrict your loop to only the rows meeting your conditions.

```
// Collect the row numbers that meet all three conditions
found_rows = dt << get rows where(dt:Statusnummer == 1 &
	!Is Missing(dt:Drehzahl) &
	!Is Missing( dt:geschw ));
// Loop over the matching rows only
for (i = 1, i <= nrows(found_rows), i++,
	k = found_rows[i];
	dt:Soll_drehzahl[k] = dt:SollDrehzahl[k];
	dt:Ist_drehzahl[k] = dt:Drehzahl[k];
	dt:Maschinenleistung[k] = 100 * dt:Ist_drehzahl[k] / dt:Soll_drehzahl[k] ;
);
```

Re: Very slow loop behaviour

Thanks. This was probably the reason, since many reports were open (see my comment above: the problem has vanished in the meantime), but I cannot reproduce it right now. I will introduce your solution in my code and see if it is the permanent fix.

Marten

Re: Very slow loop behaviour

**Update**: I added the get rows where method of finding the rows. Obviously this has the restriction of requiring numeric data, but I sometimes am able to get around that.

I'm a big fan of doing everything with matrices.

```
Names default to here(1);
// Build a 1,000,000-row test table; Drehzahl and geschw are missing about half the time
dt = New Table("Test",
	Add Rows(1000000),
	New Column("Statusnummer", Set Each Value(Random Integer(0, 1))),
	New Column("Drehzahl", Set Each Value(Choose(Random Integer(1, 2), ., Random Normal()))),
	New Column("geschw", Set Each Value(Choose(Random Integer(1, 2), ., Random Normal()))),
	New Column("SollDrehzahl", Set Each Value(Random Normal())),
	New Column("Soll_drehzahl"),
	New Column("Ist_drehzahl"),
	New Column("Maschinenleistung"),
);

// Method 1: row-by-row loop
st = HPTime();
for(i=1,i<=NRows(dt),i++,
	if(dt:Statusnummer[i] == 1,
		if(!isMissing(dt:Drehzahl[i]) & !isMissing(dt:geschw[i]),
			ist = dt:Drehzahl[i];
			soll = dt:SollDrehzahl[i];
			dt:Soll_drehzahl[i] = soll;
			dt:Ist_drehzahl[i] = ist;
			dt:Maschinenleistung[i] = 100*ist/soll;
		);
	);
);
t_loop = HPTime() - st;

// Method 2: matrices
st = HPTime();
m = dt[0, {"Statusnummer", "Drehzahl", "geschw"}]; // makes a matrix
//check the matrix
v = m[0, 1] // don't need to do a check if it's 0 or 1
	&
	!ismissing(m[0, 2]) // check if Drehzahl is missing
	&
	!ismissing(m[0, 3]); // check if geschw is missing
rows = loc(v); // this is the rows you want to do stuff on now.
//now just set all at once
soll = dt[rows, "Soll_drehzahl"] = dt:SollDrehzahl[rows]; // assign the rows and variable simultaneously
ist = dt[rows, "Ist_drehzahl"] = dt:Drehzahl[rows]; // same thing
dt[rows, "Maschinenleistung"] = 100 * ist :* (1/soll); // do an element wise matrix division
t_mat = HPTime() - st;

// Method 3: get rows where, then matrix assignment
st = HPTime();
rows = dt << get rows where(dt:Statusnummer == 1 &
	!Is Missing(dt:Drehzahl) &
	!Is Missing( dt:geschw ));
//now just set all at once
soll = dt[rows, "Soll_drehzahl"] = dt:SollDrehzahl[rows]; // assign the rows and variable simultaneously
ist = dt[rows, "Ist_drehzahl"] = dt:Drehzahl[rows]; // same thing
dt[rows, "Maschinenleistung"] = 100 * ist :* (1/soll); // do an element wise matrix division
t_get_where = HPTime() - st;

show(t_loop, t_mat, t_get_where);
```

When I ran this comparison, I got

t_loop = 11259288;

t_mat = 206942;

**Updated**

```
t_loop = 11950585;
t_mat = 205366;
t_get_where = 6742823;
```

Vince Faller - Predictum
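To put the updated timings above in perspective: `HPTime()` reports microseconds, so the numbers can be converted to seconds and compared directly. The short sketch below only redoes that arithmetic on the three posted values; it does not rerun the benchmark.

```
Names Default To Here( 1 );
// Timings copied from the updated results above (HPTime() reports microseconds)
t_loop = 11950585;
t_mat = 205366;
t_get_where = 6742823;

// Elapsed time in seconds
Show( t_loop / 1e6, t_mat / 1e6, t_get_where / 1e6 );

// Speedup relative to the row-by-row loop
Show( t_loop / t_mat, t_loop / t_get_where );
```

On these figures the matrix version is roughly 58 times faster than the loop, and the get rows where version roughly 1.8 times faster. The ratios will vary with the machine and with what else is open in JMP.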

Re: Very slow loop behaviour

@vince_faller Cool! I only see about 10X faster (which is still pretty amazing.) I suspect this speedup is unrelated to begin/end data update, but I also think the matrix assignments will do something like the begin/end data update internally.

Craige