MWalther
Level II

Very slow loop behaviour

I have a question regarding time spent in loops.

I am analyzing large amounts of production data, typically 1 million rows.

For each row, production captures the status (column Statusnummer).
If the status is "1", then I would like to calculate the values for actual ("ist") and set value ("soll").

So I need to calculate 1 million rows. I am doing this many times in my code, for different values.

Sometimes the loops are fast (60 seconds), sometimes they are very (!) slow (more than one day, with the processor still working at full load!).

All the columns are set to numeric and continuous, but I cannot figure out why some loops are very slow.

I even checked whether the cells contain missing values (see sample code); the code still remains slow.

Any idea what I am doing wrong?

Marten

Sample of the code that is slow:

 

for(i=1,i<=NRows(dt),i++,(				
	if(dt:Statusnummer[i] == 1,(
		if(!isMissing(dt:Drehzahl[i]) & !isMissing(dt:geschw[i])  ,(
			ist  = dt:Drehzahl[i];
			soll = dt:SollDrehzahl[i];
			dt:Soll_drehzahl[i] = soll;
			dt:Ist_drehzahl[i] = ist;
			dt:Maschinenleistung[i] = 100*ist/soll;	
			));
		));
	));
1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales
Super User

Re: Very slow loop behaviour

Perhaps you have another report that is open, and is updating for each change you make to the table. You could do this before and after your loop:

dt << Begin Data Update;
// the loop
dt << End Data Update;
Craige
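
A minimal sketch of how this suggestion might be applied to the loop from the question. It assumes that dt refers to the production table (taken here from Current Data Table(), which is an assumption) and that the three output columns already exist. The two nested conditions are merged into one only for brevity.

```jsl
Names Default To Here( 1 );
dt = Current Data Table(); // assumption: the production table is the active table

dt << Begin Data Update; // turn off display updating while the table is modified
For( i = 1, i <= N Rows( dt ), i++,
	If( dt:Statusnummer[i] == 1 & !Is Missing( dt:Drehzahl[i] ) & !Is Missing( dt:geschw[i] ),
		ist = dt:Drehzahl[i];
		soll = dt:SollDrehzahl[i];
		dt:Soll_drehzahl[i] = soll;
		dt:Ist_drehzahl[i] = ist;
		dt:Maschinenleistung[i] = 100 * ist / soll;
	)
);
dt << End Data Update; // resume display updating
```

If the loop body can throw an error, it may be worth wrapping it in Try() so that End Data Update is still sent afterwards.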


9 REPLIES
Jeff_Perkinson
Community Manager

Re: Very slow loop behaviour

Have you tried building those columns in the data table using the Formula Editor instead of looping with JSL? That may be more efficient.

 

 

-Jeff
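
As a rough illustration of the formula route, the calculation could be expressed as a column formula like the one below. This is only a sketch: the column name Maschinenleistung_f is hypothetical (chosen so it does not clash with the existing column), and dt is assumed to be the current data table.

```jsl
Names Default To Here( 1 );
dt = Current Data Table(); // assumption: the production table is the active table

// Hypothetical new column; rows that fail the condition are left missing
dt << New Column( "Maschinenleistung_f", Numeric, "Continuous",
	Formula(
		If( :Statusnummer == 1 & !Is Missing( :Drehzahl ) & !Is Missing( :geschw ),
			100 * :Drehzahl / :SollDrehzahl
		)
	)
);
```

A formula column is evaluated by JMP for the whole column, so no explicit row loop is needed in the script.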
MWalther
Level II

Re: Very slow loop behaviour

Dear Jeff, thanks for the answer. Formulas do not help at that point in the script. (This is my sorting and data cleaning section, where I want to avoid formulas.)
I am running into the same problem in different places, and some formulas were affected as well.

It seems that the process freezes almost completely (I counted 1 row/second!) although the CPU remains under load.
The result, however, is correct once it finishes.

Yours Marten


ron_horne
Super User (Alumni)

Re: Very slow loop behaviour

Hi @MWalther 

I would try to do this as Jeff suggested, using column formulas. Don't forget you can always remove the formulas, or just suppress their evaluation, once you no longer want them to be constantly re-evaluated.

If you insist on using the for loop method, I would first see if the last few semicolons are necessary.

Furthermore, I would try to simplify the nested if statement into one longer list of conditions.

Finally, try using the debugger on a subsample to see what in the loop takes the most time. I get a feeling it is doing something unintended that is taking most of the time.

Let us know if anything improves.
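
A small sketch of the "remove the formula afterwards" idea, using a three-row demo table. The table name, the column name Leistung and the values are made up for illustration and are not from the original data.

```jsl
Names Default To Here( 1 );

// Hypothetical demo table with one formula column
dt = New Table( "Formula demo",
	Add Rows( 3 ),
	New Column( "Drehzahl", Numeric, "Continuous", Set Values( [90, 100, 110] ) ),
	New Column( "SollDrehzahl", Numeric, "Continuous", Set Values( [100, 100, 100] ) ),
	New Column( "Leistung", Numeric, "Continuous", Formula( 100 * :Drehzahl / :SollDrehzahl ) )
);

dt << Run Formulas; // make sure pending formula evaluations have finished
Column( dt, "Leistung" ) << Delete Formula; // keep the computed values, drop the formula

// Alternative: keep the formula but stop it from re-evaluating
// Column( dt, "Leistung" ) << Suppress Eval( 1 );
```

After Delete Formula the computed values stay in the column as ordinary data, so later edits to the table no longer trigger a recalculation.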

MWalther
Level II

Re: Very slow loop behaviour

Hi Ron, thanks. The nested if statements are intended to make it faster, because the first condition fails on many rows, so the loop can (and does) move straight on to the next row. So functionally everything works. However, I rechecked where it gets slow. In the debugger, the data table accesses that write back the values took a long time:

dt:Soll_drehzahl[i] = soll;
dt:Ist_drehzahl[i] = ist;
dt:Maschinenleistung[i] = 100*ist/soll;

So I commented them out (//) ==> result: very fast

I erased the comment marks (//) and..... still very fast (!)  
.......(Same code, same structure,....)

So it works now, but I still do not know what caused the problem. Maybe
something in my system is corrupt.

Thanks for the suggestions.

pmroz
Super User

Re: Very slow loop behaviour

This might speed things up - restrict your loop to only the rows meeting your conditions.

found_rows = dt << get rows where(dt:Statusnummer == 1 & 
				  !Is Missing(dt:Drehzahl) & 
				  !Is Missing( dt:geschw ));

for (i = 1, i <= nrows(found_rows), i++,
	k = found_rows[i];
	dt:Soll_drehzahl[k] = dt:SollDrehzahl[k];
	dt:Ist_drehzahl[k]  = dt:Drehzahl[k];
	dt:Maschinenleistung[k] = 100 * dt:Ist_drehzahl[k] / dt:Soll_drehzahl[k] ;
);
MWalther
Level II

Re: Very slow loop behaviour

Thanks, this was probably the reason, since many reports were open (see my comment above; the problem has vanished in the meantime). I cannot reproduce it right now, though. I will add your solution to my code and see if it is the permanent fix.
Marten
vince_faller
Super User (Alumni)

Re: Very slow loop behaviour

Update: I added the get rows where method of finding the rows. Obviously this has the restriction of requiring numeric data, but I am sometimes able to get around that.

I'm a big fan of doing everything with matrices.

 

Names default to here(1);
dt = New Table("Test", 
	Add Rows(1000000), 
	New Column("Statusnummer", Set Each Value(Random Integer(0, 1))), 
	New Column("Drehzahl", Set Each Value(Choose(Random Integer(1, 2), ., Random Normal()))), 
	New Column("geschw", Set Each Value(Choose(Random Integer(1, 2), ., Random Normal()))), 
	New Column("SollDrehzahl", Set Each Value(Random Normal())), 
	New Column("Soll_drehzahl"), 
	New Column("Ist_drehzahl"), 
	New Column("Maschinenleistung"), 
);

st = HPTime();
for(i=1,i<=NRows(dt),i++,			
	if(dt:Statusnummer[i] == 1,
		if(!isMissing(dt:Drehzahl[i]) & !isMissing(dt:geschw[i]),
			ist  = dt:Drehzahl[i];
			soll = dt:SollDrehzahl[i];
			dt:Soll_drehzahl[i] = soll; 
			dt:Ist_drehzahl[i] = ist; 
			dt:Maschinenleistung[i] = 100*ist/soll;	
		);
	);
);
t_loop = HPTime() - st;

st = HPTime();

m = dt[0, {"Statusnummer", "Drehzahl", "geschw"}]; // makes a matrix

//check the matrix
v = m[0, 1] // no explicit == 1 check needed, since Statusnummer is only 0 or 1 here
& 
!ismissing(m[0, 2]) // check that Drehzahl is not missing
& 
!ismissing(m[0, 3]); // check that geschw is not missing

rows = loc(v); // this is the rows you want to do stuff on now.  

//now just set all at once
soll = dt[rows, "Soll_drehzahl"] = dt:SollDrehzahl[rows]; // assign the rows and variable simultaneously
ist = dt[rows, "Ist_drehzahl"] = dt:Drehzahl[rows]; // same thing
dt[rows, "Maschinenleistung"] = 100 * ist :* (1/soll);  // do an element wise matrix division

t_mat = HPTime() - st;

st = HPTime();

rows = dt << get rows where(dt:Statusnummer == 1 & 
	!Is Missing(dt:Drehzahl) & 
	!Is Missing( dt:geschw ));  

//now just set all at once
soll = dt[rows, "Soll_drehzahl"] = dt:SollDrehzahl[rows]; // assign the rows and variable simultaneously
ist = dt[rows, "Ist_drehzahl"] = dt:Drehzahl[rows]; // same thing
dt[rows, "Maschinenleistung"] = 100 * ist :* (1/soll);  // do an element wise matrix division

t_get_where = HPTime() - st;

show(t_loop, t_mat, t_get_where);

 

When I ran this comparison, I got

t_loop = 11259288;
t_mat = 206942;

Updated:

t_loop = 11950585;
t_mat = 205366;
t_get_where = 6742823;

 

Vince Faller - Predictum
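
To put the numbers above in perspective: HPTime() returns microseconds, so the updated run took roughly 12.0 s for the loop, 0.21 s for the matrix version and 6.7 s for the get rows where version. That makes the matrix version about 58 times faster than the loop and the get rows where version about 1.8 times faster. A short sketch of that arithmetic, with the values hard-coded from the updated results:

```jsl
Names Default To Here( 1 );

// Timings copied from the updated results above (HPTime values, in microseconds)
t_loop = 11950585;
t_mat = 205366;
t_get_where = 6742823;

// Convert to seconds
Show( t_loop / 1e6, t_mat / 1e6, t_get_where / 1e6 );

// Speedup of each alternative relative to the row-by-row loop
Show( t_loop / t_mat, t_loop / t_get_where );
```

These ratios apply only to this one run on the randomly generated test table; other machines and data will give different figures.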
Craige_Hales
Super User

Re: Very slow loop behaviour

@vince_faller Cool! I only see about 10X faster (which is still pretty amazing.) I suspect this speedup is unrelated to begin/end data update, but I also think the matrix assignments will do something like the begin/end data update internally.  

 

@EvanMcCorkle 

Craige