vince_faller
Super User (Alumni)

CPU performance: Clock Speed or Cores

When running JMP, is it better to have more cores at a lower frequency, or fewer cores at a higher frequency? I thought very few of the platforms were threaded, which would make clock speed more important, but I'm not sure that is still true. I know this is a general question that is next to impossible to answer accurately. Thanks anyway.

Vince Faller - Predictum
Accepted Solution
Craige_Hales
Super User

Re: CPU performance: Clock Speed or Cores

How well JMP uses the available processors depends on which analysis platform is running and the data being processed. Here's JMP running Text Explorer during the text parsing phase, on long documents, with number of rows > number of processors. (Other platforms will behave differently and some will not be able to take advantage of threading. Text Explorer's parsing is something I've worked with recently.)

[Screenshot: 100% busy]

Text Explorer's algorithms are not always able to keep the machine 100% busy. Some work depends on other work being completed first, and some work that is threaded competes for resources (like memory access) in a way that makes threads wait.

dt = New Table( "bigtext",
    Add Rows( 1000 ),
    New Column( "text", Character, "Nominal", formula( Repeat( "aaa bbb ccc ", 1e5 ) ) )
);

dt << Text Explorer(
    Text Columns( :text )
);

Adding more CPUs has a limit; here all 32/64 CPUs look busy but the work is proceeding at about the same rate:

[Screenshot: 100% busy but not making progress much faster]

I suspect these CPUs are getting in each other's way. A GB of documents moving through memory has a lot of opportunity to churn the cache memory in the CPUs. Mostly JMP is developed on machines with 4 to 12 CPUs and that's likely to be what the algorithms are tuned for. For example, after Text Explorer finishes the parsing step, it combines the parsing results using half of the processors because we determined that was faster than using all of them...on an 8 CPU machine.

So, yes, 4-12 CPUs, faster CPUs, and enough memory to avoid paging. If you don't have enough memory to hold the data without paging to disk, nothing else will help.
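
One way to see why extra cores stop helping is Amdahl's law, which bounds the speedup when only part of the work can run in parallel. The sketch below is illustrative only: the parallel fraction p = 0.9 and the core counts are assumed values, not measurements of Text Explorer, and the formula ignores the cache and memory contention described above, which can make real scaling worse.

```jsl
Names Default To Here( 1 );

// Amdahl's law: speedup on n cores when a fraction p of the work is parallel.
// p = 0.9 is an assumed, illustrative value, not a measurement of JMP.
p = 0.9;
cores = {1, 4, 8, 12, 32, 64};

For( i = 1, i <= N Items( cores ), i++,
	n = cores[i];
	speedup = 1 / ((1 - p) + p / n);
	Show( n, speedup );
);
```

For any p below 1, the speedup can never exceed 1 / (1 - p), so past a modest core count a faster clock is likely to help more than additional cores.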

Craige


5 REPLIES
txnelson
Super User

Re: CPU performance: Clock Speed or Cores

Vince,

Most of the analytical platforms are multi-threaded, while JSL is mostly single-threaded, so it is a mixed case. I suggest opening Task Manager and going to its Performance tab. Then, when you run your typical job flow, you will clearly see when your applications are using multiple cores and when they are using a single core. From there, you can make your decision.

Jim
ih
Super User (Alumni)

Re: CPU performance: Clock Speed or Cores

I have not witnessed a single instance of JMP using more than 6 cores out of 12 on a Windows 7 desktop. It usually chugs along with 25% processor usage.
txnelson
Super User

Re: CPU performance: Clock Speed or Cores

My findings are different, which may be due to the platforms I have been testing with. When I run a Random Forest, it easily runs on all CPUs, on both Windows 7 and Windows 10, taking up 35%+ of total CPU capability on average. I only have 8 processors, not your wonderful total of 12, but all 8 are working on the problem.

Jim
ih
Super User (Alumni)

Re: CPU performance: Clock Speed or Cores

@vince_faller: to answer your original question, my personal recommendation for optimizing one copy of JMP (on Windows) is to get the fastest 4 or 6 processors you can.

@txnelson: I should have been clearer. JMP often puts some load on all of the cores, but it does not use them all to capacity. That leads me to believe that JMP is running with fewer than 8 or 12 actual 'worker' threads, and that the operating system moves the processing from core to core.

This property of JMP bothers me, so I spent some time testing my theory that JMP only uses a fraction of the total cores. Here are the results of concurrently running 1, 2, 3, or 4 copies of the script below, each in its own instance of JMP.

 

One Instance: 19-21 seconds
[Image: CPU 1.PNG]

Two Instances: 26-28 seconds
[Image: CPU 2.PNG]

Three Instances: 34-39 seconds
[Image: CPU 3.PNG]

Four Instances: 43-47 seconds
[Image: cpu 4.PNG]

 

The time to run two copies at once is certainly less than the time to run one twice:

 

[Image: cpu times.png]
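
As a rough check on that claim, the elapsed times listed above can be converted to seconds per completed job. This is only a sketch: it uses the midpoint of each reported range (an assumption, since only ranges were given), and the variable names are made up for illustration.

```jsl
Names Default To Here( 1 );

// Midpoints of the elapsed-time ranges reported above, in seconds,
// for 1, 2, 3, and 4 concurrent instances (midpoints are an assumption).
seconds = {20, 27, 36.5, 45};

For( k = 1, k <= N Items( seconds ), k++,
	perJob = seconds[k] / k;             // wall-clock seconds per completed job
	gain = seconds[1] * k / seconds[k];  // throughput relative to one instance
	Show( k, perJob, gain );
);
```

With these midpoints, two jobs run side by side finish in about 27 seconds, against about 40 seconds for running one job twice in a row.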

 

Names Default To Here( 1 );

// Open the sample table invisibly and add a validation column with a fixed random seed.
dt = Open( "$Sample_data/probe.jmp", Private );
Random Reset( 1 );
valcol = dt << New Column( "Validation",
	Numeric, "Nominal", Set Formula( Random Category( 0.75, 0, 0.25, 1, 2 ) ),
	Value Labels( {0 = "Training", 1 = "Validation", 2 = "Test"} ), Use Value Labels( 1 )
);
// Evaluate the formula once, then remove it so the assignments stay fixed.
dt << Run Formulas;
valcol << Delete Formula;

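// Fit one Bootstrap Forest and return the elapsed wall-clock time in seconds.
timeboot = Function( {},
	start = Today();
	// Build a comma-separated list of the predictor columns (columns 8 through 394).
	xvars = Substitute( Char( (dt << Get Column Names)[8 :: 394] ), "{", "", "}", "" );
	
	// Splice the column list into the launch script, then parse and run it.
	Eval( Parse(
		"rf = dt << Bootstrap Forest(
			Y( :Process ),
			X( " || xvars || " ),
			Validation Portion( :Validation ),
			Method( \!"Bootstrap Forest\!" ),
			Portion Bootstrap( 1 ),
			Number Terms( 30 ),
			Number Trees( 2000 ),
			Early Stopping( 0 ),
			Go
		);"
	));
	
	rf << Close Window;
	end = Today();
	Return( end - start );
);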

win = New window("Time Bootstrap",
	V List Box(
		r = text box( "", << Set Width( 100 ) ),
		H List Box(
			Button Box( "Start",  r << Set Text( Char( timeboot() ) || " seconds" ) ),
			Button Box( "Close", dt << Close Window; win << Close Window; )
		)
	)
);
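
The script above times each run with Today(), which as far as I know reports whole seconds. For shorter jobs, a finer-grained timer may be preferable. Here is a minimal sketch using Tick Seconds(); the Wait( 0.5 ) call is only a stand-in for the platform launch, not part of the original test.

```jsl
Names Default To Here( 1 );

t0 = Tick Seconds();          // seconds since system start, with sub-second resolution
Wait( 0.5 );                  // placeholder for the work being timed
elapsed = Tick Seconds() - t0;
Show( elapsed );
```

For runs of 20 seconds or more, as in the results above, the one-second resolution of Today() makes little practical difference.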