Hi everybody,
There seems to be a performance problem with Parallel Assign() on certain systems.
I have access to three Windows 10 PCs running JMP 16.1:
- Laptop: Core i5-8350U (4 cores) @ 1.9 GHz, 16 GB RAM
- Old workstation: Xeon E5-1660 v4 (8 cores) @ 3.2 GHz, 256 GB RAM
- New workstation: Xeon W-2145 (8 cores) @ 3.7 GHz, 128 GB RAM
I compared the performance of the parallel and the sequential versions of my test script for different problem sizes on all three systems (script below).
The result was surprisingly bad: only on my laptop is the parallel version slightly faster than the sequential one. On the workstations it is slower, in some cases by more than a factor of 2.
Has anyone experienced similar performance issues with Parallel Assign()? Or am I using it wrong? What is going on here?
Rob
n_different_ids = 100000;
// Just generate random data. Every 15 rows must be processed at the same time.
DT_Data = J( 15*n_different_ids, 9, 0 );
For( i = 1, i <= n_different_ids, i++,
    DT_Data[15 * (i - 1) + (1 :: 15), 0] = i + J( 15, 9, Random Normal() );
);
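// Percentile of x at probability p, using linear interpolation
// between the two neighboring sorted values
F_percentile = Function( {x, p},
    x = Sort Ascending( x );
    n = N Rows( x );
    index = 1 + (n - 1) * p; // fractional position in the sorted vector
    index_ibelow = Floor( index );
    index_iabove = Ceiling( index );
    h = index - index_ibelow; // interpolation weight of the upper neighbor
    result = (1 - h) * x[index_ibelow] + h * x[index_iabove];
    result;
);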
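// Quick sanity check of F_percentile on a small hand-made vector (added for
// illustration only; not part of the benchmark, and check_x is a hypothetical name).
// With n = 5 and p = 0.95 the fractional index is 1 + 4 * 0.95 = 4.8, so the
// result interpolates between the 4th and 5th sorted values:
// 0.2 * 40 + 0.8 * 50 = 48. Likewise p = 0.05 gives index 1.2 and
// 0.8 * 10 + 0.2 * 20 = 12 (both up to floating-point rounding).
check_x = [10, 20, 30, 40, 50]; // commas separate rows, so this is a 5x1 column vector
Show( F_percentile( check_x, 0.95 ), F_percentile( check_x, 0.05 ) );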
// parallel version
DT_Features = J( n_different_ids, 16, 0 );
start = Tick Seconds();
Parallel Assign(
    {
        DT_Data = DT_Data,
        F_percentile = Name Expr( F_percentile )
    },
    DT_Features[i, j] = (
        i_same_group = Loc( DT_Data[15 * (i - 1) + (1 :: 15), 1] );
        data = DT_Data[15 * (i - 1) + i_same_group, 1 + Ceiling( j / 2 )];
        If( Mod( j, 2 ),
            result = F_percentile( data, 0.95 );
        ,
            result = F_percentile( data, 0.05 );
        );
        result;
    )
);
time_parallel = Tick Seconds() - start;
Show( time_parallel );
Wait( 0.1 );
// sequential version
DT_Features = J( n_different_ids, 16, 0 );
start = Tick Seconds();
For( i = 1, i <= n_different_ids, i++,
    For( j = 1, j <= 16, j++,
        i_same_group = Loc( DT_Data[15 * (i - 1) + (1 :: 15), 1] );
        data = DT_Data[15 * (i - 1) + i_same_group, 1 + Ceiling( j / 2 )];
        If( Mod( j, 2 ),
            result = F_percentile( data, 0.95 );
        ,
            result = F_percentile( data, 0.05 );
        );
        DT_Features[i, j] = result;
    );
);
time_sequential = Tick Seconds() - start;
Show( time_sequential );