Re: [sage-devel] multithreading performance issues

2016-10-06 Thread Jean-Pierre Flori
On Thursday, October 6, 2016 at 1:39:05 PM UTC+2, Jonathan Bober wrote: > > I understand the reasons why OpenBLAS shouldn't be multithreading > everything, and why it shouldn't necessarily use all available cpu cores > when it does do multithreading, but the point is: it currently uses all or >

Re: [sage-devel] multithreading performance issues

2016-10-06 Thread Jonathan Bober
I understand the reasons why OpenBLAS shouldn't be multithreading everything, and why it shouldn't necessarily use all available cpu cores when it does do multithreading, but the point is: it currently uses all or one, and sometimes it decides to use multithreading even when using 2 threads doesn't

Re: [sage-devel] multithreading performance issues

2016-10-05 Thread Clement Pernet
To follow up on Jean-Pierre's summary of the situation: The current version of fflas-ffpack in Sage (v2.2.2) uses the BLAS provided as is. Running it with a multithreaded BLAS may result in slower code than with a single-threaded BLAS. This is very likely due to memory transfer and coherence pr

Re: [sage-devel] multithreading performance issues

2016-10-05 Thread Jean-Pierre Flori
Currently OpenBLAS does what it wants for multithreading. We hesitated to disable it but preferred to wait and think about it: see https://trac.sagemath.org/ticket/21323. You can still influence its use of threads by setting OPENBLAS_NUM_THREADS. See the trac ticket, just note that this is not Sage sp
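
For illustration only, here is a minimal Python sketch of capping OpenBLAS threads through the environment, as suggested above. OPENBLAS_NUM_THREADS is the variable named in the message; the helper name env_with_blas_threads is hypothetical. The sketch assumes the variable has to be in place before the process that loads OpenBLAS starts, so it builds an environment for a child process rather than changing a running session.

```python
import os


def env_with_blas_threads(num_threads, base_env=None):
    """Return a copy of an environment with OPENBLAS_NUM_THREADS set.

    Hypothetical helper: the caller's mapping is copied, not modified.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    # Copy so that neither os.environ nor base_env is mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["OPENBLAS_NUM_THREADS"] = str(num_threads)
    return env


# Example: an environment that restricts OpenBLAS to a single thread.
single_threaded_env = env_with_blas_threads(1)
```

The resulting dictionary can be passed as the env argument of subprocess.run when launching the computation. From a shell, the equivalent is to prefix the command with the assignment, in the same way the original report starts Sage with OMP_NUM_THREADS=1.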

Re: [sage-devel] multithreading performance issues

2016-10-05 Thread Thierry Dumont
What is the size of the matrix you use? Whatever you do, OpenMP in BLAS is worthwhile only if you compute with large matrices. If your computations are embedded in an @parallel and launch n processes, make sure that your OMP_NUM_THREADS is less than or equal to ncores/n. My experience is (I am do
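
The rule of thumb above (OMP_NUM_THREADS at most ncores/n when n processes run at once) can be written down as a small calculation. This is only a sketch: the function name threads_per_process is hypothetical, and the core count would normally come from the machine, for example via os.cpu_count().

```python
def threads_per_process(ncores, nprocesses):
    """Largest per-process thread count t with t * nprocesses <= ncores.

    Hypothetical helper. When there are more processes than cores the
    quotient is 0, so the result is clamped to 1; in that case the
    machine is oversubscribed whatever the thread setting is.
    """
    if ncores < 1 or nprocesses < 1:
        raise ValueError("ncores and nprocesses must be positive")
    return max(1, ncores // nprocesses)


# Example: 64 cores shared by 8 parallel workers leaves 8 threads each.
budget = threads_per_process(64, 8)
```

The integer division rounds down on purpose, so that the total number of threads never exceeds the number of cores as long as there are at least as many cores as processes.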

Re: [sage-devel] multithreading performance issues

2016-10-04 Thread Jonathan Bober
I've done a few more tests finding bad performance (and some decent improvements with a few threads). Also, I double-checked that the default behavior for me seems to be the same as setting OMP_NUM_THREADS=64. I wonder if others who have a recent development version of Sage see similar results. I'm

Re: [sage-devel] multithreading performance issues

2016-10-04 Thread Jonathan Bober
On Tue, Oct 4, 2016 at 9:03 PM, William Stein wrote: > On Tue, Oct 4, 2016 at 12:58 PM, Jonathan Bober wrote: > > No, in 7.3 Sage isn't multithreading in this example: > > > > jb12407@lmfdb1:~$ sage73 > > sage: %time M = ModularSymbols(5113, 2, -1) > > CPU times: user 599 ms, sys: 25 ms, total:

Re: [sage-devel] multithreading performance issues

2016-10-04 Thread William Stein
On Tue, Oct 4, 2016 at 12:58 PM, Jonathan Bober wrote: > No, in 7.3 Sage isn't multithreading in this example: > > jb12407@lmfdb1:~$ sage73 > sage: %time M = ModularSymbols(5113, 2, -1) > CPU times: user 599 ms, sys: 25 ms, total: 624 ms > Wall time: 612 ms > sage: %time S = M.cuspidal_subspace().

Re: [sage-devel] multithreading performance issues

2016-10-04 Thread Jonathan Bober
No, in 7.3 Sage isn't multithreading in this example: jb12407@lmfdb1:~$ sage73 sage: %time M = ModularSymbols(5113, 2, -1) CPU times: user 599 ms, sys: 25 ms, total: 624 ms Wall time: 612 ms sage: %time S = M.cuspidal_subspace().new_subspace() CPU times: user 1.32 s, sys: 89 ms, total: 1.41 s Wall

Re: [sage-devel] multithreading performance issues

2016-10-04 Thread Francois Bissey
OpenMP is disabled in linbox/fflas-ffpack so it must come from somewhere else. Only R seems to be linked to libgomp (OpenMP) on my vanilla install. Curiosity: do you observe the same behaviour in 7.3? François > On 5/10/2016, at 07:26, Jonathan Bober wrote: > > See the following timings: If I s

[sage-devel] multithreading performance issues

2016-10-04 Thread Jonathan Bober
See the following timings: If I start Sage with OMP_NUM_THREADS=1, a particular computation takes 1.52 cpu seconds and 1.56 wall seconds. The same computation without OMP_NUM_THREADS set takes 12.8 cpu seconds and 1.69 wall seconds. This is particularly devastating when I'm running with @parallel
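
To make the reported numbers easier to compare, the following sketch computes the ratio of CPU time to wall time for both runs, using only the figures quoted in the message. A ratio near 1 means roughly one busy core; a larger ratio is a rough estimate of how many cores were kept busy on average. The function name cpu_to_wall_ratio is hypothetical.

```python
def cpu_to_wall_ratio(cpu_seconds, wall_seconds):
    """Average number of busy cores implied by a CPU/wall time pair."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return cpu_seconds / wall_seconds


# Figures from the message: OMP_NUM_THREADS=1 versus the variable unset.
single = cpu_to_wall_ratio(1.52, 1.56)
default = cpu_to_wall_ratio(12.8, 1.69)

# CPU seconds spent by the default run beyond the single-threaded run.
extra_cpu = 12.8 - 1.52

print(f"single-threaded ratio: {single:.2f}")
print(f"default ratio:         {default:.2f}")
print(f"extra CPU seconds:     {extra_cpu:.2f}")
```

With these figures the default run keeps between seven and eight cores busy on average and still finishes slightly later than the single-threaded run.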