Hi Mark and Szilárd,
Thanks for both of your suggestions. They are very helpful.
>
> Neither run had a PP-PME work distribution suitable for the hardware it
> was running on (and fixing that for each run requires opposite changes).
> Adding a GPU and hoping to see scaling requires that there
As Mark said, please share the *entire* log file. Among other
important things, the result of the PP-PME tuning is not included above.
However, I suspect that in this case scaling is strongly affected by
the small size of the system you are simulating.
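For anyone checking their own runs: once the full log is available, the PP-PME tuning outcome can be located with a quick search. A minimal sketch, assuming the default log name `md.log`; the `"pme grid"` pattern is an assumption, since the exact wording of the tuning lines differs between GROMACS versions:

```shell
# Show the PP-PME tuning lines (printed once load balancing settles).
# "pme grid" is an assumed search pattern; adjust to your log's wording.
grep -i -B1 -A3 "pme grid" md.log
```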
--
Szilárd
On Sun, Nov 10, 2013 at 5:28 AM, Dwey Kauffman wrote:
Hi Szilard,
Thank you very much for your suggestions.
>Actually, I was jumping to conclusions too early, as you mentioned AMD
>"cluster", I assumed you must have 12-16-core Opteron CPUs. If you
>have an 8-core (desktop?) AMD CPU, then you may not need to run more
>than one rank per GPU.
Yes, we
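As a concrete form of the advice quoted above, a single-rank launch on such a node might look like the following. This is a sketch only; the thread count assumes an 8-core CPU with one GPU, and `md_test` is a hypothetical run name:

```shell
# One thread-MPI rank driving the single GPU, all 8 cores as OpenMP threads.
mdrun -ntmpi 1 -ntomp 8 -gpu_id 0 -v -deffnm md_test
```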
On Thu, Nov 7, 2013 at 6:34 AM, James Starlight wrote:
> I've come to the conclusion that simulations with 1 or 2 GPUs
> simultaneously give me the same performance:
> mdrun -ntmpi 2 -ntomp 6 -gpu_id 01 -v -deffnm md_CaM_test,
>
> mdrun -ntmpi 2 -ntomp 6 -gpu_id 0 -v -deffnm md_CaM_test,
>
> Could it be due to the small number of CPU cores or additional RAM (this
> system has 32 GB
Let's not hijack James' thread as your hardware is different from his.
On Tue, Nov 5, 2013 at 11:00 PM, Dwey Kauffman wrote:
> Hi Szilard,
>
>Thanks for your suggestions. I am indeed aware of this page. On an 8-core
> AMD with 1 GPU, I am very happy about its performance. See below. My
> intention is to obtain an even better one because we have multiple nodes.
First, there is no value in ascribing problems to the hardware if the
simulation setup is not yet balanced, or not large enough to provide enough
atoms and long enough rlist to saturate the GPUs, etc. Look at the log
files and see what complaints mdrun makes about things like PME load
balance, and
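A quick way to surface those complaints without reading the whole file is a pattern search over the log. A sketch, assuming the default log name `md.log`; the exact NOTE wording varies across GROMACS versions, so the patterns may need adjusting:

```shell
# List mdrun's NOTE lines and load-imbalance figures from the log.
grep -i -E '^NOTE|imbalance' md.log
```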
Hi Dwey,
On 05/11/13 22:00, Dwey Kauffman wrote:
Hi Szilard,
Thanks for your suggestions. I am indeed aware of this page. On an 8-core
AMD with 1 GPU, I am very happy about its performance. See below. My
intention is to obtain an even better one because we have multiple nodes.
### 8 core AMD with 1 GPU,
Force evaluation time GPU/CPU: 4.006 ms
Hi Dwey,
First and foremost, make sure to read the
http://www.gromacs.org/Documentation/Acceleration_and_parallelization
page, in particular the "Multiple MPI ranks per GPU" section which
applies in your case.
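For reference, with multiple thread-MPI ranks per GPU the `-gpu_id` string lists one device id per PP rank, so repeating an id shares that GPU between ranks. A sketch only; the rank/thread counts assume an 8-core node with two GPUs, and `md_test` is a hypothetical run name:

```shell
# Four thread-MPI ranks, two per GPU: ranks 0,1 use GPU 0; ranks 2,3 use GPU 1.
mdrun -ntmpi 4 -ntomp 2 -gpu_id 0011 -v -deffnm md_test
```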
Secondly, please do post log files (pastebin is your friend), the
performance table at
Hi Mike,
I have a similar configuration, except on a cluster of AMD-based Linux
nodes, each with 2 GPU cards.
Your suggestion works. However, the performance with 2 GPUs discourages
me because, for example, with 1 GPU our compute node can easily
reach a simulation rate of 31 ns/day for a protein of