On Fri, Jul 20, 2012 at 2:16 AM, Maciek Wójcikowski
<mac...@wojcikowski.pl> wrote:
> Hi,
> I'm doing some scripting in Python to minimize ligands and want to run
> them concurrently on an SMP server (many processes, one core per process).
> The problem I hit is that when I run one instance, the process takes 100% CPU
> and runs for ~25 min. For the same set, if I run 32 concurrent processes, they
> run for 60-130 min. Why is that? The server has 32 physical cores (AMD
> Opteron), so there is no turbo mode, like on Intel, to produce such a power
> increase. The script is not doing any I/O (at least none that I know of; it
> gets data from MySQL, doing only SELECTs).
>
> How can I debug this issue? Any thoughts?
>
There are many reasons why this might happen. You should do an experiment
starting with two processes, then four, eight, etc., and see where it
starts to slow down.
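A minimal sketch of that experiment in Python, under stated assumptions: WORKLOAD is a hypothetical stand-in for one minimization run (pure CPU work, no database access), and the names time_concurrent and scaling_experiment are made up for illustration. Pass the real command as cmd to measure the actual script.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for one ligand-minimization run: pure CPU work.
# To test the real script, pass its command line as `cmd` instead.
WORKLOAD = "sum((i % 7) * 0.5 for i in range(200000))"


def time_concurrent(n_procs, cmd=None):
    """Start n_procs copies of cmd at once; return wall-clock seconds until all exit."""
    if cmd is None:
        cmd = [sys.executable, "-c", WORKLOAD]
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for _ in range(n_procs)]
    for p in procs:
        p.wait()
    return time.perf_counter() - start


def scaling_experiment(counts=(1, 2, 4, 8, 16, 32), cmd=None):
    """Return {process count: wall-clock seconds} for each count in counts."""
    return {n: time_concurrent(n, cmd) for n in counts}


if __name__ == "__main__":
    results = scaling_experiment()
    base = results[1]
    for n, secs in results.items():
        # With perfect scaling the ratio stays near 1.0 up to the core count.
        print(f"{n:3d} processes: {secs:7.2f} s ({secs / base:.2f}x the single-process time)")
```

The process count at which the ratio starts climbing is the point worth investigating.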
You say 32 CPUs, but how many CPU chips? Is this 4x8-core CPUs, 8x4-core,
or something else?
When you run each process, how much memory does it use? How much memory do
you have?
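One way to answer both questions from Python, as a rough sketch that assumes a Linux host: there, ru_maxrss is reported in KiB (other systems use different units), and the SC_PHYS_PAGES sysconf name is available. The function names are made up for illustration.

```python
import os
import resource
import subprocess
import sys


def peak_child_rss_kib(cmd):
    """Run cmd to completion and return the peak resident set size (KiB on
    Linux) of the largest child process this interpreter has waited for."""
    subprocess.run(cmd, check=True)
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


def total_ram_kib():
    """Physical memory of the machine in KiB (assumes Linux sysconf names)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") // 1024


if __name__ == "__main__":
    # Stand-in command; replace with the real minimization script.
    peak = peak_child_rss_kib([sys.executable, "-c", "x = list(range(10**6))"])
    print(f"peak RSS of one run: {peak} KiB; total RAM: {total_ram_kib()} KiB")
```

If 32 times the per-process peak approaches the total, the machine may be swapping rather than merely contending for memory bandwidth.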
My first guess is that you are running into problems with memory contention
-- processors fighting each other for memory. This can be either cache
memory (for small processes) or main memory (if the ligand minimization is
a very large process).
Modern CPUs are *much* faster than modern memory, which is why they have
multi-level caches (L1, L2, and often a shared L3; see
http://en.wikipedia.org/wiki/CPU_cache), and why multi-socket systems have a
"NUMA" memory architecture (see
http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access). When you run a
single process, it has no competition for cache memory or main memory.
When you run up to N processes, where N is the number of CPU chips
(not cores), each process can have a chip's caches to itself, but there may
be contention for main memory. When you run N processes, where N is the total
number of cores (in your case, 32), the processes have to share main memory
as well as the on-chip memory caches and data channels.
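To test whether sharing a chip is what hurts, one option is to pin a few processes to cores on different chips and compare the timing against the same number of processes packed onto one chip. The sketch below is Linux-only (os.sched_setaffinity) and rests on a loud assumption: that consecutive core numbers sit on the same chip. That is not guaranteed, so check the real layout first (for example with lscpu). The helper names are made up for illustration.

```python
import os
import subprocess


def spread_cores(n_procs, available):
    """Pick n_procs core IDs evenly spaced across the sorted available IDs.

    ASSUMPTION: neighbouring core IDs share a chip, so evenly spaced IDs
    land on different chips. Verify the topology before trusting this.
    """
    cores = sorted(available)
    if n_procs < 1 or n_procs > len(cores):
        raise ValueError("need between 1 and len(available) processes")
    step = len(cores) // n_procs
    return [cores[i * step] for i in range(n_procs)]


def launch_pinned(cmd, core):
    """Start cmd and restrict it to a single core (Linux only)."""
    proc = subprocess.Popen(cmd)
    os.sched_setaffinity(proc.pid, {core})
    return proc
```

For example, spread_cores(4, range(32)) selects cores 0, 8, 16 and 24, while range(4) would pack the same four processes onto neighbouring cores; timing both layouts shows how much the shared caches matter.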
The only time a workload will scale cleanly from 1 to 32 processes on a
32-core system is if each process is compute-bound and uses so little memory
that all 32 cores are mostly working from their own caches and rarely
have to wait for memory access.
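To put numbers on how far the reported timings are from that ideal, here is a small sketch. It assumes each of the 32 concurrent processes does the same amount of work as the single-instance run, which the original message does not state explicitly. The function names are made up for illustration.

```python
def slowdown(t_single, t_concurrent):
    """How many times longer one process takes when run alongside the others."""
    return t_concurrent / t_single


def parallel_efficiency(t_single, t_concurrent):
    """Fraction of ideal throughput achieved; 1.0 means perfect scaling."""
    return t_single / t_concurrent


if __name__ == "__main__":
    # Timings from the original message: ~25 min alone, 60-130 min with 32 at once.
    for t in (60, 130):
        print(f"{t} min: slowdown {slowdown(25, t):.1f}x, "
              f"efficiency {parallel_efficiency(25, t):.0%}")
```

Under that assumption the slowdown is roughly 2.4x to 5.2x per process, so running all 32 at once still finishes the total work sooner than running them one after another, just far short of 32 times sooner.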
If you want to read a long but fascinating article about how this affects
other projects, check this one out. It's about MySQL running very slowly
on very fast CPUs because of memory contention.
http://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
Craig
> ----
> Pozdrawiam, | Best regards,
> Maciek Wójcikowski
> mac...@wojcikowski.pl
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> OpenBabel-discuss mailing list
> OpenBabel-discuss@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
>
>