On 7 Sep, 00:06, cnb <[EMAIL PROTECTED]> wrote:

> If I buy a multicore computer and I have really intensive program. How
> would that be distributed across the cores?
Distribution of processes and threads across processors (cores or CPUs) is managed by the operating system.

> Will algorithms always have to be programmed and told specifically to
> run on several cores so if not told it will only utilize one core?

A particular program has to be written for concurrency to utilize multiple cores. But you typically have more than one program running at a time.

> So is the free lunch really over or is this just an overhyped
> phenomena?

Two slow cores are better than one fast core for most purposes. For one thing, it saves power, which is good for batteries and the environment alike.

> Is threading with Python hard?

It's no harder than with other systems. You just subclass threading.Thread, which has almost the same interface as Java threads. Threading with Python is perhaps a bit easier than on other common platforms, due to the Queue.Queue object and the lack of volatile objects.

> Can you start several processes with Python or just threads?

You can do both. However, remember that Python threads only do what threads were designed to do back in the 1990s: asynchrony for I/O and UIs, not concurrency on multiple processors for CPU-bound computing. This is due to the "Global Interpreter Lock". The GIL is better than fine-grained locks for single-threaded code and for concurrency with multiple processes, but it prevents Python threads from being used for concurrency (which is just as well). You can do concurrency with Java threads or Win32 threads, but that is merely a side-effect of their design.

You will often see claims from novice programmers that threads are the only route to concurrency on multi-core CPUs. Besides ignoring the existence of processes, direct use of threads from the Java, .NET, POSIX, or Win32 APIs is not even the preferred way of programming for concurrency. Tinkering with low-level threading APIs for concurrency is error-prone and inefficient.
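To make the threading.Thread and Queue.Queue remark concrete, here is a minimal producer/consumer sketch. (Written in Python 3 syntax, where Py2's Queue.Queue has been renamed queue.Queue; the Worker class and queue names are my own invention, not from the post.)

```python
import threading
import queue  # Python 2's Queue.Queue became queue.Queue in Python 3

class Worker(threading.Thread):
    """Pull items from a task queue, process them, push results."""
    def __init__(self, tasks, results):
        threading.Thread.__init__(self)
        self.tasks = tasks
        self.results = results

    def run(self):
        while True:
            item = self.tasks.get()
            if item is None:      # sentinel: no more work
                break
            self.results.put(item * item)

tasks, results = queue.Queue(), queue.Queue()
workers = [Worker(tasks, results) for _ in range(4)]
for w in workers:
    w.start()
for n in range(10):
    tasks.put(n)
for _ in workers:
    tasks.put(None)               # one sentinel per worker
for w in workers:
    w.join()

out = sorted(results.get() for _ in range(10))
print(out)
```

Because of the GIL, a pattern like this buys you asynchrony (overlapping I/O, responsive UIs), not parallel use of several cores for CPU-bound work.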
You will spend a lot of time cleansing your code of deadlocks, livelocks, volatile objects not being declared volatile, and race conditions. In addition, chances are your code will not perform or scale very well, due to memory contention, cache line misses, inefficient use of registers caused by volatile objects, etc. The list is endless. That is why Java 6 and .NET 3.5 provide other abstractions for multi-core concurrency, such as ForkJoin and Parallel.For. It is also the rationale for using an OpenMP-enabled compiler for C or Fortran, auto-vectorizing C or Fortran compilers, and novel languages like Cilk and Erlang.

Traditionally, concurrency on parallel computers has been achieved using tools like BSPlib, MPI, vectorizing Fortran compilers, and even "embarrassingly parallel" setups (running multiple instances of the same program on different data). OpenMP is a recent addition to the concurrency toolset for SMP-type parallel computers (to which multi-core x86 processors belong).

If you really need concurrency with Python, look into MPI (PyMPI, PyPAR, mpi4py), Python/BSP, the subprocess module, os.fork (excluding Windows), the pyprocessing package, or Parallel Python. BSP is probably the least error-prone paradigm for multi-core concurrency, albeit not the most efficient.

If you decide to move an identified bottleneck from Python to C or Fortran, you also have the option of using OpenMP or Cilk to ease the work of programming for concurrency. This is my preferred way of dealing with bad bottlenecks in numerical computing. Remember that you need not learn the overly complex Python C API. Cython, ctypes, f2py, or scipy.weave will do just as well. This approach will require you to manually release the GIL, which can be done in several ways:

- In C extensions, between the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros.
- When calling DLL methods using ctypes.cdll or ctypes.windll (not ctypes.pydll).
- In a "with nogil:" block in a Cython/Pyrex extension.
- With f2py or SWIG, although I have not looked at the details. (I don't use them.)

Other things to consider:

- Programs that run fast enough run fast enough, even if they only utilize one core. To quote Donald Knuth (crediting C.A.R. Hoare), "premature optimization is the root of all evil" in programming.

- Psyco, a Python JIT compiler, will often speed up algorithmic code. Using Psyco requires little change to your code. Try it and see if your programs run fast enough afterwards. YouTube is rumoured to use Psyco to speed up their Python backend.

- Always use NumPy or SciPy if you do numerical work. They make numerical code easier to program, and the code also runs a lot faster than a pure Python equivalent.

- Sometimes Python is faster than your hand-written C. This is particularly the case for Python code that makes heavy use of built-in primitives and objects from the standard library. You will spend a lot of time tuning a linked list or dynamic array to match the performance of a Python list. Chances are you'll never come up with a sort as fast as Python's timsort, and you'll probably never make a hash table of your own that can compete with Python's dictionaries and sets. Even if you can, the benefit will be minute and certainly not worth the effort.

- You will get tremendous speedups (often 200x over pure Python) if you can move a computational bottleneck to C, C++, Fortran, Cython, or a third-party library (FFTW, LAPACK, Intel MKL, etc.).

- Portions of your Python code that do not constitute important bottlenecks can just be left in Python. You will not gain anything substantial from migrating these parts to C, as other parts of your code dominate. Use a profiler to identify computational bottlenecks. It will save you a lot of grief fiddling with premature optimizations.

That's my fifty cents on Python coding for speed.

--
http://mail.python.org/mailman/listinfo/python-list
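PS: The profiler advice above can be sketched with the standard-library cProfile and pstats modules; slow_sum below is just a made-up stand-in for a real bottleneck (Python 3 syntax).

```python
import cProfile
import pstats
import io

def slow_sum(n):
    """Deliberately naive loop, a stand-in for a real bottleneck."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def driver():
    for _ in range(50):
        slow_sum(10000)

profiler = cProfile.Profile()
profiler.enable()
driver()
profiler.disable()

# Sort by cumulative time and show the top five entries; the
# functions dominating this report are your optimization targets.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
print(stream.getvalue())
```

Only the names at the top of such a report are worth moving to C, Fortran, or Cython; everything else can stay in Python.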