On Tue, 8 Sep 2009 11:22:41 +0200 Francesco Pietra <[email protected]> wrote:
> Chemical computations, such as of molecular dynamics, that rely on
> clusters or uma-type computers, are starting to be performed through
> GPGPU technology, that is by putting graphical boards to general
> floating point use. The first reports are of 10 to 80 times speeding
> up with respect to the best single processors, i.e., something that so
> far required big multicore machines for traditional computing. NVIDIA
> CUDA seems to be a leader in this area.
>
> As an amd64 user on traditional uma-type keyboards or clusters, may I
> ask where to get independent information as to the hardware/software
> required for GPGPU computing?
>
> thanks
> francesco pietra

There is also:

http://gpgpu.org/
http://www.khronos.org/opencl/
http://code.google.com/p/thrust/

As far as Nvidia is concerned, any GeForce 8 or newer card is supported
by CUDA. You can find the complete list at:

http://www.nvidia.com/object/cuda_learn_products.html

I've been looking into this as well, for implementing a high-performance
real-time measurement system (a high-speed camera connected via gigabit
ethernet, with real-time image processing on the GPU), and I've come to
understand three things about GPGPU which might be of interest to you
as well:

1) One problem with GPGPU is the overhead of transferring data to and
   from the GPU. The required computation must be "heavy" enough to
   make good use of the GPU's massive parallelism and to hide the
   delays of the data transfers. Otherwise, you might find that the
   GPU-based implementation is slower than the CPU-based one.

2) Closely related to (1) is the importance of proper memory management
   on the GPU. There are many papers and publications about this out
   there.

3) The impressive performance figures quoted by manufacturers (now in
   excess of 1 TFLOPS) are for single-precision floating-point math.
   If you need double precision, you should look in the "finer print",
   where you will see that double precision is typically 5-10 times
   slower than single precision, sometimes more. As an example, the
   Nvidia Tesla C1060 claims 933 GFLOPS in single precision and
   78 GFLOPS in double precision, a ratio of about 12.

Cheers,
Dimitris
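
To make points (1) and (3) a bit more concrete, here is a rough
back-of-the-envelope sketch in Python. It is only a toy model, not a
benchmark: the CPU throughput and the host-to-GPU transfer rate are
made-up illustrative values, and the function names are hypothetical.
Only the two Tesla C1060 figures come from the message above.

```python
# Back-of-the-envelope model; all hardware numbers except the two
# Tesla C1060 figures are assumptions chosen for illustration.

GPU_SP_FLOPS = 933e9   # single precision, as quoted in the message
GPU_DP_FLOPS = 78e9    # double precision, as quoted in the message
CPU_FLOPS = 10e9       # assumed CPU throughput (hypothetical)
BUS_BYTES_PER_S = 4e9  # assumed host<->GPU transfer rate (hypothetical)


def gpu_time(flop, bytes_moved, gpu_flops=GPU_SP_FLOPS):
    """Transfer time plus compute time, assuming no overlap between them."""
    return bytes_moved / BUS_BYTES_PER_S + flop / gpu_flops


def cpu_time(flop):
    """Compute time only; the data is already in host memory."""
    return flop / CPU_FLOPS


def gpu_wins(n_elements, flop_per_element, bytes_per_element=4):
    """True if the modelled GPU time beats the modelled CPU time.

    Each element is copied to the GPU and back once (hence the factor 2).
    """
    flop = n_elements * flop_per_element
    bytes_moved = 2 * n_elements * bytes_per_element
    return gpu_time(flop, bytes_moved) < cpu_time(flop)


# Same data size, different amounts of arithmetic per element.
light = gpu_wins(1_000_000, flop_per_element=1)
heavy = gpu_wins(1_000_000, flop_per_element=1000)
sp_dp_ratio = GPU_SP_FLOPS / GPU_DP_FLOPS

print("light workload, GPU faster:", light)
print("heavy workload, GPU faster:", heavy)
print("single/double precision ratio: %.1f" % sp_dp_ratio)
```

Under these assumed numbers, the workload with one operation per
element is dominated by the transfer and the CPU comes out ahead, while
the workload with a thousand operations per element favours the GPU.
Keep in mind that the quoted GFLOPS values are theoretical peaks, so
real code will usually see smaller gains than such a model suggests.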

