On Fri, Feb 27, 2009 at 11:21 AM, Doug Cutting <[email protected]> wrote:
> I think they're complementary.
>
> Hadoop's MapReduce lets you run computations on up to thousands of computers
> potentially processing petabytes of data. It gets data from the grid to
> your computation, reliably stores output back to the grid, and supports
> grid-global computations (e.g., sorting).
>
> CUDA can make computations on a single computer run faster by using its GPU.
> It does not handle co-ordination of multiple computers, e.g., the flow of
> data in and out of a distributed filesystem, distributed reliability, global
> computations, etc.
>
> So you might use CUDA within mapreduce to more efficiently run
> compute-intensive tasks over petabytes of data.
>
> Doug

I actually did some work with this several months ago, using a
consumer-level NVIDIA card. I found a couple of interesting things:

- I used JOGL and OpenGL shaders (GLSL) rather than CUDA, because at the
  time there was no reasonable way to talk to CUDA from Java. That made a
  number of things more complicated. GLSL was fine for the particular
  problem I was working on, though CUDA would have simplified things.
- The problem involved creating and searching large numbers of hashes,
  3-4 TB of them at a time.
- Only 2 of the nodes in my 8-node cluster had accelerators, but they had
  a dramatic effect on performance. I do not have my test results handy,
  but for this particular problem the accelerators cut the job time in
  half or more.

I agree with Doug that the two are complementary, though there are some
similarities. Working with the GPU means you are limited by how much
texture memory is available for storage (compared to HDFS, not much!), and
the cost of getting data on and off the card can be high. As with many
Hadoop jobs, the overhead of getting data in and starting a task can
easily be greater than the running time of the task itself. For what I was
doing it was a good fit, but for many, many problems it would not be the
right solution.

>
> Mark Kerzner wrote:
>>
>> Hi, this from Dr. Dobbs caught my attention, 240 CPU for $1,700
>>
>> http://www.ddj.com/focal/NVIDIA-CUDA
>>
>> What are your thoughts?
>>
>> Thank you,
>> Mark
>>
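
A rough way to reason about the overhead point in the reply above (whether getting data on and off the card costs more than the GPU saves) is a back-of-the-envelope estimate like the sketch below. This is only an illustration: the class and method names are hypothetical, the model assumes the results copied back are the same size as the input, and every number in main() is a made-up placeholder, not a measurement from the cluster described above.

```java
// Back-of-the-envelope check: does shipping one task to the GPU pay off?
// Hypothetical helper; all figures below are placeholders, not measurements.
final class OffloadEstimate {

    /** Estimated seconds for one task on the GPU, including copies on and off the card. */
    static double gpuSeconds(double bytes, double busBytesPerSec,
                             double setupSeconds, double gpuComputeSeconds) {
        // Simplifying assumption: output is the same size as the input,
        // so the data crosses the bus twice.
        double transfer = 2.0 * bytes / busBytesPerSec;
        return setupSeconds + transfer + gpuComputeSeconds;
    }

    /** True when the GPU path, overhead included, beats the CPU-only path. */
    static boolean worthOffloading(double cpuComputeSeconds, double gpuTotalSeconds) {
        return gpuTotalSeconds < cpuComputeSeconds;
    }

    public static void main(String[] args) {
        double bytes = 64.0 * 1024 * 1024; // assumed: one 64 MB input split
        double bus = 1.0e9;                // assumed: 1 GB/s effective transfer rate
        double gpu = gpuSeconds(bytes, bus, 0.5, 1.0); // assumed setup and kernel times
        System.out.println("GPU total seconds: " + gpu);
        System.out.println("Beats a 10 s CPU task: " + worthOffloading(10.0, gpu));
        System.out.println("Beats a 1 s CPU task: " + worthOffloading(1.0, gpu));
    }
}
```

With fixed setup and transfer costs, short tasks lose and long compute-heavy tasks win, which matches the observation that the approach fits some problems well and many others not at all.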
