Thoughts on CUDA + Clojure

Timothy Baldridge Thu, 08 Sep 2011 11:50:42 -0700

I've been kicking around an idea for some time of starting a
Clojure->CUDA compiler. I would like to start a discussion about this
to figure out what some possible solutions are. First of all, let me
start with a simple fact list:


CUDA (for those who don't know) is NVIDIA's technology for writing
general-purpose code for modern GPUs. The current system uses a subset
of C++ as its input. The code looks like small functions/classes that
are executed for each thread of the GPU. These threads can number in
the thousands, and the GPU commonly executes hundreds of these at one
time. So, basically, we're talking about running pmap on a system with
512+ cores.

CUDA 4.0 supports some very advanced C++ features. As of 4.0 CUDA
supports virtual functions, and new/delete....yes...your GPU code can
allocate memory on the fly (if you have a GeForce 4xx or greater).

My idea is to make a subset of Clojure translatable to CUDA. So you
would create input data in native memory, then the Clojure functions
would be translated to CUDA C++, then to CUDA binaries, which would be
executed on the GPU.

A very simple approach would be to take the view that many Clojure->SQL
frameworks do, and simply do a translation. In this method all CUDA
Clojure functions would take only arrays and scalar values as inputs,
and the functions would read data from arrays, and output them to
arrays. No sequences, on-the-fly allocation, or any such thing would
be allowed.  On top of that, all input and output data must be of the
same type, so no mixing doubles and floats, or ints and longs. All
data must be resolved to statically defined types, and mutating a
variable's type on the fly is not allowed.
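
To make the simple approach a bit more concrete, here is a rough
sketch of what the translator's output might look like for a Clojure
function such as (fn [a x y] (+ (* a x) y)) applied element-wise to
two float arrays. The names saxpy_element and run_grid are made up for
illustration, and this is plain C++ so that it runs without a GPU: the
loop in run_grid only stands in for the thread grid. In real CUDA C++
the per-element function would be a __global__ kernel and the index
would be computed from blockIdx, blockDim and threadIdx.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical translator output for a Clojure fn like
//   (fn [a x y] (+ (* a x) y))
// Inputs are one scalar and two arrays of the same static type (float);
// the result is written to a preallocated output array.
// No sequences and no allocation inside the per-element function.
inline void saxpy_element(std::size_t i, float a,
                          const float* x, const float* y, float* out) {
    out[i] = a * x[i] + y[i];
}

// Stand-in for the GPU launch (an assumption made so this runs on a CPU):
// on a real device each index i would be handled by its own thread
// instead of by one loop iteration.
inline std::vector<float> run_grid(float a, const std::vector<float>& x,
                                   const std::vector<float>& y) {
    assert(x.size() == y.size());  // both inputs must have the same length
    std::vector<float> out(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        saxpy_element(i, a, x.data(), y.data(), out.data());
    }
    return out;
}
```

Calling run_grid(2.0f, x, y) plays the role of launching the kernel
over the whole array. Everything the function touches is either a
scalar or a flat array of one fixed type, which is the restriction
described above.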

The more complex approach would be to use something like ClojureScript
to compile core.clj to CUDA, and actually run a subset of Clojure on
the GPU. In this case we would have to come up with a simple type
system, and then rewrite the ClojureScript compiler to output C++ code
instead of JS. In addition, some sort of simple GC (reference
counting?) would have to be developed.  The result would be slower
than my first approach, but would be much more flexible.
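
As a very rough idea of what the reference-counting part could look
like, here is a minimal sketch of an immutable cons cell with an
intrusive reference count, built on new/delete. All of the names
(Cell, cons, retain, release, live_cells) are invented for this
example, and it is plain single-threaded C++. On a GPU the count
updates would presumably need atomic operations, which this sketch
leaves out.

```cpp
#include <cassert>

// Hypothetical heap object for a GPU-side Clojure runtime: an immutable
// cons cell of ints with an intrusive reference count.
struct Cell {
    int   refs;  // number of live references to this cell
    int   head;
    Cell* tail;  // owned reference to the rest of the list, may be nullptr
};

static int live_cells = 0;  // bookkeeping for the example only

// Take one more reference to c (no-op for nullptr).
inline Cell* retain(Cell* c) {
    if (c) ++c->refs;
    return c;
}

// Drop one reference. When a count reaches zero the cell is freed and the
// reference it held on its tail is dropped too (iteratively, so long
// lists do not recurse deeply).
inline void release(Cell* c) {
    while (c) {
        assert(c->refs > 0);
        if (--c->refs > 0) return;
        Cell* next = c->tail;
        delete c;
        --live_cells;
        c = next;
    }
}

// cons takes its own reference to tail; the caller receives a cell
// with refs == 1 and is responsible for releasing it.
inline Cell* cons(int head, Cell* tail) {
    ++live_cells;
    return new Cell{1, head, retain(tail)};
}
```

Because cells are immutable, a cell can never point back at itself or
at something that points to it, so plain counting is enough here and
no cycle detection is needed.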

----

So in the first version we have a simple-to-create system, but we
can't use many of the familiar Clojure functions in CUDA code.

In the second method, we have a slower, but much more powerful system
that would integrate much more tightly with existing code.

----

Any thoughts? Besides that I'm crazy...

Timothy

