On 26/02/15 18:34, John Ladasky wrote:
> Hi Sturla, I recognize your name from the scikit-learn mailing list.
> If you look a few posts above yours in this thread, I am aware of gpu-libsvm.
> I don't know if I'm up to the task of reusing the scikit-learn wrapping code,
> but I am giving that option some serious thought. It isn't clear to me that
> gpu-libsvm can handle both SVM and SVR, and I have need of both algorithms.
> My training data sets are around 5000 vectors long. IF that graph on the
> gpu-libsvm web page is any indication of what I can expect from my own data (I
> note that they didn't specify the GPU card they're using), I might realize a
> 20x increase in speed.
A GPU is a "floating point monster", not a CPU. It is not designed to
run things like CPython. It is also designed to run only threads in
parallel on its cores, not processes. And as you know, in Python there
is something called the GIL. Further, the GPU has hard-wired, fine-grained
load scheduling for data-parallel problems (e.g. matrix multiplication
for vertex processing in 3D graphics). A thread on a GPU is not
comparable to a thread on a CPU; it is more like a work item on a
parallel queue, with the kind of abstraction you find in Apple's GCD.
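To make that concrete, here is a minimal sketch of the kind of workload a
GPU is built for: one tiny floating-point operation per element, fanned out
over millions of elements by the hardware scheduler. (This assumes CuPy and
an NVIDIA card; the kernel is just an illustration, nothing to do with
libSVM.)

import cupy as cp

# One tiny operation per element; the GPU's hardware scheduler spreads
# the elements over thousands of lightweight threads.
saxpy = cp.ElementwiseKernel(
    'float32 a, float32 x, float32 y',   # inputs
    'float32 z',                         # output
    'z = a * x + y',                     # per-element work
    'saxpy')

x = cp.random.random(10_000_000).astype(cp.float32)
y = cp.random.random(10_000_000).astype(cp.float32)
z = saxpy(cp.float32(2.0), x, y)         # millions of items, trivial work each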
I don't think it is really doable to make something like CPython run with
thousands of parallel instances on a GPU. A GPU is not designed for
that. A GPU is great if you can pass millions of floating-point vectors
as items to the work queue, with a tiny amount of computation per item.
It would be crippled if you passed it a thousand CPython interpreters and
expected them to do a lot of work.
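If what you want is many Python workers running in parallel, the place to
do that is the CPU, with separate processes rather than threads (which also
sidesteps the GIL). A rough sketch using only the standard library, with a
made-up worker function standing in for the real work:

from multiprocessing import Pool

def heavy_python_work(chunk):
    # stand-in for whatever each CPython worker would actually do
    return sum(v * v for v in chunk)

if __name__ == '__main__':
    chunks = [list(range(i, i + 1000)) for i in range(0, 100_000, 1000)]
    with Pool() as pool:                 # one process per core, each with its own GIL
        results = pool.map(heavy_python_work, chunks)
    print(sum(results))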
Also, since it is libSVM that does the math in your case, you need to get
libSVM to run on the GPU, not CPython.
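For what it is worth, scikit-learn's SVC and SVR are wrappers around
libSVM, so nearly all of the fit time on a 5000-vector problem is already
spent in compiled libSVM code rather than in CPython. A quick sketch with
synthetic data (the shapes are just guesses at your problem):

import time
import numpy as np
from sklearn.svm import SVC, SVR

rng = np.random.RandomState(0)
X = rng.rand(5000, 20)                    # roughly the size you mention
y_class = (X[:, 0] > 0.5).astype(int)     # synthetic labels
y_reg = X @ rng.rand(20)                  # synthetic regression target

for est, y in ((SVC(), y_class), (SVR(), y_reg)):
    t0 = time.time()
    est.fit(X, y)                         # the heavy lifting happens inside libSVM (C/C++)
    print(type(est).__name__, time.time() - t0)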
In most cases the best hardware for parallel scientific computing
(taking economy and flexibility into account) is a Linux cluster which
supports MPI. You can then use mpi4py or Cython to call MPI from your
Python code.
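A minimal mpi4py sketch of that pattern, with a made-up grid of C values
and toy data: each rank trains SVR on its own slice of the grid and rank 0
gathers the results (run with e.g. mpiexec -n 8 python script.py).

from mpi4py import MPI
import numpy as np
from sklearn.svm import SVR

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

grid = np.logspace(-2, 3, 24)             # hypothetical grid of C values
my_params = grid[rank::size]              # each rank takes its own slice

rng = np.random.RandomState(0)            # same toy data on every rank
X = rng.rand(5000, 20)
y = X @ rng.rand(20)

my_scores = [(C, SVR(C=C).fit(X, y).score(X, y)) for C in my_params]

all_scores = comm.gather(my_scores, root=0)   # collect everything on rank 0
if rank == 0:
    print(max(sum(all_scores, []), key=lambda t: t[1]))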
Sturla