Fundamentally, the Xeon Phi programming model is not that different from OpenCL or CUDA: you send data to the coprocessor card, run some code there, and pull the result back to the host CPU. It doesn't speed up anything that isn't specifically targeted at the coprocessor.
If you want to use it, you first need a problem that is sufficiently parallelizable. You then write the Xeon Phi code in C/C++, compile it with the special compiler, wrap it in a shared library, and load that into Cython/Python. The Intel MKL essentially does this already, so if we get around to implementing the proposal that I wrote earlier, then at least linear algebra would be sped up on Stampede.

--
You received this message because you are subscribed to the Google Groups "sage-devel" group.
To post to this group, send email to sage-devel@googlegroups.com.
To unsubscribe from this group, send email to sage-devel+unsubscr...@googlegroups.com.
Visit this group at http://groups.google.com/group/sage-devel?hl=en.