My first suggestion would be to keep the rendering in Python, if at all feasible, and do only the actual simulation/computation in C. Rasterizing a heightfield and rigid body plus some plash effects is nothing that couldnt be done in PyOpenGL, or even something higher- level like visvis or mayavi. (visvis would be my first suggestion)
I would run the simulation in a python subprocess that calls into the C dll; that should give you at least one core fully focussed on your computations, without competition/blocking from any other modules. Marshalling the simulation data between subprocesses should be a small performance hurdle relative to getting it on your GPU; either way, the python subprocessing module makes it entirely painless. And I would start with an implementation in numpy; probably, you will be doing some fft's and/or boundary element stuff. This can all be done fairly efficiently in numpy; at minimum it will give you a reference implementation, and its not unlikely it will work well enough that you dont even want to bother with the C implementation anymore. -- http://mail.python.org/mailman/listinfo/python-list