Hi Chris,

Your reply makes me suspect that you may not really be rendering on your Tesla yet. LIBGL_ALWAYS_INDIRECT should not be needed, and as far as I know it is a Mesa-specific variable. Even in an interactive batch job you shouldn't use X11 forwarding for ParaView (I couldn't tell whether you are using it, but I wanted to be clear about that because it doesn't scale well). If I were you I'd verify that you're really using your Tesla by examining the output of glxinfo.
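
As a rough illustration of that glxinfo check, here is a minimal sketch. The function name check_glx_renderer and its three result words are made up for this example. It assumes the usual glxinfo lines "direct rendering: Yes" and "OpenGL vendor string: NVIDIA Corporation" show up when the NVIDIA driver is really in use. It reads the glxinfo text on stdin, so it can also be run against saved output.

```shell
# Sketch only: classify glxinfo output read from stdin.
# check_glx_renderer is a hypothetical helper, not part of ParaView or glxinfo.
check_glx_renderer() {
    info=$(cat)

    # Indirect rendering sends GL commands through the X protocol,
    # which is what happens with ssh X11 forwarding.
    case "$info" in
        *"direct rendering: Yes"*) ;;
        *) echo "indirect"; return 1 ;;
    esac

    # A Mesa vendor string here would mean the NVIDIA driver is not in use.
    case "$info" in
        *"OpenGL vendor string: NVIDIA"*) ;;
        *) echo "not-nvidia"; return 1 ;;
    esac

    echo "nvidia-direct"
}

# Example use on a compute node, with DISPLAY already pointing at the local X server:
#   glxinfo | check_glx_renderer
```

Anything other than "nvidia-direct" would suggest the job is not rendering on the GPU.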

Burlen

On 10/28/2014 02:52 AM, R C Bording wrote:
Hi Burlen,
Yes, I am, for the purpose of testing on our debug queue. But you are bang on about setting the DISPLAY environment variable.

So setting, in the ParaView module,
setenv LIBGL_ALWAYS_INDIRECT 1
or, in bash in the job script,
export LIBGL_ALWAYS_INDIRECT=1

but also adding
export DISPLAY=:1
is needed to render on the GPU.

With that, the parallelSphere.py example renders with no errors across multiple nodes.

My mpirun command looks like this:

mpirun pvbatch parallelSphere.py

Note that we have PBS Pro installed, so it determines the -np value (the number of processors/cores) from the
#PBS -l select=.....

So now to see if I can render something awesome!

Chris B

On 27/10/2014, at 11:10 PM, Burlen Loring wrote:

Hi Christopher,

Are you by any chance logged in with ssh X11 forwarding (ssh -X ...)? It seems the error you report comes up often in that context. X forwarding would not be the right way to run PV on your cluster.

Depending on how your cluster is set up, you may need to start the X server before launching PV, and make sure to close it after PV exits. In that scenario your xorg.conf would specify the nvidia driver and a screen for each GPU, which you would reference through the DISPLAY variable in the shell used to start PV. If you already have X11 running and screens configured, then it's just a matter of setting the DISPLAY variable correctly. When there are multiple GPUs per node, you'd need to set the display using the MPI rank modulo the number of GPUs per node.

I'm not sure it matters that much, but I don't think that you want the --use-offscreen-rendering option.
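
The rank-modulo-GPUs rule could be sketched roughly as below. This is only an illustration under assumptions that the thread does not confirm: one X server per node on display :0, with screen N driving GPU N. The helper name display_for_rank and the GPUS_PER_NODE variable are made up for the example.

```shell
# Sketch only: map an MPI rank to an X screen, one screen per GPU.
# Assumes X display :0 on every node, with screens 0..(gpus_per_node - 1).
display_for_rank() {
    rank=$1
    gpus_per_node=$2
    echo ":0.$(( rank % gpus_per_node ))"
}

# Hypothetical wrapper usage, launched by mpirun in place of pvbatch.
# OMPI_COMM_WORLD_LOCAL_RANK is set by Open MPI; other MPI
# implementations use different variable names.
#   export DISPLAY=$(display_for_rank "$OMPI_COMM_WORLD_LOCAL_RANK" "$GPUS_PER_NODE")
#   exec pvbatch "$@"
```

The sketch uses the node-local rank because it gives the same mapping however ranks are placed across nodes. The global rank gives the same result when ranks are assigned to nodes in contiguous blocks.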

Burlen

On 10/26/2014 10:23 PM, R C Bording wrote:
Hi,
I managed to get a "working version" of ParaView-4.2.0.1 on our GPU cluster, but when I try to run the parallelSphere.py script on more than one node it just hangs. It works as it is supposed to on up to 12 cores on a single node. I am still trying to work out whether I am running on the GPU (Tesla C2070).

Here is the list of cmake configurations:

IBS_TOOL_CONFIGURE='-DCMAKE_BUILD_TYPE=Release \
-DParaView_FROM_GIT=OFF \
-DParaView_URL=$MYGROUP/vis/src/ParaView-v4.2.0-source.tar.gz \
-DENABLE_boost=ON \
-DENABLE_cgns=OFF \
-DENABLE_ffmpeg=ON \
-DENABLE_fontconfig=ON \
-DENABLE_freetype=ON \
-DENABLE_hdf5=ON \
-DENABLE_libxml2=ON \
-DENABLE_matplotlib=ON \
-DENABLE_mesa=OFF \
-DENABLE_mpi=ON \
-DENABLE_numpy=ON \
-DENABLE_osmesa=OFF \
-DENABLE_paraview=ON \
-DENABLE_png=ON \
-DENABLE_python=ON \
-DENABLE_qhull=ON \
-DENABLE_qt=ON \
-DENABLE_silo=ON \
-DENABLE_szip=ON \
-DENABLE_visitbridge=ON \
-DMPI_CXX_LIBRARIES:STRING="$MPI_HOME/lib/libmpi_cxx.so" \
-DMPI_C_LIBRARIES:STRING="$MPI_HOME/lib/libmpi.so" \
-DMPI_LIBRARY:FILEPATH="$MPI_HOME/lib/libmpi_cxx.so" \
-DMPI_CXX_INCLUDE_PATH:STRING="$MPI_HOME/include" \
-DMPI_C_INCLUDE_PATH:STRING="$MPI_HOME/include" \
-DUSE_SYSTEM_mpi=ON \
-DUSE_SYSTEM_python=OFF \
-DUSE_SYSTEM_qt=OFF \
-DUSE_SYSTEM_zlib=OFF '

The goal is to be able to support batch rendering on the whole cluster (~96 nodes).

Also, do I need to set another environment variable in my ParaView module to make the Xlib warning go away?

[cbording@f100 Paraview]$ mpirun -n 12 pvbatch --use-offscreen-rendering parallelSphere.py
Xlib:  extension "NV-GLX" missing on display "localhost:50.0".
Xlib:  extension "NV-GLX" missing on display "localhost:50.0".
Xlib:  extension "NV-GLX" missing on display "localhost:50.0".
Xlib:  extension "NV-GLX" missing on display "localhost:50.0".
Xlib:  extension "NV-GLX" missing on display "localhost:50.0".
Xlib:  extension "NV-GLX" missing on display "localhost:50.0".

Is this related to my not being able to run across multiple nodes?

R. Christopher Bording
Supercomputing Team-iVEC@UWA
E: [email protected]
T: +61 8 6488 6905

26 Dick Perry Avenue,
Technology Park
Kensington, Western Australia.
6151

_______________________________________________
Powered by www.kitware.com

Visit other Kitware open-source projects at
http://www.kitware.com/opensource/opensource.html

Please keep messages on-topic and check the ParaView Wiki at:
http://paraview.org/Wiki/ParaView

Follow this link to subscribe/unsubscribe:
http://public.kitware.com/mailman/listinfo/paraview


