Hi,

On Tue, Sep 10, 2013 at 2:03 AM, Guanglei Cui <amber.mail.arch...@gmail.com> wrote:
> Hi Szilard,
>
> Thanks again for getting back. You may remember the previous thread I
> started on regression test failure with icc 11.x compiled binary. Falling

As it was not referenced, I did not recall your previous mail when I
wrote my previous reply.

> back to SSE2 is my solution, and binaries compiled this way are able to
> pass all regression tests, including the one with GPU switched on. However,
> it is not clear to me if the GPU part is specifically tested in the
> regression.

In the regression test runs, mdrun uses automated selection of CPU or
GPU - the same way as it would happen if you were doing a standalone
run. Your question reminds me that we should probably extend this
behaviour so that, when a GPU is present, the tests do not use only the
GPU Verlet scheme kernels.

Therefore, my advice is that regression tests on machines with a GPU
and a GPU-enabled build should, for now, be done in two passes:
- tests using GPU Verlet kernels: make check;
- tests using CPU Verlet kernels: CUDA_VISIBLE_DEVICES="" make check
  (or use GMX_DISABLE_GPU_DETECTION in case of detection issues).

>
> As I was trying to explain in the original email, the binary works fine on
> a node with proper graphics driver, but crashes on a node where the
> graphics driver is older than the CUDA SDK used in compilation. I think
> updating the driver may potentially enable the GPU part. Pure CPU

I understood; that is what my comment regarding the less than graceful
handling of some GPU detection cases was about. We'll improve this
behaviour in one of the upcoming versions.

> calculation with the same binary seems not working. It is not clear to me
> if this is caused by the compiler. It's not really simple to update the gcc
> to 4.7 or greater since we use CentOS 5.x in the company. Even CentOS 6.x
> uses gcc 4.4.x as default.
>
> I've just tested the code with -nb cpu. It still crashes. The binary

Have you tried setting the aforementioned environment variable,
GMX_DISABLE_GPU_DETECTION?

> compiled without GPU works as expected and passed all regression tests. For
> now, I can keep separate binaries for GPU and CPU applications before I can
> get gcc 4.7 or greater installed.

Have you built the correctly functioning mdrun without GPU support on
the same machine, with the same compiler and libraries, as your
problematic GPU-enabled builds?

While performance-wise it is far from the best choice, AFAIK gcc 4.4
should work OK - at least in our automated tests it does. Hence, the
fact that you are using gcc 4.4 should not result in a crash when
switching to a CPU-only run.

I would appreciate it if you could open an issue on
redmine.gromacs.org, describe the behaviour you are seeing, and provide
as much of the following information as possible:
- log files produced with the crashing binary;
- result of running with GMX_DISABLE_GPU_DETECTION;
- a backtrace of the crash (build with CMAKE_BUILD_TYPE=RelWithDebInfo,
  run in gdb, type "bt" after the crash occurs and provide the output);
  and/or
- the mdrun.debug output from a run with mdrun -debug 1.

With the above information we should be able to judge what is causing
the problem.

Cheers,
--
Szilárd

> Best regards,
> Guanglei
>
>
> On Mon, Sep 9, 2013 at 4:35 PM, Szilárd Páll <szilard.p...@cbr.su.se> wrote:
>
>> Hi,
>>
>> First of all, icc 11 is not well tested and there have been reports
>> about it compiling broken code. This could explain the crash, but
>> you'd need to do a bit more testing to confirm. Regarding the GPU
>> detection error: if you use a driver which is incompatible with the
>> CUDA runtime (the driver must support at least as high an API version,
>> see the mdrun log header's last two lines), some such cases are, at
>> the moment, not detected particularly gracefully.
>>
>> A few things to try:
>> - use gcc, 4.7 is as fast or faster than any icc;
>> - run with the "-nb cpu" option; does it still crash?
>> - run with GPU detection completely disabled*
>> - run the regressiontests; try using CPUs only*
>>
>> *You can set the GMX_DISABLE_GPU_DETECTION environment variable to
>> completely disable the GPU detection.
>>
>> Cheers,
>> --
>> Szilárd
>>
>>
>> On Mon, Sep 9, 2013 at 9:52 PM, Guanglei Cui
>> <amber.mail.arch...@gmail.com> wrote:
>> > Dear GMX users,
>> >
>> > I recently compiled Gromacs 4.6.3 with CUDA (Intel compiler 11.x, SSE2,
>> > and CUDA SDK 5.0.35). I was doing a test run with simply 'mdrun -deffnm
>> > eq2_npt_verlet' (letting mdrun figure out what to use). I received the
>> > error telling me my graphics driver was older than the CUDA SDK, and
>> > regular CPU code would be used instead. Then, it crashed with
>> > Segmentation Fault. The code runs properly on another node where the
>> > graphics driver is more up to date. I wonder if the crashing is somewhat
>> > expected, and therefore I should prepare different binaries based on the
>> > capabilities of different nodes. Thanks.
>> >
>> > Best regards,
>> > --
>> > Guanglei Cui
>> > --
>> > gmx-users mailing list gmx-users@gromacs.org
>> > http://lists.gromacs.org/mailman/listinfo/gmx-users
>> > * Please search the archive at
>> > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
>> > * Please don't post (un)subscribe requests to the list. Use the
>> > www interface or send it to gmx-users-requ...@gromacs.org.
>> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> --
> Guanglei Cui
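
For reference, the two-pass regression testing suggested in this thread
can be sketched as a small shell helper. This is only an illustration
under stated assumptions: the function name run_two_pass and the
CHECK_CMD variable are hypothetical and exist only for this example,
while "make check" and CUDA_VISIBLE_DEVICES="" are the commands quoted
in the thread. It assumes the function is called from a GROMACS build
directory where the regression tests are configured.

```shell
#!/bin/sh
# Sketch only: run_two_pass and CHECK_CMD are hypothetical names,
# not part of GROMACS. CHECK_CMD defaults to "make check", the
# command quoted in the thread.

run_two_pass() {
    check_cmd=${CHECK_CMD:-make check}

    # Pass 1: GPUs left visible, so mdrun's automatic selection
    # is expected to pick the GPU Verlet kernels.
    echo "== pass 1: GPU Verlet kernels =="
    $check_cmd || return 1

    # Pass 2: hide all CUDA devices for this command only, so the
    # CPU Verlet kernels are exercised instead.
    echo "== pass 2: CPU Verlet kernels =="
    CUDA_VISIBLE_DEVICES="" $check_cmd || return 1
}
```

After defining the function in the shell, call run_two_pass; it stops
at the first failing pass. If GPU detection itself misbehaves, the
thread suggests setting GMX_DISABLE_GPU_DETECTION for the CPU pass
instead of hiding the devices.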