Thanks very much, Szilárd. Our IT department just found out that we purchased the latest Intel compiler, but apparently it was never installed. Now I can check whether this happens with the new compiler. I may or may not follow up with a bug report if I can't reproduce the behavior.
Regards,

On Sun, Sep 15, 2013 at 1:39 PM, Szilárd Páll <szilard.p...@cbr.su.se> wrote:
> Hi,
>
> On Tue, Sep 10, 2013 at 2:03 AM, Guanglei Cui
> <amber.mail.arch...@gmail.com> wrote:
> > Hi Szilard,
> >
> > Thanks again for getting back. You may remember the previous thread I
> > started on regression test failure with icc 11.x compiled binary. Falling
>
> As it was not referenced, I did not recall your previous mail at the
> time of writing my above reply.
>
> > back to SSE2 is my solution, and binaries compiled this way are able to
> > pass all regression tests, including the one with GPU switched on. However,
> > it is not clear to me if the GPU part is specifically tested in the
> > regression.
>
> In the regression test runs, mdrun uses automated selection of CPU or
> GPU - the same way as it would happen if you were doing a standalone
> run. Your question reminds me that we should probably extend this
> behaviour so that when a GPU is present not only the GPU Verlet scheme
> kernels will be used in the testing.
>
> Therefore, my advice is that regression tests on machines with a GPU
> and GPU-enabled builds should, for now, be done in two passes:
> - tests using GPU Verlet kernels: make check;
> - tests using CPU Verlet kernels: CUDA_VISIBLE_DEVICES="" make check
>   (or use GMX_DISABLE_GPU_DETECTION in case of detection issues)
>
> >
> > As I was trying to explain in the original email, the binary works fine on
> > a node with proper graphics driver, but crashes on a node where the
> > graphics driver is older than the CUDA SDK used in compilation. I think
> > updating the driver may potentially enable the GPU part. Pure CPU
>
> I understood, that's what my comment regarding the less than graceful
> handling of some GPU detection cases was about. We'll improve this
> behaviour in one of the upcoming versions.
>
> > calculation with the same binary does not seem to work. It is not clear to me
> > if this is caused by the compiler.
> > It's not really simple to update gcc to 4.7 or greater since we use
> > CentOS 5.x in the company. Even CentOS 6.x uses gcc 4.4.x as default.
> >
> > I've just tested the code with -nb cpu. It still crashes. The binary
>
> Have you tried setting the aforementioned environment variable,
> GMX_DISABLE_GPU_DETECTION?
>
> > compiled without GPU works as expected and passed all regression tests. For
> > now, I can keep separate binaries for GPU and CPU applications until I can
> > get gcc 4.7 or greater installed.
>
> Have you built the correctly functioning mdrun without GPU support on
> the same machine with the same compiler and libraries as your
> problematic GPU-enabled builds? While performance-wise it is far from
> the best choice, AFAIK gcc 4.4 should work OK - at least in our
> automated tests it does.
> Hence, the fact that you are using gcc 4.4 should not result in a
> crash when switching to a CPU-only run.
>
> I would appreciate it if you could open an issue on redmine.gromacs.org,
> describe the behaviour you are seeing, and provide as much of the
> following information as possible:
> - log files produced with the crashing binary;
> - result of running with GMX_DISABLE_GPU_DETECTION;
> - a backtrace of the crash (build with
>   CMAKE_BUILD_TYPE=RelWithDebInfo, run in gdb, type "bt" after the crash
>   occurs and provide the output) and/or
> - run with mdrun -debug 1 and provide the mdrun.debug output.
>
> With the above information we should be able to judge what is causing
> the problem.
>
> Cheers,
> --
> Szilárd
>
> > Best regards,
> > Guanglei
> >
> >
> > On Mon, Sep 9, 2013 at 4:35 PM, Szilárd Páll <szilard.p...@cbr.su.se> wrote:
> >
> >> Hi,
> >>
> >> First of all, icc 11 is not well tested and there have been reports
> >> about it compiling broken code. This could explain the crash, but
> >> you'd need to do a bit more testing to confirm.
> >> Regarding the GPU detection error: this happens if you use a driver
> >> that is incompatible with the CUDA runtime (the driver must support at
> >> least as high an API version as the runtime; see the last two lines of
> >> the mdrun log header). At the moment, some such cases are not handled
> >> particularly gracefully.
> >>
> >> A few things to try:
> >> - use gcc, 4.7 is as fast or faster than any icc;
> >> - run with the "-nb cpu" option; does it still crash?
> >> - run with GPU detection completely disabled*
> >> - run the regressiontests; try using CPUs only*
> >>
> >> *You can set the GMX_DISABLE_GPU_DETECTION environment variable to
> >> completely disable the GPU detection.
> >>
> >> Cheers,
> >> --
> >> Szilárd
> >>
> >>
> >> On Mon, Sep 9, 2013 at 9:52 PM, Guanglei Cui
> >> <amber.mail.arch...@gmail.com> wrote:
> >> > Dear GMX users,
> >> >
> >> > I recently compiled Gromacs 4.6.3 with CUDA (Intel compiler 11.x, SSE2,
> >> > and CUDA SDK 5.0.35). I was doing a test run with simply 'mdrun -deffnm
> >> > eq2_npt_verlet' (letting mdrun figure out what to use). I received the
> >> > error telling me my graphics driver was older than the CUDA SDK, and
> >> > regular CPU code would be used instead. Then, it crashed with a
> >> > Segmentation Fault. The code runs properly on another node where the
> >> > graphics driver is more up to date. I wonder if the crashing is somewhat
> >> > expected, and therefore I should prepare different binaries based on the
> >> > capabilities of different nodes. Thanks.
> >> >
> >> > Best regards,
> >> > --
> >> > Guanglei Cui
> >> > --
> >> > gmx-users mailing list gmx-users@gromacs.org
> >> > http://lists.gromacs.org/mailman/listinfo/gmx-users
> >> > * Please search the archive at
> >> > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> >> > * Please don't post (un)subscribe requests to the list. Use the
> >> > www interface or send it to gmx-users-requ...@gromacs.org.
> >> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
> > --
> > Guanglei Cui

--
Guanglei Cui