I do not know if this would work with this kind of computation but I would 
suggest you try and run the programme under gdb.

This should tell you where things go wrong. You might have to recompile the 
programme with debugging symbols enabled.
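
Something along these lines might do as a starting point. It is an untested 
sketch: it assumes gdb is installed and namd2 is on your PATH, and it reuses 
the single-GPU command line from your message. The log file name gdb-namd.log 
is only a placeholder.

```shell
#!/bin/sh
# Sketch only: run NAMD under gdb in batch mode and ask for a backtrace
# once the program stops. Assumes gdb and namd2 are both on PATH.
GDB_CMD="gdb -batch -ex run -ex bt --args namd2 +idlepoll +p12 +devices 0 min.conf"
echo "$GDB_CMD"

# Only launch if both tools are actually installed.
if command -v gdb >/dev/null 2>&1 && command -v namd2 >/dev/null 2>&1; then
    # gdb-namd.log is a placeholder name for the captured output.
    $GDB_CMD > gdb-namd.log 2>&1
fi
```

If NAMD reports the error and then exits by itself instead of crashing, gdb 
will have no stack left to show, and you would need to set a breakpoint 
before the exit. Depending on your CUDA toolkit version, cuda-gdb or 
cuda-memcheck may also be worth a try, since they look at the GPU side.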

Peter 

Sent from my phone. Please forgive misspellings and weird “corrections”

> On 20 Nov 2022, at 18:08, Francesco Pietra <chiendar...@gmail.com> wrote:
> 
> 
> Hello
> Main board GA-X79-UD3 with two 680 GPUs
> Debian10 Linux,
> kernel 5.10.0-19-amd64
> OpenGL 4.6.0 
> nvidia driver 470.141.03
> Months ago, following updating/upgrading of amd64, the GPUs, while rendering 
> correctly, became unable to run classical molecular dynamics simulations. 
> Launching a minimization with software NAMD with both GPUs or with one of 
> them (by software or even by removing one GPU)
> 
> namd2 +idlepoll +p12 +devices 0,1 min.conf
> namd2 +idlepoll +p12 +devices 0 min.conf
> namd2 +idlepoll +p12 +devices 1 min.conf
> 
> NAMD organizes the simulation correctly but at the stage of starting the 
> computation, accessing memory, a crash occurs with error
> 
>> TCL: Minimizing for 3000 steps
>> FATAL ERROR: CUDA error cudaStreamSynchronize(stream) in file 
>> src/CudaTileListKernel.cu, function buildTileLists, line 1136
>> on Pe 4 (gig64 device 0 pci 0:2:0): an illegal memory access was encountered
>> FATAL ERROR: CUDA error in ComputeBondedCUDA::forceDoneCheck after polling 
>> 48 times over 0.005047 s on Pe 8 (gig64 device 1 pci 0:3:0): an illegal 
>> memory access was encountered
>> FATAL ERROR: CUDA error cudaStreamSynchronize(stream) in file 
>> src/CudaTileListKernel.cu, function buildTileLists, line 1136
>> on Pe 4 (gig64 device 0 pci 0:2:0): an illegal memory access was encountered
>> FATAL ERROR: CUDA error in ComputeBondedCUDA::forceDoneCheck after polling 
>> 48 times over 0.005047 s on Pe 8 (gig64 device 1 pci 0:3:0): an illegal 
>> memory access was encountered
>> [Partition 0][Node 0] End of program 
> 
> "illegal memory access" is a software error (as also proven by using 
> alternatively one of the two GPUs) that escapes all my attempts at unraveling 
> its origin. I had no clues from NAMD forum. Hope here.
> 
> Thanks for your kind attention
> 
> francesco pietra
> 
> 
> 
