Actually, I believe this is related to the CUDA architecture itself.

Depending on the device's compute mode, you cannot run multiple CUDA
processes at the same time on the same GPU; in a multi-core environment
this may happen if you start multiple recon-all jobs in parallel.
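For reference, the compute mode that governs whether several processes may share one GPU can be inspected with nvidia-smi (and changed with root privileges). This is only a sketch; the exact flag syntax differs between driver generations, so check `nvidia-smi --help` on your system:

```shell
# Query the compute mode of each GPU; look for the "Compute Mode" field.
# "Default" allows multiple host processes to share one device, while the
# "Exclusive" modes restrict a device to a single process or thread.
nvidia-smi -q -d COMPUTE

# On driver generations of this era the mode was set numerically
# (0 = Default, 1 = Exclusive Thread, 2 = Prohibited, 3 = Exclusive Process).
# Requires root; left commented out here to avoid side effects:
# nvidia-smi -c 0
```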

---------------------------------------------------------------------
Pedro Paulo de Magalhães Oliveira Junior
Director of Operations
Netfilter & SpeedComm Telecom
-- www.netfilter.com.br
-- For mobile: http://www.netfilter.com.br/mobile




On Thu, Aug 19, 2010 at 13:57, Nick Schmansky <ni...@nmr.mgh.harvard.edu> wrote:

> hello cuda beta users!  this problem, 'all CUDA-capable
> devices are busy or unavailable.', seems to fall into the category of
> 'post-release curse', because i am seeing this problem locally as well,
> but haven't seen it in the months we've been using the gpu code.  we have
> found that rebooting the machine seems to work, but that's not a real
> solution.  i suspect our detection scheme is tripping a flag in the
> driver that's not getting untripped or cleared the next time around.
>
> when we find a better solution, we'll post new _cuda libs on our site,
> which i'm expecting will be a regular occurrence over the next few
> months.  glad to see so many willing gpu users though!
>
> n.
>
>
> On Thu, 2010-08-19 at 17:08 +0200, Daniel Guellmar wrote:
> > Hi folks,
> >
> > I'm trying to use the new cuda binaries that come with freesurfer
> > version 5.0.0; however, when I try to execute a cuda binary (e.g.
> > mri_ca_register_cuda), I get the following output:
> >
> >  Acquiring CUDA device
> >  Using default device
> >  CUDA Error in file 'devicemanagement.cu' on line 46 : all CUDA-capable
> > devices are busy or unavailable.
> >
> > This error occurs on two different systems, both of which are cuda
> > capable. Both systems run Ubuntu 9.10 and have the latest developer
> > driver for linux (256.40) and the latest cuda toolkit (3.1) installed.
> > The GPU Computing SDK code samples compile and work fine, and the
> > device query works fine on both hosts ... see the following output
> >
> > Host 1:
> >
> >  CUDA Device Query (Runtime API) version (CUDART static linking)
> >
> > There are 2 devices supporting CUDA
> >
> > Device 0: "Tesla C2050"
> >   CUDA Driver Version:                           3.10
> >   CUDA Runtime Version:                          3.10
> >   CUDA Capability Major revision number:         2
> >   CUDA Capability Minor revision number:         0
> >   Total amount of global memory:                 2817720320 bytes
> >   Number of multiprocessors:                     14
> >   Number of cores:                               448
> >   Total amount of constant memory:               65536 bytes
> >   Total amount of shared memory per block:       49152 bytes
> >   Total number of registers available per block: 32768
> >   Warp size:                                     32
> >   Maximum number of threads per block:           1024
> >   Maximum sizes of each dimension of a block:    1024 x 1024 x 64
> >   Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
> >   Maximum memory pitch:                          2147483647 bytes
> >   Texture alignment:                             512 bytes
> >   Clock rate:                                    1.15 GHz
> >   Concurrent copy and execution:                 Yes
> >   Run time limit on kernels:                     Yes
> >   Integrated:                                    No
> >   Support host page-locked memory mapping:       Yes
> >   Compute mode:                                  Default (multiple host
> > threads can use this device simultaneously)
> >   Concurrent kernel execution:                   Yes
> >   Device has ECC support enabled:                Yes
> >
> > Device 1: "Tesla C2050"
> >   CUDA Driver Version:                           3.10
> >   CUDA Runtime Version:                          3.10
> >   CUDA Capability Major revision number:         2
> >   CUDA Capability Minor revision number:         0
> >   Total amount of global memory:                 2817982464 bytes
> >   Number of multiprocessors:                     14
> >   Number of cores:                               448
> >   Total amount of constant memory:               65536 bytes
> >   Total amount of shared memory per block:       49152 bytes
> >   Total number of registers available per block: 32768
> >   Warp size:                                     32
> >   Maximum number of threads per block:           1024
> >   Maximum sizes of each dimension of a block:    1024 x 1024 x 64
> >   Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
> >   Maximum memory pitch:                          2147483647 bytes
> >   Texture alignment:                             512 bytes
> >   Clock rate:                                    1.15 GHz
> >   Concurrent copy and execution:                 Yes
> >   Run time limit on kernels:                     Yes
> >   Integrated:                                    No
> >   Support host page-locked memory mapping:       Yes
> >   Compute mode:                                  Default (multiple host
> > threads can use this device simultaneously)
> >   Concurrent kernel execution:                   Yes
> >   Device has ECC support enabled:                Yes
> >
> > deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.10, CUDA
> > Runtime Version = 3.10, NumDevs = 2, Device = Tesla C2050, Device =
> > Tesla C2050
> >
> >
> > Host 2:
> >
> >  CUDA Device Query (Runtime API) version (CUDART static linking)
> >
> > There is 1 device supporting CUDA
> >
> > Device 0: "Tesla C1060"
> >   CUDA Driver Version:                           3.10
> >   CUDA Runtime Version:                          3.10
> >   CUDA Capability Major revision number:         1
> >   CUDA Capability Minor revision number:         3
> >   Total amount of global memory:                 4294770688 bytes
> >   Number of multiprocessors:                     30
> >   Number of cores:                               240
> >   Total amount of constant memory:               65536 bytes
> >   Total amount of shared memory per block:       16384 bytes
> >   Total number of registers available per block: 16384
> >   Warp size:                                     32
> >   Maximum number of threads per block:           512
> >   Maximum sizes of each dimension of a block:    512 x 512 x 64
> >   Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
> >   Maximum memory pitch:                          2147483647 bytes
> >   Texture alignment:                             256 bytes
> >   Clock rate:                                    1.30 GHz
> >   Concurrent copy and execution:                 Yes
> >   Run time limit on kernels:                     No
> >   Integrated:                                    No
> >   Support host page-locked memory mapping:       Yes
> >   Compute mode:                                  Default (multiple host
> > threads can use this device simultaneously)
> >   Concurrent kernel execution:                   No
> >   Device has ECC support enabled:                No
> >
> > deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 3.10, CUDA
> > Runtime Version = 3.10, NumDevs = 1, Device = Tesla C1060
> >
> > Any comments on that?
> >
> > Regards and thanks in advance,
> > Daniel
> >
> >
> > --
> >
> > Dr.-Ing. Daniel Güllmar
> > Medical Physics Group / IDIR I
> > Jena University Hospital
> > MRT-Gebäude am Steiger
> > Philosophenweg 3
> > 07743 Jena
> >
> > Tel: +49-3641-9-35373
> > Fax: +49-3641-9-35081
> > www: http://ww.mrt.uni-jena.de
> > ____________________
> > Universitätsklinikum Jena
> > Public-law corporation and constituent body of the
> > Friedrich-Schiller-Universität Jena
> > Bachstraße 18, 07743 Jena
> > Chairman of the Supervisory Board: Prof. Dr. Thomas Deufel; Medical
> > Director: Prof. Dr. Klaus Höffken;
> > Scientific Director: Prof. Dr. Klaus Benndorf; Commercial Director and
> > Spokesman of the Hospital Board: Rudolf Kruse
> > Bank details: Sparkasse Jena; bank code (BLZ): 830 530 30; account no.: 221;
> > Place of jurisdiction: Jena
> > Tax number: 161/144/02978; VAT ID: DE 150545777
> > _______________________________________________
> > Freesurfer mailing list
> > Freesurfer@nmr.mgh.harvard.edu
> > https://mail.nmr.mgh.harvard.edu/mailman/listinfo/freesurfer
> >
> >
>
> The information in this e-mail is intended only for the person to whom it
> is addressed. If you believe this e-mail was sent to you in error and the
> e-mail contains patient information, please contact the Partners Compliance
> HelpLine at http://www.partners.org/complianceline . If the e-mail was sent
> to you in error but does not contain patient information, please contact
> the sender and properly dispose of the e-mail.
>