Public bug reported:

Description:    Ubuntu 18.04 LTS
Release:        18.04
Expected behavior: profile output
Actual behavior: error messages

Reproduce as follows:

cd NVIDIA_CUDA-9.1_Samples/0_Simple/matrixMul
nvcc -I ../../common/inc matrixMul.cu -o matrixMul

# check the exe works
./matrixMul
[Matrix Multiply Using CUDA] - Starting...
GPU Device 0: "GeForce GTX 1080" with compute capability 6.1

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 1137.23 GFlop/s, Time= 0.115 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

# now try nvprof
nvprof ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
==4775== NVPROF is profiling process 4775, command: ./matrixMul
GPU Device 0: "GeForce GTX 1080" with compute capability 6.1

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
==4775== Error: Internal profiling error 4168:999.
Performance= 1130.40 GFlop/s, Time= 0.116 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
======== Error: CUDA profiling error.

# run with sudo
sudo nvprof ./matrixMul
[Matrix Multiply Using CUDA] - Starting...
==4797== NVPROF is profiling process 4797, command: ./matrixMul
GPU Device 0: "GeForce GTX 1080" with compute capability 6.1

MatrixA(320,320), MatrixB(640,320)
Computing result using CUDA Kernel...
done
Performance= 1132.95 GFlop/s, Time= 0.116 msec, Size= 131072000 Ops, WorkgroupSize= 1024 threads/block
Checking computed result for correctness: Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
==4797== Profiling application: ./matrixMul
==4797== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   99.54%  34.644ms       301  115.10us  114.15us  116.07us  void matrixMulCUDA<int=32>(float*, float*, float*, int, int)
                    0.28%  98.465us         2  49.232us  32.960us  65.505us  [CUDA memcpy HtoD]
                    0.18%  62.944us         1  62.944us  62.944us  62.944us  [CUDA memcpy DtoH]
      API calls:   74.77%  110.27ms         3  36.757ms  3.4300us  110.26ms  cudaMalloc
                   22.45%  33.105ms         1  33.105ms  33.105ms  33.105ms  cudaEventSynchronize
                    0.93%  1.3780ms         3  459.33us  427.70us  478.26us  cudaGetDeviceProperties
                    0.81%  1.1874ms       301  3.9440us  3.7260us  18.511us  cudaLaunch
                    0.36%  536.51us         3  178.84us  56.346us  363.23us  cudaMemcpy
                    0.31%  451.50us        94  4.8030us     301ns  228.31us  cuDeviceGetAttribute
                    0.11%  156.37us         1  156.37us  156.37us  156.37us  cudaDeviceSynchronize
                    0.09%  132.82us      1505      88ns      79ns     289ns  cudaSetupArgument
                    0.07%  100.43us         3  33.475us  4.3440us  83.746us  cudaFree
                    0.06%  82.848us         1  82.848us  82.848us  82.848us  cuDeviceTotalMem
                    0.02%  35.673us       301     118ns     110ns     801ns  cudaConfigureCall
                    0.02%  33.788us         1  33.788us  33.788us  33.788us  cuDeviceGetName
                    0.00%  5.3080us         2  2.6540us  2.2050us  3.1030us  cudaEventRecord
                    0.00%  3.2350us         2  1.6170us  1.0960us  2.1390us  cudaEventCreate
                    0.00%  2.8120us         1  2.8120us  2.8120us  2.8120us  cudaSetDevice
                    0.00%  2.0920us         1  2.0920us  2.0920us  2.0920us  cudaEventElapsedTime
                    0.00%  1.7410us         3     580ns     292ns  1.0710us  cuDeviceGetCount
                    0.00%  1.0230us         2     511ns     353ns     670ns  cuDeviceGet
                    0.00%     658ns         1     658ns     658ns     658ns  cudaGetDeviceCount

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: nvidia-profiler 9.1.85-3
ProcVersionSignature: Ubuntu 4.15.0-20.21-generic 4.15.17
Uname: Linux 4.15.0-20-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.9-0ubuntu7
Architecture: amd64
Date: Thu Apr 26 17:28:48 2018
Dependencies:
 gcc-8-base 8-20180414-1ubuntu2
 libc6 2.27-3ubuntu1
 libcuinj64-9.1 9.1.85-3
 libgcc1 1:8-20180414-1ubuntu2
InstallationDate: Installed on 2018-04-21 (5 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Alpha amd64 (20180421)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: nvidia-cuda-toolkit
UpgradeStatus: No upgrade log present (probably fresh install)

** Affects: nvidia-cuda-toolkit (Ubuntu)
     Importance: Undecided
         Status: New

** Tags: amd64 apport-bug bionic

--
https://bugs.launchpad.net/bugs/1767205

Title:
  nvprof does not complete without sudo