On 1/28/25 03:50, Anssi Saari wrote:
> Eben King <e...@gmx.us> writes:
>
> I don't know if there's more history to this issue but a couple of
> things come to mind.
>
>> Checking card: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1)
>> Your card is supported by all driver versions.
>> Your card is also supported by the Tesla 470 drivers series.
>
> What about this as an alternative? nvidia-tesla-470-driver instead of
> nvidia-driver?
>
> And what about controlling the fan(s) manually? See for example
> https://askubuntu.com/questions/42494/how-can-i-change-the-nvidia-gpu-fan-speed
Good guess, but that was weird. The GPU was working today for reasons
unknown, but I figured "what could be the harm in setting the fan to a
higher value?". So I tried.

eben@cerberus:~$ sudo nvidia-xconfig --cool-bits=4
[sudo] password for eben:
sudo: nvidia-xconfig: command not found

Apparently nvidia-xconfig is not in root's $PATH. Just for kicks I went
into nvidia-xconfig (as me), checked "Enable GPU Fan Setting", went to
40-some% and hit Apply. The fan immediately went wonky, returning random
values while presumably not actually running because the GPU temp slowly
rose.

Well, the command's for a different OS so YMMV. But one of the comments
said "This will generate a completely new xorg.conf and also adds Option
"Coolbits" "4" to Section "Screen"." Since I currently have no xorg.conf
I wondered why manual fan speed setting was enabled. I used to have an
xorg.conf to which I'd added that, but things happened. A bit of poking
around showed it got renamed to

-rw-r--r-- 1 root root 1226 Aug 31 16:47 /etc/X11/xorg.conf.nvidia

so I did

lrwxrwxrwx 1 root root 16 Jan 28 10:13 /etc/X11/xorg.conf -> xorg.conf.nvidia

and restarted X. No dice, fan's still returning crazy values.

FTR, this is what I mean by "crazy values":

eben@cerberus:~$ ./monitor_fan
2025-01-28 11:23:57 71%
2025-01-28 11:23:58 26%
2025-01-28 11:23:59 0%
2025-01-28 11:24:00 58%
2025-01-28 11:24:01 57%
2025-01-28 11:24:02 103%
2025-01-28 11:24:03 0%
2025-01-28 11:24:04 74%
2025-01-28 11:24:05 18%
2025-01-28 11:24:06 0%
2025-01-28 11:24:07 30%
2025-01-28 11:24:08 75%
2025-01-28 11:24:09 0%
2025-01-28 11:24:11 43%
2025-01-28 11:24:12 34%
2025-01-28 11:24:13 49%
2025-01-28 11:24:14 7%

eben@cerberus:~$ cat monitor_fan
#! /bin/sh

prevSpeed=-1
nvidia-smi --format=csv,noheader,nounits --query-gpu=fan.speed --loop=1 \
| while read currentSpeed ; do
  if [ $currentSpeed -ne $prevSpeed ] ; then
    echo $(date "+%F %T") ${currentSpeed}%
    prevSpeed=$currentSpeed
  fi
done

> Note the possible caveats though.

Yeah.
Worst case, restore /usr from backup.
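
For anyone who wants to reuse monitor_fan, here is a slightly more defensive sketch of the same idea. It assumes nvidia-smi might occasionally print something non-numeric for fan.speed (for example "[N/A]"), which would make the unquoted -ne test error out, and it flags readings above 100 like the 103% in the log. The function name filter_changes and the "(above 100)" / "(non-numeric)" labels are made up for illustration, not anything the driver provides.

```shell
#!/bin/sh
# Sketch only: print a timestamped line each time the reading on stdin changes.
# filter_changes is a hypothetical helper name, one reading per input line.
filter_changes() {
    prev=""
    while read -r cur; do
        # Skip repeats so only changes are logged (same idea as the original).
        [ "$cur" = "$prev" ] && continue
        prev=$cur
        case $cur in
            ''|*[!0-9]*)
                # Not a plain non-negative integer: report it verbatim
                # instead of handing it to an integer comparison.
                shown="$cur (non-numeric)"
                ;;
            *)
                if [ "$cur" -gt 100 ]; then
                    shown="${cur}% (above 100)"
                else
                    shown="${cur}%"
                fi
                ;;
        esac
        printf '%s %s\n' "$(date '+%F %T')" "$shown"
    done
}

# Intended use, with the same nvidia-smi options as the original script:
#   nvidia-smi --format=csv,noheader,nounits --query-gpu=fan.speed --loop=1 \
#     | filter_changes
```

Keeping the loop in a function that reads stdin means it can be fed canned values with printf to check the formatting without touching the GPU.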