# GPU Graphics Driver Set-Up

Tux02 has the Tesla K80 (GK210GL) GPU.  For machine learning, we want the official proprietary NVIDIA drivers.

## Installation

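* Debian 12 split its non-free firmware into a separate "non-free-firmware" component, while the NVIDIA driver packages remain in "non-free".  Make sure "/etc/apt/sources.list" enables both, as in the line below, and run "sudo apt update":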

```
deb http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware
```
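
To confirm that the components are really enabled before installing anything, a small check along these lines may help.  It is only a sketch: "has_component" is a hypothetical helper name, and it assumes the classic one-line "deb ..." format rather than deb822 ".sources" files.

```shell
# Hypothetical helper: succeed if any active "deb" line on stdin lists
# the given component (exact match, so "non-free" does not match
# "non-free-firmware").
has_component() {
    awk -v comp="$1" '
        $1 == "deb" {
            # Fields 1-3 are "deb", the URI and the suite; the rest are components.
            for (i = 4; i <= NF; i++) if ($i == comp) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Check the file edited in the step above.
for comp in non-free non-free-firmware; do
    if [ -r /etc/apt/sources.list ] && has_component "$comp" < /etc/apt/sources.list; then
        echo "$comp: enabled"
    else
        echo "$comp: not enabled in /etc/apt/sources.list"
    fi
done
```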

* Make sure the correct kernel headers are installed:

```
sudo apt install linux-headers-$(uname -r)
```
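
The driver module is built against the running kernel, so it can be worth checking that the headers landed where the build expects them.  The sketch below assumes the build tree is reachable at "/lib/modules/<release>/build"; "headers_present" and its optional root argument are hypothetical names used for illustration.

```shell
# Hypothetical helper: succeed if a kernel build tree exists for the
# given release.  The optional second argument is an alternative root
# directory, which is handy for dry runs.
headers_present() {
    release="$1"
    root="${2:-}"
    # -d follows symlinks, so a "build" symlink into /usr/src also counts.
    [ -d "$root/lib/modules/$release/build" ]
}

if headers_present "$(uname -r)"; then
    echo "kernel headers found for $(uname -r)"
else
    echo "no build tree for $(uname -r); install linux-headers-$(uname -r)"
fi
```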

* Install "nvidia-tesla-470-driver"⁰.  NVIDIA's "Tesla" line of programmable devices, used primarily for simulations and large-scale calculations, needs its own driver packages, separate from those for the consumer-grade GeForce GPUs aimed at desktop and gaming use¹.  Purge any previously installed NVIDIA packages first:

```
sudo apt purge 'nvidia-*'
sudo apt install nvidia-tesla-470-driver
```

* Blacklist nouveau, since it conflicts with NVIDIA's driver, then regenerate the initramfs with "sudo update-initramfs -u":

```
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/blacklist-nouveau.conf
```
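
After the reboot, the blacklist can be verified with something like the following.  This is a sketch under assumptions: "is_blacklisted" is a hypothetical helper, and it only looks for a plain "blacklist <module>" line in the files it is given.

```shell
# Hypothetical helper: succeed if any of the given files contains a
# "blacklist <module>" line.  -s keeps grep quiet about missing files.
is_blacklisted() {
    module="$1"
    shift
    grep -qsE "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*\$" "$@"
}

if is_blacklisted nouveau /etc/modprobe.d/*.conf; then
    echo "nouveau is blacklisted"
else
    echo "no blacklist entry for nouveau found"
fi

# /proc/modules lists the modules that are currently loaded.
if grep -q '^nouveau ' /proc/modules 2>/dev/null; then
    echo "nouveau is still loaded"
else
    echo "nouveau is not loaded"
fi
```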

* Reboot, then test the NVIDIA drivers once the machine is back up:

```
sudo reboot
nvidia-smi

# Optional: install the CUDA development packages and toolkit
sudo apt install nvidia-cuda-dev nvidia-cuda-toolkit
```
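
For a scripted check rather than reading the "nvidia-smi" table by eye, the driver version can be queried and compared against the expected 470 branch.  A minimal sketch, where "in_branch" is a hypothetical helper and the check assumes version strings of the form "470.x.y":

```shell
# Hypothetical helper: succeed if a version string such as "470.129.06"
# belongs to the given major branch.
in_branch() {
    case "$1" in
        "$2".*) return 0 ;;
        *) return 1 ;;
    esac
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # One line per GPU; the first line is enough for a driver check.
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if in_branch "$version" 470; then
        echo "driver $version is on the 470 branch"
    else
        echo "unexpected driver version: $version"
    fi
else
    echo "nvidia-smi not found; the driver does not appear to be installed"
fi
```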

## Issues

Holding off on the reboot until I check in with the rest of the team about a raspi-firmware hook that fails when the initramfs is regenerated:

```
update-initramfs: Generating /boot/initrd.img-6.1.0-9-amd64
raspi-firmware: missing /boot/firmware, did you forget to mount it?
run-parts: /etc/initramfs/post-update.d//z50-raspi-firmware exited with return code 1
dpkg: error processing package initramfs-tools (--configure):
 installed initramfs-tools package post-installation script subprocess returned error exit status 1
Processing triggers for libgdk-pixbuf-2.0-0:amd64 (2.42.10+dfsg-1+deb12u1) ...
Errors were encountered while processing:
 initramfs-tools
```

Removed the raspi-firmware package, then finished configuring the pending packages and regenerated the initramfs:

```
sudo apt purge raspi-firmware

# Configure all packages that are installed but not yet fully configured
sudo dpkg --configure -a

# Update initramfs since we updated our drivers
sudo update-initramfs -u
```
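
To see which hooks run after an initramfs update, and which package each one belongs to, something like the sketch below can be used.  "list_hooks" is a hypothetical helper; the default directory is the one named in the error output above.

```shell
# Hypothetical helper: print the names of the executable files in a hook
# directory.  Defaults to the directory from the error message above.
list_hooks() {
    dir="${1:-/etc/initramfs/post-update.d}"
    [ -d "$dir" ] || return 0
    for hook in "$dir"/*; do
        [ -f "$hook" ] && [ -x "$hook" ] && basename "$hook"
    done
    return 0
}

# Ask dpkg which package owns each hook, if dpkg is available.
if command -v dpkg >/dev/null 2>&1; then
    list_hooks | while read -r hook; do
        dpkg -S "/etc/initramfs/post-update.d/$hook" 2>/dev/null \
            || echo "$hook: not owned by any installed package"
    done
fi
```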

## References

=> https://us.download.nvidia.com/XFree86/Linux-x86_64/470.129.06/README/supportedchips.html ⁰ NVIDIA 470.129.06 Supported Chipsets.
=> https://wiki.debian.org/NvidiaGraphicsDrivers#Tesla_Drivers ¹ Debian Tesla Drivers.
=> https://wiki.debian.org/NvidiaGraphicsDrivers/Configuration ² NVIDIA Proprietary Driver: Configuration.