CUDA + NVIDIA drivers

From WilliamsNet Wiki

Revision as of 01:14, 20 January 2021

Installation

Download the repo RPM from http://developer.nvidia.com/cuda-downloads (or copy the repo file over from a similar system). For further information, see the Installation Guide for Linux and the CUDA Quick Start Guide.

Fedora 33

Install the repo and then deploy the full package:

sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora33/x86_64/cuda-fedora33.repo
sudo dnf clean all
sudo dnf -y module install nvidia-driver:latest-dkms
sudo dnf -y install cuda

CentOS 7

The driver packages no longer seem to pull in the Linux kernel devel package as a dependency, and without it the driver install silently fails. Install it first:

sudo yum install kernel-devel
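
DKMS builds the module against the headers of a specific kernel, so it can help to install the kernel-devel package that matches the running kernel rather than whatever is newest in the repo. A minimal sketch, assuming the usual kernel-devel-<release> package naming on CentOS; the function name kernel_devel_pkg is hypothetical and not part of any tool:

```shell
# Hypothetical helper: print the kernel-devel package name for a kernel
# release. With no argument it uses the running kernel (uname -r).
kernel_devel_pkg() {
    printf 'kernel-devel-%s\n' "${1:-$(uname -r)}"
}
```

It could be used as: rpm -q "$(kernel_devel_pkg)" || sudo yum -y install "$(kernel_devel_pkg)"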

Now install the repository:

sudo yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
sudo yum clean all

On regular compute nodes where the full CUDA libraries aren't needed:

sudo yum install nvidia-driver-latest-dkms

On a full workstation or where CUDA is needed:

sudo yum -y install nvidia-driver-latest-dkms cuda
sudo yum -y install cuda-drivers
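
When the same setup script is used on both kinds of machine, the package choice above can be kept in one place. This is only a sketch: the function name packages_for_role and the role names compute and workstation are made up for illustration, and the package lists are copied from the commands above.

```shell
# Hypothetical helper: map a machine role to the packages listed above.
packages_for_role() {
    case "$1" in
        compute)     echo "nvidia-driver-latest-dkms" ;;
        workstation) echo "nvidia-driver-latest-dkms cuda cuda-drivers" ;;
        *)           echo "unknown role: $1" >&2; return 1 ;;
    esac
}
```

For example: sudo yum -y install $(packages_for_role compute)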

There is a yum plugin that facilitates the installation and management of NVIDIA kernel modules:

sudo yum -y install yum-plugin-nvidia

Debian 10

Install the repo and then deploy the full package:

sudo apt-get install -y software-properties-common
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/7fa2af80.pub
sudo add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/debian10/x86_64/ /"
sudo add-apt-repository contrib
sudo apt-get update
sudo apt-get -y install cuda

Updating drivers

The process above installs the CUDA repository for yum. Most of the time updates will happen automatically, but there are some things to consider when updating.

The command examples are for CentOS; use the commands and package set appropriate for the OS in use.

Reinstalling

Occasionally you just need to uninstall and reinstall the drivers. Rather than uninstalling and reinstalling everything, you can do just the kernel module package:

sudo yum -y remove kmod-nvidia-dkms-latest
sudo yum -y install kmod-nvidia-dkms-latest
sudo reboot

Unfortunately, you cannot use the 'yum reinstall' command: it just overlays the package with a new copy from the .rpm file. DKMS sees that the module already exists (whether it works or not) and does nothing. The reboot is needed so that the kernel loads the reinstalled drivers.
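
Before and after a reinstall it can be useful to check what DKMS thinks it has built. The sketch below reads the output of 'dkms status' on stdin and succeeds only if the nvidia module is reported as installed for a given kernel release. The function name nvidia_dkms_installed is hypothetical, and the two line formats it accepts ("nvidia, <version>, <kernel>, <arch>: installed" and "nvidia/<version>, <kernel>, <arch>: installed") are an assumption about what your DKMS version prints, so check them against real output first.

```shell
# Hypothetical helper: exit 0 if `dkms status` output on stdin reports the
# nvidia module as installed for kernel release $1, otherwise exit 1.
nvidia_dkms_installed() {
    awk -v k="$1" '
        # Match lines for the nvidia module that name this kernel release
        # and end in the "installed" state.
        $0 ~ "^nvidia[,/]" && index($0, ", " k ",") > 0 && $0 ~ ": installed" {
            found = 1
        }
        END { exit found ? 0 : 1 }
    '
}
```

For example: dkms status | nvidia_dkms_installed "$(uname -r)" || echo "nvidia module not built for this kernel"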

Kernel Updates

Most of the time, a driver update by itself will work fine, as will a kernel update. When both happen at the same time, problems can occur. It is best for the kernel update to occur first, even if you have to do it manually:

sudo yum -y update kernel kernel-devel

Then you need to reboot (sorry). This way, when the driver update is applied, it is applied to the current (new) kernel version:

sudo yum -y update kmod-nvidia-dkms-latest

This will require another reboot to activate the new modules.
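
The two-step order above (kernel, reboot, driver, reboot) can be sketched as a small script that is run once per phase, with a reboot in between. The names run, update_phase and DRY_RUN are made up for illustration; by default the sketch only prints the yum command it would run, and the package names are the ones used on this page.

```shell
# Print the command when DRY_RUN=1 (the default here); otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: run one phase of the update, then reboot by hand.
update_phase() {
    case "$1" in
        kernel) run sudo yum -y update kernel kernel-devel ;;
        driver) run sudo yum -y update kmod-nvidia-dkms-latest ;;
        *)      echo "usage: update_phase kernel|driver" >&2; return 2 ;;
    esac
}
```

Run update_phase kernel, reboot, then update_phase driver and reboot again; set DRY_RUN=0 to actually execute the commands.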

DKMS Issues

Sometimes the update process will not recognize that it can do an update for a particular kernel. In this case, you need to completely remove the driver, reboot, and reinstall. When it happens, it is usually in conjunction with a kernel update (see above), so do the kernel update after removing the drivers, reboot, and then reinstall the driver:

sudo yum -y remove kmod-nvidia-dkms-latest
<update kernel if needed>
sudo reboot

sudo yum -y install kmod-nvidia-dkms-latest

Ignore any warnings about not finding the latest version of the kmod-nvidia-dkms package; the process will create it.