Docker Installation
This installs the official Docker distribution from the Docker Community Edition (docker-ce) repository.
Fedora
The Docker team officially supports Fedora, but the rapid pace of Fedora releases (every 6 months) and the changing cgroups support (Fedora has moved to cgroups v2, while Docker still uses v1) have led to delays with the Fedora 33 release of docker-ce. I expect this will happen again for future releases, so the instructions below accommodate this possibility.
Prerequisites
- Basic Fedora 31+ Installation
- CUDA + NVIDIA drivers (if GPU is present)
Set up Docker-CE Repository
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf list docker\*
If you don't see the docker-ce* packages in the listing, you need to enable the test repository (CAUTION):
sudo dnf config-manager --set-enabled docker-ce-test
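The check described above (look for docker-ce in the listing, and fall back to the test repository only if it is missing) can be scripted. This is a minimal sketch, assuming the listing has the usual `name.arch  version  repo` layout; `has_docker_ce` is a hypothetical helper name, not part of dnf.

```shell
# Hypothetical helper: succeeds if a `dnf list`-style listing on stdin
# contains the docker-ce package itself (not docker-ce-cli or docker-compose).
# Assumes each package line starts with "name.arch" followed by whitespace.
has_docker_ce() {
    grep -Eq '^docker-ce\.[[:alnum:]_]+[[:space:]]'
}

# Usage (not run here, requires dnf and sudo):
#   if ! dnf list 'docker*' 2>/dev/null | has_docker_ce; then
#       sudo dnf config-manager --set-enabled docker-ce-test
#   fi
```

Keeping the test against the listing separate from the dnf call means the test repository is only enabled when the stable one really has no docker-ce package.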
Then install the packages:
sudo dnf install docker-ce docker-ce-cli containerd.io
CentOS
( originally from https://docs.docker.com/engine/installation/linux/centos/ )
Prerequisites
- Basic CentOS 7 Installation or Basic CentOS 8 Installation
- CUDA + NVIDIA drivers (if GPU is present)
Set up Docker-CE Repository
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
Install Docker
sudo yum install -y docker-ce docker-ce-cli containerd.io
Debian
( originally from https://docs.docker.com/engine/installation/linux/debian/ )
Prerequisites
Set up Docker-CE Repository
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg2 software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
sudo apt-get update
Install Docker
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
Finishing the Install
Make sure the docker daemon is running and set up to start automatically (should already be done by the package install):
sudo systemctl enable --now docker
Test installation
sudo docker run --rm hello-world
sudo docker image rm -f hello-world
Enable standard user access to docker commands (requires $USER to log out and back in to activate)
sudo groupadd docker
sudo usermod -aG docker $USER
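Because the group change only takes effect after logging out and back in, it can help to confirm that the current session actually sees the docker group before dropping `sudo`. A minimal sketch; `in_group_list` is a hypothetical helper that checks a space-separated group list such as the output of `id -nG`.

```shell
# Hypothetical helper: succeeds if group name $1 appears as a whole word
# in the space-separated group list $2 (for example, the output of `id -nG`).
in_group_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Usage (not run here):
#   if in_group_list docker "$(id -nG)"; then
#       echo "docker group is active in this session"
#   else
#       echo "log out and back in to pick up the docker group"
#   fi
```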
In order to access the local repositories, we need to copy the certs into the local docker config
sudo scp -r aslan:/etc/docker/certs.d /etc/docker
Copy over the .docker directory from aslan for both root and user(s) to get login credentials
sudo scp -r aslan:.docker /root
scp -r aslan:.docker ~
GPU Nodes
On GPU-enabled nodes, install the nvidia runtime.
(originally from https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker)
CentOS 7/8
sudo yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/centos7/x86_64/nvidia-docker.repo
sudo yum install -y nvidia-docker2
Debian 10
Set up the stable repository and the GPG key:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
Note: To get access to experimental features such as CUDA on WSL or the new MIG capability on A100, you may want to add the experimental branch to the repository listing:
curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
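Both repository URLs above depend on `$distribution`, so it is worth checking what value it resolves to before fetching anything. A minimal sketch of the same `$ID$VERSION_ID` derivation as a function; `distribution_id` and its optional file argument are hypothetical, added only so the logic can be tried against a copy of an os-release file.

```shell
# Hypothetical helper: print ID followed by VERSION_ID from an os-release
# style file (default /etc/os-release). Pass an absolute path. The file is
# sourced in a subshell so its variables do not leak into the caller.
distribution_id() {
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Usage (not run here):
#   distribution=$(distribution_id)
#   echo "Using NVIDIA repository listing for: $distribution"
```

On a Debian 10 host this should yield `debian10`, which is the path component the NVIDIA listing URL expects.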
Install the nvidia-docker2 package (and dependencies) after updating the package listing:
sudo apt-get update
sudo apt-get install -y nvidia-docker2
Finishing the install
Restart the Docker daemon to complete the installation after setting the default runtime:
sudo systemctl restart docker
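The restart above assumes the default runtime has already been set, a step these notes do not spell out. As a rough sketch only, and assuming the runtime is configured through /etc/docker/daemon.json with the `nvidia` runtime entry that the nvidia-docker2 package normally registers, the function below prints a candidate configuration for review. `nvidia_daemon_json` is a hypothetical name, and the output should be merged by hand with any existing settings rather than copied over them.

```shell
# Hypothetical helper: print a candidate daemon.json that makes "nvidia"
# the default runtime. Assumption: nvidia-container-runtime is on PATH,
# as installed by the nvidia-docker2 package. Nothing is written to disk.
nvidia_daemon_json() {
    cat <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
}

# Usage (not run here): compare with the current configuration first.
#   nvidia_daemon_json | diff - /etc/docker/daemon.json
```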
Test nvidia-smi with the latest official CUDA image
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Test the GPU performance using a simple NVIDIA GPU Cloud container with the CUDA nbody sample program
docker run -it --rm nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -benchmark -fp64
Reloading Repository Certificates
Rather frequently, it seems, the NVIDIA folks invalidate their signing certificates for their repositories. When that happens, you just need to delete the certificates and let the 'yum' command reload them on the next update:
DIST=$(sed -n 's/releasever=//p' /etc/yum.conf)
DIST=${DIST:-$(. /etc/os-release; echo $VERSION_ID)}
sudo rpm -e gpg-pubkey-f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/nvidia-container-runtime/gpgdir --delete-key f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/libnvidia-container/gpgdir --delete-key f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/nvidia-docker/gpgdir --delete-key f796ecb0
sudo yum -y makecache
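Since the three `gpg` calls above differ only in the repository name, the procedure can be expressed as a loop. A minimal sketch, assuming the same directory layout as in the commands above; `detect_dist` and `nvidia_gpgdirs` are hypothetical helper names, and the optional file arguments exist only to make the logic easy to try on copies of the config files.

```shell
# Hypothetical helper: print the release version the same way DIST is
# derived above: releasever= from yum.conf if set, else VERSION_ID
# from os-release.
detect_dist() {
    yum_conf=${1:-/etc/yum.conf}
    os_release=${2:-/etc/os-release}
    dist=$(sed -n 's/releasever=//p' "$yum_conf" 2>/dev/null)
    if [ -z "$dist" ]; then
        dist=$( . "$os_release" && printf '%s' "$VERSION_ID" )
    fi
    printf '%s\n' "$dist"
}

# Hypothetical helper: print the gpgdir path of each NVIDIA repository
# for architecture $1 and release $2, one per line.
nvidia_gpgdirs() {
    for repo in nvidia-container-runtime libnvidia-container nvidia-docker; do
        printf '/var/lib/yum/repos/%s/%s/%s/gpgdir\n' "$1" "$2" "$repo"
    done
}

# Usage (not run here, requires sudo and the key id used above):
#   nvidia_gpgdirs "$(uname -m)" "$(detect_dist)" | while read -r dir; do
#       sudo gpg --homedir "$dir" --delete-key f796ecb0
#   done
```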