Docker Installation: Difference between revisions
DrEdWilliams m (added Debian nvidia-docker install directions)
Revision as of 03:52, 19 December 2020
This installs the official Docker distribution from the community edition repository.
Fedora
The Docker team officially supports Fedora, but the rapid pace of Fedora releases (every 6 months) and the changing cgroups support (Fedora has moved to v2 but Docker is still on v1) have led to delays with the Fedora 33 release of docker-ce. I expect this will happen again for future releases, so the instructions below accommodate this possibility.
Prerequisites
- Basic Fedora 31+ Installation
- CUDA + NVIDIA drivers (if GPU is present)
Set up Docker-CE Repository
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
dnf list docker\*
If you don't see the docker-ce* packages in the listing, you need to enable the test repository (CAUTION):
sudo dnf config-manager --set-enabled docker-ce-test
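The check described above can be scripted so the test repository is only enabled when it is actually needed. This is a minimal sketch, not part of the original instructions: the helper name has_docker_ce is hypothetical, and it assumes the listing prints package names in the usual name.arch form (for example docker-ce.x86_64) at the start of a line.

```shell
# Hypothetical helper: reads a package listing on stdin (intended input is
# the output of `dnf list docker\*`) and succeeds only if a docker-ce
# package line is present. Assumes the name.arch format, e.g. docker-ce.x86_64.
has_docker_ce() {
    grep -q '^docker-ce\.'
}

# Usage (enables the test repository only when docker-ce is missing):
#   dnf list docker\* 2>/dev/null | has_docker_ce \
#       || sudo dnf config-manager --set-enabled docker-ce-test
```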
Then install the packages:
sudo dnf install docker-ce docker-ce-cli containerd.io
CentOS
( originally from https://docs.docker.com/engine/installation/linux/centos/ )
Prerequisites
- Basic CentOS 7 Installation or Basic CentOS 8 Installation
- CUDA + NVIDIA drivers (if GPU is present)
Set up Docker-CE Repository
sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
Install Docker
sudo yum install -y docker-ce docker-ce-cli containerd.io
Debian
( originally from https://docs.docker.com/engine/installation/linux/debian/ )
Prerequisites
Set up Docker-CE Repository
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg2 software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
sudo apt-get update
Install Docker
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
Finishing the Install
Make sure the docker daemon is running and set up to start automatically (should already be done by the package install):
sudo systemctl enable --now docker
Test installation
sudo docker run --rm hello-world
sudo docker image rm -f hello-world
Enable standard user access to docker commands (requires $USER to log out and back in to activate)
sudo groupadd docker
sudo usermod -aG docker $USER
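Because the group change only takes effect in a new login session, it can help to check whether the current session already has it. The following is a small sketch under that assumption; the helper name in_docker_group is hypothetical. It relies on `id -nG` (with no user name) reporting the groups of the current process rather than the group database.

```shell
# Hypothetical helper: succeeds if any of its arguments is exactly "docker".
# Intended to be called with the group names of the current session.
in_docker_group() {
    for g in "$@"; do
        [ "$g" = "docker" ] && return 0
    done
    return 1
}

# Usage (unquoted on purpose so each group name becomes one argument):
#   in_docker_group $(id -nG) || echo "log out and back in to activate the docker group"
```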
In order to access the local repositories, we need to copy the certs into the local docker config
sudo scp -r aslan:/etc/docker/certs.d /etc/docker
Copy over the .docker directory from aslan for both root and user(s) to get login credentials
sudo scp -r aslan:.docker /root
scp -r aslan:.docker ~
GPU Nodes
On GPU-enabled nodes, install the nvidia runtime.
(originally from https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker)
CentOS 7/8
sudo yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/centos7/x86_64/nvidia-docker.repo
sudo yum install -y nvidia-docker2
Debian 10
Set up the stable repository and the GPG key:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
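The $distribution value is simply ID and VERSION_ID from /etc/os-release joined together, so on Debian 10 it should come out as debian10. To preview the value (and the resulting repository URL) before touching the apt configuration, something like the sketch below may be useful. The function names and the optional file argument are hypothetical additions for illustration; the command in the text always reads /etc/os-release.

```shell
# Hypothetical helper: print ID followed by VERSION_ID from an os-release
# file. Runs in a subshell so the sourced variables do not leak.
# The optional file argument exists only for illustration and testing.
distribution_id() (
    . "${1:-/etc/os-release}"
    echo "$ID$VERSION_ID"
)

# Hypothetical helper: print the nvidia-docker list URL for a distribution id.
nvidia_docker_list_url() {
    echo "https://nvidia.github.io/nvidia-docker/$1/nvidia-docker.list"
}

# Usage:
#   nvidia_docker_list_url "$(distribution_id)"
```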
Note: To get access to experimental features such as CUDA on WSL or the new MIG capability on A100, you may want to add the experimental branch to the repository listing:
curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
Install the nvidia-docker2 package (and dependencies) after updating the package listing:
sudo apt-get update
sudo apt-get install -y nvidia-docker2
Finishing the install
Restart the Docker daemon to complete the installation after setting the default runtime:
sudo systemctl restart docker
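The restart step mentions setting the default runtime but does not show how. One common way, sketched here as an assumption rather than as the method this page originally used, is to write it into /etc/docker/daemon.json before restarting. The function name is hypothetical; "default-runtime" and "runtimes" are standard Docker daemon options, and the sketch assumes nvidia-docker2 has put nvidia-container-runtime on the PATH.

```shell
# Hypothetical helper: print a daemon.json that registers the nvidia runtime
# and makes it the default. Assumes nvidia-container-runtime is on the PATH.
nvidia_daemon_json() {
    cat <<'EOF'
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
}

# Usage (CAUTION: overwrites any existing daemon.json, so merge by hand if
# the file already holds other settings):
#   nvidia_daemon_json | sudo tee /etc/docker/daemon.json
#   sudo systemctl restart docker
```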
Test nvidia-smi with an official CUDA image:
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
Test the GPU performance using a simple NVIDIA GPU Cloud container with the CUDA nbody sample program
docker run -it --rm nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -benchmark -fp64
Reloading Repository Certificates
Rather frequently, it seems, the NVIDIA folks invalidate their signing certificates for their repositories. When that happens, you just need to delete the certificates and let the 'yum' command reload them on the next update:
DIST=$(sed -n 's/releasever=//p' /etc/yum.conf)
DIST=${DIST:-$(. /etc/os-release; echo $VERSION_ID)}
sudo rpm -e gpg-pubkey-f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/nvidia-container-runtime/gpgdir --delete-key f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/libnvidia-container/gpgdir --delete-key f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/nvidia-docker/gpgdir --delete-key f796ecb0
sudo yum -y makecache
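The $DIST value in the gpgdir paths is the yum release version: releasever from /etc/yum.conf if it is set there, otherwise VERSION_ID from /etc/os-release. The sketch below wraps that lookup and the three repository paths into functions so they can be inspected before deleting anything. The function names and the file arguments are hypothetical additions for illustration.

```shell
# Hypothetical helper: resolve the release version, preferring releasever=
# from yum.conf and falling back to VERSION_ID from os-release.
# Both file arguments are optional and exist only for illustration.
yum_dist() (
    conf=${1:-/etc/yum.conf}
    osrel=${2:-/etc/os-release}
    DIST=$(sed -n 's/releasever=//p' "$conf" 2>/dev/null || true)
    DIST=${DIST:-$(. "$osrel"; echo "$VERSION_ID")}
    echo "$DIST"
)

# Hypothetical helper: print the gpgdir of each NVIDIA repository for the
# release version given as the first argument.
nvidia_gpgdirs() {
    for repo in nvidia-container-runtime libnvidia-container nvidia-docker; do
        echo "/var/lib/yum/repos/$(uname -m)/$1/$repo/gpgdir"
    done
}

# Usage:
#   for dir in $(nvidia_gpgdirs "$(yum_dist)"); do
#       sudo gpg --homedir "$dir" --delete-key f796ecb0
#   done
```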