Docker Installation

From WilliamsNet Wiki

This installs the official Docker distribution from the community edition repository.


Fedora

The Docker team officially supports Fedora, but Fedora's rapid release pace (every 6 months) and its changing cgroups support (Fedora has moved to v2, but Docker is still on v1) have delayed the docker-ce release for Fedora 33. I expect this will happen again for future releases, so the instructions below accommodate this possibility.

Prerequisites

Set up Docker-CE Repository

sudo dnf config-manager --add-repo https://download.docker.com/linux/fedora/docker-ce.repo
dnf list docker\*

If you don't see the docker-ce* packages in the listing, you need to enable the test repository (CAUTION):

sudo dnf config-manager --set-enabled docker-ce-test

Then install the packages:

sudo dnf install docker-ce docker-ce-cli containerd.io
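
The Fedora steps above (look for docker-ce packages, fall back to the test repository if they are missing) can be sketched as a small script. This is only a sketch: it assumes `dnf list` prints one package per line with the package name first, and the helper name `has_docker_ce` is made up for illustration, not part of dnf or Docker.

```shell
#!/bin/sh
# Hypothetical helper: reads `dnf list` output on stdin and succeeds if any
# line starts with a docker-ce package name (docker-ce, docker-ce-cli, ...).
has_docker_ce() {
    grep -Eq '^docker-ce[^[:space:]]*\.'
}

# Usage sketch (commented out so that sourcing this file changes nothing):
# if ! dnf list 'docker*' 2>/dev/null | has_docker_ce; then
#     # CAUTION: this enables the test repository
#     sudo dnf config-manager --set-enabled docker-ce-test
# fi
# sudo dnf install docker-ce docker-ce-cli containerd.io
```

The pattern deliberately ignores packages such as docker-compose that merely start with "docker".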

CentOS

( originally from https://docs.docker.com/engine/installation/linux/centos/ )

Prerequisites

Set up Docker-CE Repository

sudo yum install -y yum-utils device-mapper-persistent-data lvm2
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo

Install Docker

sudo yum install -y docker-ce docker-ce-cli containerd.io

Debian

( originally from https://docs.docker.com/engine/installation/linux/debian/ )

Prerequisites

Set up Docker-CE Repository

sudo apt-get install -y apt-transport-https ca-certificates curl gnupg2 software-properties-common
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
sudo apt-get update

Install Docker

sudo apt-get install -y docker-ce docker-ce-cli containerd.io

Finishing the Install

Make sure the Docker daemon is running and set to start automatically (the package install should already have done this):

sudo systemctl enable --now docker
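
To confirm that the daemon really is both enabled and running, something like the following could be used. It is a minimal sketch: `unit_ready` and the `SYSTEMCTL` override variable are hypothetical names introduced here (the override exists only so the function can be exercised with a stub); the `is-enabled` and `is-active` subcommands are standard systemctl.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the unit named in $1 is both enabled
# (starts at boot) and active (running now).
# SYSTEMCTL is an assumed override hook for testing, not a systemd feature.
unit_ready() {
    ${SYSTEMCTL:-systemctl} is-enabled --quiet "$1" &&
        ${SYSTEMCTL:-systemctl} is-active --quiet "$1"
}

# Usage sketch:
# if unit_ready docker; then
#     echo "docker is enabled and running"
# else
#     sudo systemctl enable --now docker
# fi
```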

Test installation

sudo docker run --rm hello-world
sudo docker image rm -f hello-world

Allow a standard user to run docker commands (the user must log out and back in for this to take effect):

sudo groupadd -f docker
sudo usermod -aG docker $USER
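
Since the group change only applies to new login sessions, it can help to check whether the current session already has it. The following is a minimal sketch; `in_group_list` is a hypothetical helper name, and it assumes the space-separated format printed by `id -nG`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the group named in $1 appears as a whole
# word in the space-separated group list in $2 (as printed by `id -nG`).
in_group_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage sketch: `id -nG` without a user name reports the groups of the
# current session, which is what matters for docker access.
# if in_group_list docker "$(id -nG)"; then
#     echo "this session can use docker without sudo"
# else
#     echo "log out and back in first"
# fi
```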

To access the local registries, copy their certificates into the local Docker configuration:

sudo scp -r aslan:/etc/docker/certs.d /etc/docker
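
After copying, a quick sanity check is to look for registry directories that ended up without a certificate. This sketch assumes the usual certs.d layout of one directory per registry host, with the CA certificates stored as *.crt files inside it; `registries_without_ca` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints the name of each registry directory under the
# certs.d path given in $1 that contains no *.crt file.
registries_without_ca() {
    for dir in "$1"/*/; do
        [ -d "$dir" ] || continue
        # If the glob matches nothing it stays literal and ls fails.
        if ! ls "$dir"*.crt >/dev/null 2>&1; then
            basename "$dir"
        fi
    done
}

# Usage sketch:
# registries_without_ca /etc/docker/certs.d
```

An empty result means every registry directory has at least one certificate file.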

Copy the .docker directory from aslan for both root and each user to pick up the login credentials:

sudo scp -r aslan:.docker /root
scp -r aslan:.docker ~ 

GPU Nodes

On GPU-enabled nodes, install the nvidia runtime.

(originally from https://github.com/NVIDIA/nvidia-docker)

sudo yum-config-manager --add-repo https://nvidia.github.io/nvidia-docker/centos7/x86_64/nvidia-docker.repo
sudo yum install -y nvidia-docker2
sudo systemctl restart docker 

Test nvidia-smi with the latest official CUDA image

docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi

Make the nvidia runtime default

  • add entry to /etc/docker/daemon.json
  • note that this gets reset when docker updates
  • resulting daemon.json file looks like this:
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
} 

... or copy from aslan

sudo scp aslan:/etc/docker/daemon.json /etc/docker

Restart Docker to apply the change

sudo systemctl restart docker
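
Because daemon.json gets reset when docker updates, it may be convenient to script re-applying it. The following is a minimal sketch, assuming the file should contain exactly the daemon.json shown above and nothing else (any other settings in an existing file would be lost, which is why a backup copy is kept). `write_nvidia_daemon_json` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: writes the nvidia-default daemon.json to the path in $1,
# keeping a .bak copy of any existing file first.
write_nvidia_daemon_json() {
    target=$1
    if [ -f "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi
    cat > "$target" <<'EOF'
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
EOF
}

# Usage sketch (run as root, since /etc/docker is not user-writable):
# write_nvidia_daemon_json /etc/docker/daemon.json && systemctl restart docker
```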

Test the GPU performance using a simple NVIDIA GPU Cloud container with the CUDA nbody sample program

docker run -it --rm nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -benchmark -fp64

Reloading Repository Certificates

Rather frequently, it seems, the NVIDIA folks invalidate the signing keys for their repositories. When that happens, you just need to delete the old keys and let the 'yum' command reload them on the next update:

DIST=$(sed -n 's/releasever=//p' /etc/yum.conf)
DIST=${DIST:-$(. /etc/os-release; echo $VERSION_ID)}
sudo rpm -e gpg-pubkey-f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/nvidia-container-runtime/gpgdir --delete-key f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/libnvidia-container/gpgdir --delete-key f796ecb0
sudo gpg --homedir /var/lib/yum/repos/$(uname -m)/$DIST/nvidia-docker/gpgdir --delete-key f796ecb0
sudo yum -y makecache
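
The three gpg commands differ only in the repository name, so the paths can be generated in a loop. This is a minimal sketch under the assumption that the same three repositories and the same key ID (f796ecb0) are involved; `nvidia_gpgdirs` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: prints the yum gpgdir path of each NVIDIA repository,
# given the release version ($1) and the machine architecture ($2).
nvidia_gpgdirs() {
    for repo in nvidia-container-runtime libnvidia-container nvidia-docker; do
        printf '/var/lib/yum/repos/%s/%s/%s/gpgdir\n' "$2" "$1" "$repo"
    done
}

# Usage sketch (the paths contain no spaces, so word splitting is safe here):
# DIST=$(sed -n 's/releasever=//p' /etc/yum.conf)
# DIST=${DIST:-$(. /etc/os-release; echo $VERSION_ID)}
# sudo rpm -e gpg-pubkey-f796ecb0
# for dir in $(nvidia_gpgdirs "$DIST" "$(uname -m)"); do
#     sudo gpg --homedir "$dir" --delete-key f796ecb0
# done
# sudo yum -y makecache
```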