Ceph-CSI Provisioner for Kubernetes: Difference between revisions

m (DrEdWilliams moved page Ceph-RBD Provisioner for Kubernetes to Ceph-CSI Provisioner for Kubernetes without leaving a redirect)
(total rewrite to use ceph-csi instead of ceph-rbd)
Excerpted from https://computingforgeeks.com/persistent-storage-for-kubernetes-with-ceph-rbd/


== Step 1: Deploy Ceph Provisioner on Kubernetes ==
Log in to your Kubernetes cluster and create a manifest file for deploying the RBD provisioner, which is an out-of-tree dynamic provisioner for Kubernetes 1.5+.


$ vim ceph-rbd-provisioner.yaml


Add the following contents to the file. Notice that our deployment uses RBAC, so we'll create a cluster role and bindings before creating the service account and deploying the Ceph RBD provisioner.
<pre>---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-provisioner
  namespace: kube-system
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services"]
    resourceNames: ["kube-dns","coredns"]
    verbs: ["list", "get"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]


---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-provisioner
  namespace: kube-system
subjects:
  - kind: ServiceAccount
    name: rbd-provisioner
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: rbd-provisioner
  apiGroup: rbac.authorization.k8s.io


---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: rbd-provisioner
  namespace: kube-system
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]


---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: rbd-provisioner
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rbd-provisioner
subjects:
- kind: ServiceAccount
  name: rbd-provisioner
  namespace: kube-system
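

---
# The Deployment below runs as the rbd-provisioner ServiceAccount and the
# bindings above grant it access, but the original manifest never defines the
# ServiceAccount itself -- added here so the deployment can actually start.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rbd-provisioner
  namespace: kube-system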


---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rbd-provisioner
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: rbd-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: rbd-provisioner
    spec:
      containers:
      - name: rbd-provisioner
        image: "quay.io/external_storage/rbd-provisioner:latest"
        env:
        - name: PROVISIONER_NAME
          value: ceph.com/rbd
      serviceAccount: rbd-provisioner</pre>


Apply the file to create the resources.
 
$ kubectl apply -f ceph-rbd-provisioner.yaml
clusterrole.rbac.authorization.k8s.io/rbd-provisioner created
clusterrolebinding.rbac.authorization.k8s.io/rbd-provisioner created
role.rbac.authorization.k8s.io/rbd-provisioner created
rolebinding.rbac.authorization.k8s.io/rbd-provisioner created
serviceaccount/rbd-provisioner created
deployment.apps/rbd-provisioner created
 
Confirm that the RBD volume provisioner pod is running.
 
$ kubectl get pods -l app=rbd-provisioner -n kube-system
NAME                              READY  STATUS    RESTARTS  AGE
rbd-provisioner-75b85f85bd-p9b8c  1/1    Running  0          3m45s
 
== Step 2: Get Ceph Admin Key and create Secret on Kubernetes ==
Log in to your Ceph cluster and get the admin key for use by the RBD provisioner.
 
$ sudo ceph auth get-key client.admin
 
Save the value of the admin user key printed by the command above. We'll add the key as a secret in Kubernetes.
 
$ kubectl create secret generic ceph-admin-secret \
    --type="kubernetes.io/rbd" \
    --from-literal=key='<key-value>' \
    --namespace=kube-system
 
Where <key-value> is your Ceph admin key. You can confirm creation with the command below.
 
$ kubectl get secrets ceph-admin-secret -n kube-system
NAME                TYPE                DATA  AGE
ceph-admin-secret  kubernetes.io/rbd  1      5m
 
== Step 3: Create Ceph pool for Kubernetes & client key ==
Next, create a new Ceph pool for Kubernetes.
 
$ sudo ceph osd pool create <pool-name> <pg-number>
 
Example:
 
$ sudo ceph osd pool create k8s 100
 
Then create a new client key with access to the pool created.
 
$ sudo ceph auth add client.kube mon 'allow r' osd 'allow rwx pool=<pool-name>'
 
Example:
$ sudo ceph auth add client.kube mon 'allow r' osd 'allow rwx pool=k8s'
 
Where k8s is the name of the pool created in Ceph.
 
You can then associate the pool with an application and initialize it.
 
sudo ceph osd pool application enable <pool-name> rbd
sudo rbd pool init <pool-name>
 
Get the client key on Ceph.
 
$ sudo ceph auth get-key client.kube
 
Create the client secret on Kubernetes:
 
kubectl create secret generic ceph-k8s-secret \
  --type="kubernetes.io/rbd" \
  --from-literal=key='<key-value>' \
  --namespace=kube-system
 
Where <key-value> is your Ceph client key.
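 
As with the admin secret, you can confirm that it was created, for example:
 
$ kubectl get secrets ceph-k8s-secret -n kube-system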
 
== Step 4: Create a RBD Storage Class ==
A StorageClass provides a way for you to describe the “classes” of storage you offer in Kubernetes. We’ll create a StorageClass called ceph-rbd.
 
$ vim ceph-rbd-sc.yml
 
The contents to be added to the file:
 
<pre>---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: ceph-rbd
provisioner: ceph.com/rbd
parameters:
  monitors: 10.10.10.11:6789, 10.10.10.12:6789, 10.10.10.13:6789
  pool: k8s-uat
  adminId: admin
  adminSecretNamespace: kube-system
  adminSecretName: ceph-admin-secret
  userId: kube
  userSecretNamespace: kube-system
  userSecretName: ceph-k8s-secret
  imageFormat: "2"
  imageFeatures: layering</pre>
 
Where:
 
* ceph-rbd is the name of the StorageClass to be created.
* 10.10.10.11, 10.10.10.12 & 10.10.10.13 are the IP addresses of the Ceph monitors.
* k8s-uat is the Ceph pool to use; set this to the pool you created in Step 3 (k8s in our example).
 
You can list them with the command:
<pre>$ sudo ceph -s
  cluster:
    id:    7795990b-7c8c-43f4-b648-d284ef2a0aba
    health: HEALTH_OK
  services:
    mon: 3 daemons, quorum cephmon01,cephmon02,cephmon03 (age 32h)
    mgr: cephmon01(active, since 30h), standbys: cephmon02
    mds: cephfs:1 {0=cephmon01=up:active} 1 up:standby
    osd: 9 osds: 9 up (since 32h), 9 in (since 32h)
    rgw: 3 daemons active (cephmon01, cephmon02, cephmon03)
  data:
    pools:  8 pools, 618 pgs
    objects: 250 objects, 76 KiB
    usage:  9.6 GiB used, 2.6 TiB / 2.6 TiB avail
    pgs:    618 active+clean</pre>
 
After modifying the file with the correct values for your Ceph monitors, apply the config:
 
$ kubectl apply -f ceph-rbd-sc.yml
storageclass.storage.k8s.io/ceph-rbd created
 
List available StorageClasses:
 
kubectl get sc
NAME      PROVISIONER      RECLAIMPOLICY  VOLUMEBINDINGMODE  ALLOWVOLUMEEXPANSION  AGE
ceph-rbd  ceph.com/rbd      Delete          Immediate          false                  17s
cephfs    ceph.com/cephfs  Delete          Immediate          false                  18d
 
== Step 5: Create a test Claim and Pod on Kubernetes ==
To confirm everything is working, let’s create a test persistent volume claim.
 
$ vim ceph-rbd-claim.yml
 
<pre>kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: ceph-rbd-claim1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-rbd
  resources:
    requests:
      storage: 1Gi</pre>
 
Apply the manifest file to create the claim.
 
$ kubectl apply -f ceph-rbd-claim.yml
persistentvolumeclaim/ceph-rbd-claim1 created
 
If binding was successful, the claim will show a Bound status.
 
$ kubectl get pvc
NAME              STATUS  VOLUME                                    CAPACITY  ACCESS MODES  STORAGECLASS  AGE
ceph-rbd-claim1  Bound    pvc-c6f4399d-43cf-4fc1-ba14-cc22f5c85304  1Gi        RWO            ceph-rbd      43s
 
Nice! We are able to create dynamic Persistent Volume Claims on the Ceph RBD backend. Notice we didn’t have to manually create a Persistent Volume before a Claim. How cool is that?
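 
Under the hood, the provisioner created a matching Persistent Volume (and a backing RBD image in the pool) for us. You can see the dynamically created PV with, for example:
 
$ kubectl get pv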
 
We can then deploy a test pod using the claim we created. First create a file for the pod manifest:
 
$ vim rbd-test-pod.yaml
 
Add:
 
<pre>---
kind: Pod
apiVersion: v1
metadata:
  name: rbd-test-pod
spec:
  containers:
  - name: rbd-test-pod
    image: busybox
    command:
      - "/bin/sh"
    args:
      - "-c"
      - "touch /mnt/RBD-SUCCESS && exit 0 || exit 1"
    volumeMounts:
      - name: pvc
        mountPath: "/mnt"
  restartPolicy: "Never"
  volumes:
    - name: pvc
      persistentVolumeClaim:
        claimName: ceph-rbd-claim1 </pre>
 
Create the pod:
 
$ kubectl apply -f rbd-test-pod.yaml
pod/rbd-test-pod created
 
If you describe the Pod, you’ll see successful attachment of the Volume.
 
$ kubectl describe pod rbd-test-pod
.....
Events:
  Type    Reason                  Age        From                    Message
  ----    ------                  ----      ----                    -------
  Normal  Scheduled              <unknown>  default-scheduler        Successfully assigned default/rbd-test-pod to rke-worker-02
  Normal  SuccessfulAttachVolume  3s        attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-c6f4399d-43cf-4fc1-ba14-cc22f5c85304"
 
If you have Ceph Dashboard, you can see a new block image created.
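 
If you prefer the command line, listing the RBD images in the pool will show it too, for example (with <pool-name> being the pool named in your StorageClass):
 
$ sudo rbd ls -p <pool-name>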

Revision as of 01:52, 9 May 2021

Another way to use ceph-based storage to support a kubernetes cluster is to use the native ceph-csi driver. It is developed and maintained by the ceph team itself, and it is documented as part of the ceph documentation as well as on its own github page:

https://docs.ceph.com/en/latest/rbd/rbd-kubernetes/?highlight=ceph-csi#configure-ceph-csi-plugins
https://github.com/ceph/ceph-csi

This is definitely a lean deployment, but it is functional -- and since Rook doesn't handle external clusters well, it is the only way to skin this cat if you want/need to maintain an independent ceph cluster -- but still use it for dynamic storage provisioning in Kubernetes.

The documentation is not bad, so I won't repeat it here. The one change that should be made to the manifests is to put the ceph-csi provisioner into its own namespace (I chose ceph-csi). Once I got through the global_id issue discussed below, it just worked ...
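
For reference, a minimal sketch of how that namespace change can be done. The file names are assumptions based on the ceph-csi repo's deploy/rbd/kubernetes directory at the time, so check the repo for the current list before relying on them:

<pre># create the dedicated namespace first
kubectl create namespace ceph-csi

# the upstream manifests pin a namespace, so rewrite it before applying
# (file names and the "default" namespace are assumptions -- verify against
#  the repo and adjust the sed pattern to whatever the manifests actually use)
for f in csi-config-map.yaml csi-provisioner-rbac.yaml csi-nodeplugin-rbac.yaml \
         csi-rbdplugin-provisioner.yaml csi-rbdplugin.yaml; do
  sed 's/namespace: default/namespace: ceph-csi/g' "$f" | kubectl apply -f -
done</pre>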

A recent CVE on Ceph was published -- it affects Nautilus, Octopus, and Pacific:

https://docs.ceph.com/en/latest/security/CVE-2021-20288/

I'm not sure about the risk in a closed environment, but since the alerts in the newest versions of ceph are annoying if you don't set the parameters to prevent insecure global_id reclaim, I did so when I upgraded. The problem is that ceph-csi uses exactly the insecure way of reclaiming global_id that those parameters block. So ... after two days of playing with it (and failing) on two different kubernetes clusters, I finally found enough in the logfiles to conclude that ceph-csi was doing it wrong. I undid the config changes ... and it finally worked. Not good, but at least I'm not crazy.
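
For the record, these are the monitor settings involved. The first is what the post-upgrade alerts push you to turn off and what ceph-csi (at the version I was running) still needs; the second just quiets the health warning once the first is re-enabled. Treat this as a sketch of where I ended up, not a recommendation:

<pre># allow the old (insecure) global_id reclaim so the ceph-csi clients can authenticate
ceph config set mon auth_allow_insecure_global_id_reclaim true

# stop the cluster from raising HEALTH_WARN about it being allowed
ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false</pre>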

I put in an [https://github.com/ceph/ceph-csi/issues/2063 issue] on it ... let's see what they say.

All the files I used for this are in the k8s-admin repo in gitlab ...