Ceph-CSI Provisioner for Kubernetes: Difference between revisions

Latest revision as of 22:34, 11 May 2021

Another way to use ceph-based storage to support a kubernetes cluster is to use the native ceph-csi driver. It is developed and maintained by the ceph team itself, and it is documented as part of the ceph documentation as well as its own github page:

https://docs.ceph.com/en/latest/rbd/rbd-kubernetes/?highlight=ceph-csi#configure-ceph-csi-plugins
https://github.com/ceph/ceph-csi

Installation[edit]

This is definitely a lean deployment, but it is functional -- and since Rook doesn't handle external clusters well, it is the only way to skin this cat if you want/need to maintain an independent ceph cluster -- but still use it for dynamic storage provisioning in Kubernetes.

The documentation is not bad, so I won't repeat it here. The one change that should be made to the manifests is to put the ceph-csi provisioner into its own namespace (I chose ceph-csi). Once I got through the global_id issue discussed below, it just worked ...

Default Storage Class (Optional)[edit]

Not all manifests specify a storage class -- this can be especially problematic with helm charts that don't expose the opportunity to specify a storage class. Kubernetes has the concept of a default storage class that is widely used by the cloud providers to point to their preferred storage solution. While not required, specifying a Rook StorageClass as default can simplify 'automatic' deployments. In reality, all that needs to be done is to set a flag on the storage class that you want to be default ... but there's some prep work ...

First, identify all the defined storage classes:

kubectl get sc

All the storage classes will be shown -- if one of them is already defined as a 'default' class, it will have (default) after the name. If there is a default class identified (and it isn't yours) you need to turn off the default status for that class (i.e. setting a new default does NOT reset the old default):

kubectl patch storageclass <old-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'

Then you can set your storageclass as the default:

kubectl patch storageclass <new-default> -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Validate that it took by looking at the storageclasses again:

kubectl get sc

Your class should now be marked '(default)'.

(excerpted from https://kubernetes.io/docs/tasks/administer-cluster/change-default-storage-class/)

CVE-2021-20288[edit]

A recent CVE on Ceph was published -- it affects Nautilus, Octopus, and Pacific:

https://docs.ceph.com/en/latest/security/CVE-2021-20288/

I'm not sure about the risk in a closed environment, but since the alerts in the newest versions of ceph are annoying if you don't set the parameters to prevent it, I did so when I upgraded. The problem is, ceph-csi uses the insecure way of reclaiming global_id that the parameter changes block. So ... after 2 days of playing with it (and failing) on two different kubernetes clusters, I finally found enough in the logfiles to lead me to conclude that ceph-csi was doing it wrong. I un-did the config changes to prevent it ... and it finally worked. Not good, but at least I'm not crazy.

I put in an issue on it ... let's see what they say.

Update: it has been fixed (previously -- I just didn't find it when searching). using v3.3.1 of their containers takes care of it. I will say that their documentation hadn't been updated yet to reflect that the newer version had been fixed ... and since they're using a quay registry, it's not possible to see what versions of their containers are available :-(

Repo[edit]

All the files I used for this are in the k8s-admin repo in gitlab ...