Ceph Pacific running on Debian 11 (Bullseye)

In this tutorial I will explain how to setup a Ceph Cluster on Debian 11. The Linux Distribution is not as relevant as it sounds but for the latest Ceph release Pacific I am using here also the latest Debian release Bullseye.

In difference to my last tutorial how to setup Ceph I will focus a little bit more on network. Understanding and configuring the Ceph network options will ensure optimal performance and reliability of the overall storage cluster. See also the latest configuration guide from Red Hat.

Continue reading “Ceph Pacific running on Debian 11 (Bullseye)”

Kubernetes, Ceph and Static Volumes

Ceph is an open source distributed storage system which integrates with the concept of Kubernetes in a perfect way. With the Ceph CSI-Plugin you can connect a Ceph cluster into your Kubernetes cluster in a well designed way. In one of my last posts I give a short tutorial how to setup a Ceph cluster on Debian. Also take a look at the Imixs-Cloud project.

Static Persistence Volumes

When we talk about Kuberentes and Persistence Volumes often you will find examples working with a so called storage class and Dynamic Persistence Volumes. In this concept a persistence volume will be provisioned automatically by the Kubernetes CSI adapter and you do not need to think much about how this works. But this kind of persistence volumes are not durable which means, that if you delete your POD also the persistence volume will be removed and all the data you container wrote so far will be lost. To avoid this, you need a so called Static Persistence Volume. Such a persistence volume is marked with the flag ‘Retain’:

persistentVolumeReclaimPolicy: Retain

This means the volume will not be deleted when the POD is removed or updated.

To setup a Static Persistence Volume in Ceph, two steps are necessary. Fist you need to create the ceph image on you ceph cluster. This can be done form the ceph web admin interface or from the command line tool:

# rbd create test-image --size=1024 --pool=kubernetes --image-feature layering

Next you can define the corresponding Kubernetes Persistence Volume Object referring to this RBD image:

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rbd-static-pv
spec:
  volumeMode: Filesystem
  storageClassName: ceph
  persistentVolumeReclaimPolicy: Retain
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  csi:
    driver: rbd.csi.ceph.com
    fsType: ext4
    nodeStageSecretRef:
      name: csi-rbd-secret
      namespace: ceph-system
    volumeAttributes:
      clusterID: "<clusterID>"
      pool: "kubernetes"
      staticVolume: "true"
      # The imageFeatures must match the created ceph image exactly!
      imageFeatures: "layering"
    volumeHandle: test-image 

Replace <clusterID> with the id of you ceph cluster. Note: also a storage class is needed here to identify the ceph nodes. Find more details here.

Resizing Static Persistence Volumes

So are everything is working fine using Ceph for static persistence volumes. But it becomes a little bit tricky if you need to resize an image. Imagine you are running a database and the calculated storage you need exceeds the size you planed in the beginning.

In this case you first need to resize the ceph image. This can be done easily form the Ceph web admin interface or from the command line tool.

# rbd resize --image foo --size 2048

But the problem is, that after you delete and redeploy your POD in Kubernetes it will still see the old disk size. This happens because the Ceph CSI Plugin did not support automatically resizing of static volumes.

If you are using the fsType ext4 (as in my example) you can run the resize2fs command from within your POD to give your container the correct new size:

# resize2fs /dev/rbd[number]

You need to replace [number] with the correct rbd image mounted within your POD. You can check the rdb number with the command df -h.

Note: The command will only work if the resize2fs lib is installed on your container (which is for example the case for the official PostgreSQL image). Also it is important for this command that your POD runs with the securityContext privileged=true :

....         
          volumeMounts:
            - name: volume-to-resize
              mountPath: /var/lib/data
          securityContext:
            privileged: true

Using a Kubernetes Job

As an alternative to executing the resize2fs command manually you can also start a simple Kubernetes job to resize your RBD images automatically.

---
###################################################
# This job can be used to resize a ext4 filesystem
# aligned to the given size of the underlying RBD image.
###################################################
apiVersion: batch/v1
kind: Job
metadata:
  name: ext4-resize2fs
spec:
  template:
    spec:
      containers:
        - name: debian
          image: debian

          command: ["/bin/sh"]
          args:
            - -c
            - >-
                echo '******** start resizeing block device  ********' &&
                echo ...find rbd mounts to be resized.... &&
                df | grep /rbd &&
                DEVICE=`df | grep /rbd | awk '{print $1}'` &&
                echo ...resizing device $DEVICE ... &&
                resize2fs $DEVICE &&
                echo '******** resize block device completed ********'

          volumeMounts:
            - name: volume-to-resize
              mountPath: /tmp/mount2resize
          securityContext:
            privileged: true
      volumes:
        - name: volume-to-resize
          persistentVolumeClaim:
            claimName: test-pg-dbdata
      restartPolicy: Never
  backoffLimit: 1

Make sure that the PV and PVC objects exist before you run the job. Replace the PVC with the name of your PVC to be resized.

$ kubectl apply -f resize2fs.yaml

If you have any comments please post them here.

Ceph Warning that Won’t Resolve

If ceph is having a temporarily problem – e.g. a node goes down – it may happen, that you see constanctly a waring in the Web UI or also if you run

$ ceph status

In case the message is

.. daemons have recently crashed

but your ceph is up an running again and you can not see any more concerning messages you can remove the messages the force this kind of status. To do this you can run the following form your ceph console:

ceph crash ls
# lists all crash message

ceph crash archive-all
# moves the messages into the archive

This will bring back the health status to HEALTH_OK.