Kubernetes and GlusterFS

In this blog post I will explain how to install a distributed filesystem for a Kubernetes cluster. To run stateful Docker images (e.g. a database like PostgreSQL) you have two choices:

  • run the service on a dedicated node – this avoids the loss of data if Kubernetes re-schedules your service to another node
  • use a distributed storage solution like Ceph or GlusterFS

Gluster is a scalable network filesystem. It allows you to build a large, distributed storage solution on commodity hardware. You can connect the Gluster storage to Kubernetes to abstract the volumes from your services.

Install

You can install GlusterFS on any node, including the Kubernetes worker nodes. The following guide explains how to install GlusterFS on Debian 10 (Buster). You will find more information about the installation here.

Repeat the following installation on each node you wish to join your gluster network storage. Run the commands as root:

$ su

# Add the GPG key to apt:
$ wget -O - https://download.gluster.org/pub/gluster/glusterfs/7/rsa.pub | apt-key add -

# Add the source (s/amd64/arm64/ as necessary):	
$ echo deb [arch=amd64] https://download.gluster.org/pub/gluster/glusterfs/7/LATEST/Debian/buster/amd64/apt buster main > /etc/apt/sources.list.d/gluster.list

# Install...
$ apt update
$ apt install -y glusterfs-server

Start and enable the GlusterFS service on all servers:

$ sudo systemctl start glusterd
$ sudo systemctl enable glusterd

To check the Gluster status, connect to one of your cluster nodes and run:

$ sudo service glusterd status

Setup Gluster Network

Now, from one of your Gluster nodes, check that you can reach each of the other nodes:

$ gluster peer probe [gluster-node-ip]

where gluster-node-ip is the IP address or DNS name of one of your other Gluster nodes. Now you can check the peer status on each node:

$ gluster peer status
Uuid: vvvv-qqq-zzz-yyyyy-xxxxx
State: Peer in Cluster (Connected)
Other names:
[YOUR-GLUSTER-NODE-NAME]

Setup a Volume

Next you can set up a GlusterFS volume. To do so, create a brick directory on all servers:

$ mkdir -p /data/glusterfs/brick1/gv0

From any single Gluster node, run:

$ gluster volume create gv0 replica 2 [GLUSTER-NODE1]:/data/glusterfs/brick1/gv0 [GLUSTER-NODE2]:/data/glusterfs/brick1/gv0
volume create: gv0: success: please start the volume to access data

Replace [GLUSTER-NODE1] and [GLUSTER-NODE2] with the DNS names or IP addresses of your Gluster nodes.

Note: the brick directory must not be on the root partition. Also, you should provide at least 3 Gluster nodes, because a setup with only 2 replicas is susceptible to split-brain.

Now you can start your new volume ‘gv0’: 

$ sudo gluster volume start gv0
volume start: gv0: success

With the following command you can check the status of the new volume:

$ sudo gluster volume info
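
To verify the volume end to end, you can mount it with the Gluster FUSE client from any machine that has the client installed. A minimal sketch – the node name gluster-node1 and the mount point /mnt/gv0 are placeholders, replace them with your own:

$ mkdir -p /mnt/gv0
$ mount -t glusterfs gluster-node1:/gv0 /mnt/gv0
$ echo "hello gluster" > /mnt/gv0/hello.txt

The test file should now appear in the brick directory (/data/glusterfs/brick1/gv0) on every replica node.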

Find more about the setup [here](https://docs.gluster.org/en/latest/Quick-Start-Guide/Quickstart/).

Kubernetes

Now that your Gluster filesystem is up and running, it’s time to tell Kubernetes about the new storage.

glusterfs-client

First you need to install the glusterfs-client package on each Kubernetes node that may run pods with Gluster volumes. The client is used by the kubelet to mount the Gluster volumes.

$ sudo apt install -y glusterfs-client

Persistent Volume Example

The following is an example of how to create a volume claim for the GlusterFS storage within a pod. First you need to create a persistent volume (PV):

gluster-pv.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  # The name of the PV, which is referenced in pod definitions and shown by kubectl commands.
  name: gluster-pv   
spec:
  capacity:
    # The amount of storage allocated to this volume.
    storage: 1Gi     
  accessModes:
    # Access modes are used to match a PV to a PVC. They do not currently enforce any form of access control.
  - ReadWriteMany    
  # The glusterfs plug-in defines the volume type being used 
  glusterfs:         
    endpoints: gluster-cluster 
    # Gluster volume name, preceded by /
    path: /gv0
    readOnly: false
  # volume reclaim policy indicates that the volume will be preserved after the pods accessing it terminate.
  # Accepted values include Retain, Delete, and Recycle.
  persistentVolumeReclaimPolicy: Retain

The Gluster volume itself must have been created beforehand (see above).
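
The endpoints: gluster-cluster entry in the PV refers to a Kubernetes Endpoints object that lists the addresses of your Gluster nodes, accompanied by a headless Service that keeps it from being removed. It must exist in the namespace of the pods that use the volume. A sketch, where the IP addresses are placeholders for your real Gluster node IPs:

gluster-endpoints.yaml:

apiVersion: v1
kind: Endpoints
metadata:
  # must match the "endpoints" name used in the PV
  name: gluster-cluster
subsets:
- addresses:
  # placeholder: IP of your first gluster node
  - ip: 192.168.1.11
  ports:
  # the port value is required by the API but ignored by the glusterfs plug-in
  - port: 1
- addresses:
  # placeholder: IP of your second gluster node
  - ip: 192.168.1.12
  ports:
  - port: 1
---
apiVersion: v1
kind: Service
metadata:
  name: gluster-cluster
spec:
  ports:
  - port: 1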

Persistent Volume Claim Example

The persistent volume claim (PVC) specifies the desired access mode and storage capacity. Currently, based on only these two attributes, the PVC is bound to the PV created above. Once a PV is bound to a PVC, that PV is essentially tied to the PVC’s namespace and cannot be bound by another PVC. There is a one-to-one mapping of PVs to PVCs. However, multiple pods in the same namespace can use the same PVC.

gluster-pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gluster-claim  
spec:
  accessModes:
  - ReadWriteMany      
  resources:
     requests:
       storage: 1Gi

The claim name is referenced by the pod under its volumes section.

Deployment Example

Within a deployment you can then mount a volume based on this claim. See the following example:

apiVersion: apps/v1
kind: Deployment
.....
spec:
  ....
  template:
    ....
    spec:
      containers:
      .....
        image: postgres:9.6.1
        # run as root because of glusterfs
        securityContext:
          runAsUser: 0
          allowPrivilegeEscalation: false
        volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: dbdata
          readOnly: false
      restartPolicy: Always
      volumes:
      - name: dbdata
        persistentVolumeClaim:
          claimName: gluster-claim
....
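
Assuming you saved the manifests under the file names used above, you can create the objects with kubectl and check that the claim gets bound (the output will of course differ on your cluster):

$ kubectl apply -f gluster-pv.yaml
$ kubectl apply -f gluster-pvc.yaml

# the PVC should reach the status 'Bound' once it has matched the PV
$ kubectl get pv,pvc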

You will also find a more detailed example here.
