
Add CSI VolumeSnapshot support with snapshot-controller

Having deployed the validation webhook to make sure our snapshots are done "right", we can now deploy snapshot-controller to actually manage the snapshots we take.

Snapshot Controller requirements

Ingredients

Already deployed:

Preparation

Snapshot Controller Namespace

We need a namespace to deploy our HelmRelease and associated YAMLs into. Per the flux design, I create this example yaml in my flux repo at /bootstrap/namespaces/namespace-snapshot-controller.yaml:

/bootstrap/namespaces/namespace-snapshot-controller.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: snapshot-controller
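
Snapshot Controller HelmRepository

The next section refers to the "global" elements of this deployment, which in this case is just a HelmRepository pointing at the source of the snapshot-controller chart. The recipe doesn't show it here, but a minimal sketch might look like the following; the file path and chart URL are my assumptions, based on the piraeus-charts sourceRef used in the HelmRelease further down, so adjust the path to suit your repo layout and verify the URL against the upstream chart docs:

/bootstrap/helmrepositories/helmrepository-piraeus.yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: piraeus-charts
  namespace: flux-system
spec:
  interval: 15m
  url: https://piraeus.io/helm-charts/ # assumed location of the snapshot-controller chart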

Snapshot Controller Kustomization

Now that the "global" elements of this deployment (just the HelmRepository in this case) have been defined, we do some "flux-ception", and go one layer deeper, adding another Kustomization, telling flux to deploy any YAMLs found in the repo at /snapshot-controller/. I create this example Kustomization in my flux repo:

/bootstrap/kustomizations/kustomization-snapshot-controller.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: snapshot-controller
  namespace: flux-system
spec:
  interval: 30m
  path: ./snapshot-controller
  prune: true # remove any elements later removed from the above path
  timeout: 10m # if not set, this defaults to interval duration, which is 1h
  sourceRef:
    kind: GitRepository
    name: flux-system
  healthChecks:
    - apiVersion: helm.toolkit.fluxcd.io/v2beta1
      kind: HelmRelease
      name: snapshot-controller
      namespace: snapshot-controller

Fast-track your fluxing! 🚀

Is crafting all these YAMLs by hand too much of a PITA?

I automatically and instantly share (with my sponsors) a private "premix" git repository, which includes an ansible playbook to auto-create all the necessary files in your flux repository, for each chosen recipe!

Let the machines do the TOIL! 🏋️‍♂️

Snapshot Controller HelmRelease

Lastly, having set the scene above, we define the HelmRelease which will actually deploy snapshot-controller into the cluster. We start with a basic HelmRelease YAML, like this example:

/snapshot-controller/helmrelease-snapshot-controller.yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: snapshot-controller
  namespace: snapshot-controller
spec:
  chart:
    spec:
      chart: snapshot-controller
      version: 1.8.x # auto-update to semver bugfixes only (1)
      sourceRef:
        kind: HelmRepository
        name: piraeus-charts
        namespace: flux-system
  interval: 15m
  timeout: 5m
  releaseName: snapshot-controller
  values: # paste contents of upstream values.yaml below, indented 4 spaces (2)

  1. I like to set this to the semver minor version of the current Snapshot Controller helm chart, so that I'll inherit bug fixes but not any new features (since I'll need to manually update my values to accommodate new releases anyway).
  2. Paste the full contents of the upstream values.yaml here, indented 4 spaces under the values: key.

If we deploy this HelmRelease as-is, we'll inherit every default from the upstream Snapshot Controller helm chart. That's rarely what we want, so my preference is to take the entire contents of the chart's values.yaml and paste it (indented) under the values key. This means I can make my own changes in the context of the entire values.yaml, rather than cherry-picking just the items I want to change, which makes future chart upgrades simpler.

Why not put values in a separate ConfigMap?

Didn't you previously advise to put helm chart values into a separate ConfigMap?

Yes, I did. And in practice, I've changed my mind.

Why? Because having the helm values directly in the HelmRelease offers the following advantages:

  1. If you use the YAML extension in VSCode, you'll see a full path to the YAML elements, which can make grokking complex charts easier.
  2. When flux detects a change to a value in a HelmRelease, this forces an immediate reconciliation of the HelmRelease, as opposed to the ConfigMap solution, which requires waiting on the next scheduled reconciliation.
  3. Renovate can parse HelmRelease YAMLs and create PRs when they contain docker image references which can be updated.
  4. In practice, adapting a HelmRelease to match upstream chart changes is no different to adapting a ConfigMap, and so there's no real benefit to splitting the chart values into a separate ConfigMap, IMO.
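
For contrast, here's a rough sketch of the ConfigMap approach being discussed. This is not how this recipe deploys the chart, and the ConfigMap name is purely illustrative:

# Illustrative only: chart values kept in a separate ConfigMap...
apiVersion: v1
kind: ConfigMap
metadata:
  name: snapshot-controller-helm-chart-value-overrides # illustrative name
  namespace: snapshot-controller
data:
  values.yaml: |
    # paste the upstream values.yaml contents here
---
# ...and referenced from the HelmRelease via valuesFrom, instead of an inline values: key
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: snapshot-controller
  namespace: snapshot-controller
spec:
  chart:
    spec:
      chart: snapshot-controller
      version: 1.8.x
      sourceRef:
        kind: HelmRepository
        name: piraeus-charts
        namespace: flux-system
  interval: 15m
  releaseName: snapshot-controller
  valuesFrom:
    - kind: ConfigMap
      name: snapshot-controller-helm-chart-value-overrides
      valuesKey: values.yaml # the key within the ConfigMap which holds the chart values

Note that with this layout, a change to the ConfigMap won't trigger an immediate reconciliation of the HelmRelease; you wait for the next scheduled interval, which is point 2 above.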

Then work your way through the values you pasted, and change any which are specific to your configuration.

Configure for rook-ceph

Under the HelmRelease values which you pasted from upstream, you'll note a section for volumeSnapshotClasses. By default, this is populated with commented out examples. To configure snapshot-controller to work with rook-ceph, replace these commented values as illustrated below:

/snapshot-controller/helmrelease-snapshot-controller.yaml (continued)
  values:
    # extra content from upstream
    volumeSnapshotClasses:
    - name: csi-rbdplugin-snapclass
      driver: rook-ceph.rbd.csi.ceph.com # driver:namespace:operator
      labels:
        velero.io/csi-volumesnapshot-class: "true"
      parameters:
        # Specify a string that identifies your cluster. Ceph CSI supports any
        # unique string. When Ceph CSI is deployed by Rook use the Rook namespace,
        # for example "rook-ceph".
        clusterID: rook-ceph # namespace:cluster
        csi.storage.k8s.io/snapshotter-secret-name: rook-csi-rbd-provisioner
        csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph # namespace:cluster
      deletionPolicy: Delete # docs suggest this may need to be set to "Retain" for restoring

Install Snapshot Controller!

Commit the changes to your flux repository, and either wait for the reconciliation interval, or force a reconciliation using flux reconcile source git flux-system. You should see the kustomization appear...

~  flux get kustomizations snapshot-controller
NAME        READY   MESSAGE                         REVISION        SUSPENDED
snapshot-controller True    Applied revision: main/70da637  main/70da637    False
~ 

The helmrelease should be reconciled...

~  flux get helmreleases -n snapshot-controller snapshot-controller
NAME        READY   MESSAGE                             REVISION    SUSPENDED
snapshot-controller True    Release reconciliation succeeded    v1.8.x      False
~ 

And you should have happy pods in the snapshot-controller namespace:

~  k get pods -n snapshot-controller -l app.kubernetes.io/name=snapshot-controller
NAME                                  READY   STATUS    RESTARTS   AGE
snapshot-controller-7c94b7446d-nwsss   1/1     Running   0          5m14s
~ 
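
Optionally, before handing snapshots over to Velero, you can smoke-test the controller by creating a VolumeSnapshot by hand. This isn't part of the recipe proper; the PVC name and namespace below are placeholders for any existing RBD-backed PVC in your cluster:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: test-snapshot
  namespace: default # must match the namespace of the PVC being snapshotted
spec:
  volumeSnapshotClassName: csi-rbdplugin-snapclass # the class we defined above
  source:
    persistentVolumeClaimName: my-pvc # placeholder: an existing RBD-backed PVC

Apply it with kubectl apply -f, watch kubectl get volumesnapshot -n default until it reports ready-to-use, and then delete it again.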

Summary

What have we achieved? We've got snapshot-controller running, and ready to manage VolumeSnapshots on behalf of Velero, for handy in-cluster volume backups!

Created:

  • snapshot-controller running and ready to snap 📷 !

Next:

  • Configure Velero with a VolumeSnapshotLocation, so that volume snapshots can be made as part of a BackupSchedule!

Chef's notes 📓


Tip your waiter (sponsor) 👏

Did you receive excellent service? Want to compliment the chef? (..and support development of current and future recipes!) Sponsor me on Github / Ko-Fi / Patreon, or see the contribute page for more (free or paid) ways to say thank you! 👏

Employ your chef (engage) 🤝

Is this too much of a geeky PITA? Do you just want results, stat? I do this for a living - I'm a full-time Kubernetes contractor, providing consulting and engineering expertise to businesses needing short-term, short-notice support in the cloud-native space, including AWS/Azure/GKE, Kubernetes, CI/CD and automation.

Learn more about working with me here.

Flirt with waiter (subscribe) 💌

Want to know now when this recipe gets updated, or when future recipes are added? Subscribe to the RSS feed, or leave your email address below, and we'll keep you updated.

Your comments? 💬