Add CSI VolumeSnapshot support with snapshot-controller
Before we deploy snapshot-controller to actually manage the snapshots we take, we need the snapshot-validation-webhook in place, to make sure snapshots are created "right".
Snapshot Controller requirements
Ingredients
Already deployed:
- A Kubernetes cluster
- Flux deployment process bootstrapped
- snapshot-validation-webhook deployed
Preparation
Snapshot Controller Namespace
We need a namespace to deploy our HelmRelease and associated YAMLs into. Per the flux design, I create this example YAML in my flux repo at `/bootstrap/namespaces/namespace-snapshot-controller.yaml`:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: snapshot-controller
```
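The Kustomization and HelmRelease below also assume a "global" HelmRepository source named `piraeus-charts` already exists in the `flux-system` namespace. If yours isn't defined yet, it would look something like this sketch (the chart repository URL is my assumption; confirm it against the upstream piraeus-charts docs):

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: piraeus-charts
  namespace: flux-system
spec:
  interval: 15m
  url: https://piraeus.io/helm-charts/ # assumed URL - verify upstream
```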
Snapshot Controller Kustomization
Now that the "global" elements of this deployment (just the HelmRepository in this case) have been defined, we do some "flux-ception", and go one layer deeper, adding another Kustomization, telling flux to deploy any YAMLs found in the repo at `/snapshot-controller/`. I create this example Kustomization in my flux repo:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: snapshot-controller
  namespace: flux-system
spec:
  interval: 30m
  path: ./snapshot-controller
  prune: true # remove any elements later removed from the above path
  timeout: 10m # if not set, this defaults to the interval duration (30m above)
  sourceRef:
    kind: GitRepository
    name: flux-system
  healthChecks:
    - apiVersion: helm.toolkit.fluxcd.io/v2beta1
      kind: HelmRelease
      name: snapshot-controller
      namespace: snapshot-controller
```
Fast-track your fluxing! 🚀
Is crafting all these YAMLs by hand too much of a PITA?
"Premix" is a git repository, which includes an ansible playbook to auto-create all the necessary files in your flux repository, for each chosen recipe!
Let the machines do the TOIL!
Snapshot Controller HelmRelease
Lastly, having set the scene above, we define the HelmRelease which will actually deploy snapshot-controller into the cluster. We start with a basic HelmRelease YAML, like this example:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: snapshot-controller
  namespace: snapshot-controller
spec:
  chart:
    spec:
      chart: snapshot-controller
      version: 1.8.x # auto-update to semver bugfixes only (1)
      sourceRef:
        kind: HelmRepository
        name: piraeus-charts
        namespace: flux-system
  interval: 15m
  timeout: 5m
  releaseName: snapshot-controller
  values: # paste contents of upstream values.yaml below, indented 4 spaces (2)
```

1. I like to set this to the semver minor version of the current snapshot-controller helm chart, so that I'll inherit bug fixes but not any new features (since I'll need to manually update my values to accommodate new releases anyway).
2. Paste the full contents of the upstream values.yaml here, indented 4 spaces under the `values:` key.
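Flux resolves the `1.8.x` constraint against the available chart versions using a full semver library. As a toy illustration of the behaviour (this is *not* flux's actual resolver, just a sketch of the idea), an "x"-style constraint matches any value in that position, and the highest matching version wins:

```python
def matches_constraint(version: str, constraint: str) -> bool:
    """Toy check: does `version` satisfy an 'x'-style constraint like '1.8.x'?"""
    c_parts = constraint.split(".")
    v_parts = version.split(".")
    # each constraint component must be 'x' (wildcard) or an exact match
    return all(c == "x" or c == v for c, v in zip(c_parts, v_parts))

def pick_latest(versions, constraint):
    """Pick the highest available version matching the constraint."""
    candidates = [v for v in versions if matches_constraint(v, constraint)]
    return max(candidates, key=lambda v: [int(p) for p in v.split(".")], default=None)

# A '1.8.x' constraint picks up new bugfix releases, but never 1.9.0:
print(pick_latest(["1.8.0", "1.8.1", "1.8.2", "1.9.0"], "1.8.x"))  # → 1.8.2
```

So as the chart publisher tags 1.8.3, 1.8.4, etc., flux upgrades automatically, but a 1.9.0 release waits for you to bump the constraint (and review your values) yourself.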
If we deploy this HelmRelease as-is, we'll inherit every default from the upstream snapshot-controller helm chart. That's rarely what we want, so my preference is to take the entire contents of the chart's values.yaml, and paste it (indented) under the `values:` key. This means I can then make my own changes in the context of the entire values.yaml, rather than cherry-picking just the items I want to change, which makes future chart upgrades simpler.
Why not put values in a separate ConfigMap?
Didn't you previously advise to put helm chart values into a separate ConfigMap?
Yes, I did. And in practice, I've changed my mind.
Why? Because having the helm values directly in the HelmRelease offers the following advantages:
- If you use the YAML extension in VSCode, you'll see a full path to the YAML elements, which can make grokking complex charts easier.
- When flux detects a change to a value in a HelmRelease, this forces an immediate reconciliation of the HelmRelease, as opposed to the ConfigMap solution, which requires waiting on the next scheduled reconciliation.
- Renovate can parse HelmRelease YAMLs and create PRs when they contain docker image references which can be updated.
- In practice, adapting a HelmRelease to match upstream chart changes is no different to adapting a ConfigMap, and so there's no real benefit to splitting the chart values into a separate ConfigMap, IMO.
Then work your way through the values you pasted, and change any which are specific to your configuration.
Configure for rook-ceph
Under the HelmRelease values which you pasted from upstream, you'll note a section for `volumeSnapshotClasses`. By default, this is populated with commented-out examples. To configure snapshot-controller to work with rook-ceph, replace these commented values as illustrated below:
```yaml
values:
  # extra content from upstream
  volumeSnapshotClasses:
    - name: csi-rbdplugin-snapclass
      driver: rook-ceph.rbd.csi.ceph.com # driver:namespace:operator
      labels:
        velero.io/csi-volumesnapshot-class: "true"
      parameters:
        # Specify a string that identifies your cluster. Ceph CSI supports any
        # unique string. When Ceph CSI is deployed by Rook use the Rook namespace,
        # for example "rook-ceph".
        clusterID: rook-ceph # namespace:cluster
        csi.storage.k8s.io/snapshotter-secret-name: rook-csi-rbd-provisioner
        csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph # namespace:cluster
      deletionPolicy: Delete # docs suggest this may need to be set to "Retain" for restoring
```
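Once everything is deployed, requesting a snapshot is just a matter of creating a `VolumeSnapshot` resource referencing this class. A minimal sketch (the snapshot name, namespace, and PVC name `my-pvc` are hypothetical; point it at one of your own rook-ceph-backed PVCs):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-pvc-snapshot # hypothetical name
  namespace: default    # must be the namespace of the PVC being snapshotted
spec:
  volumeSnapshotClassName: csi-rbdplugin-snapclass # the class defined above
  source:
    persistentVolumeClaimName: my-pvc # hypothetical existing PVC
```

You can then watch for `READYTOUSE` to become true with `kubectl get volumesnapshot -n default`. (Velero will create resources like this for you automatically; this is just to prove the plumbing works.)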
Install Snapshot Controller!
Commit the changes to your flux repository, and either wait for the reconciliation interval, or force a reconciliation using `flux reconcile source git flux-system`. You should see the kustomization appear...
```bash
~ ❯ flux get kustomizations snapshot-controller
NAME                    READY   MESSAGE                          REVISION        SUSPENDED
snapshot-controller     True    Applied revision: main/70da637   main/70da637    False
~ ❯
```
The helmrelease should be reconciled...
```bash
~ ❯ flux get helmreleases -n snapshot-controller snapshot-controller
NAME                    READY   MESSAGE                            REVISION   SUSPENDED
snapshot-controller     True    Release reconciliation succeeded   v1.8.x     False
~ ❯
```
And you should have happy pods in the snapshot-controller namespace:
```bash
~ ❯ k get pods -n snapshot-controller -l app.kubernetes.io/name=snapshot-controller
NAME                                   READY   STATUS    RESTARTS   AGE
snapshot-controller-7c94b7446d-nwsss   1/1     Running   0          5m14s
~ ❯
```
Summary
What have we achieved? We've got snapshot-controller running, and ready to manage VolumeSnapshots on behalf of Velero, for handy in-cluster volume backups!
Created:
- snapshot-controller running and ready to snap!
Next:
- Configure Velero with a VolumeSnapshotLocation, so that volume snapshots can be made as part of a BackupSchedule!
Chef's notes 📓
Tip your waiter (sponsor) 👏
Did you receive excellent service? Want to compliment the chef? (..and support development of current and future recipes!) Sponsor me on Github / Ko-Fi / Patreon, or see the contribute page for more (free or paid) ways to say thank you! 👏
Employ your chef (engage) 🤝
Is this too much of a geeky PITA? Do you just want results, stat? I do this for a living - I'm a full-time Kubernetes contractor, providing consulting and engineering expertise to businesses needing short-term, short-notice support in the cloud-native space, including AWS/Azure/GKE, Kubernetes, CI/CD and automation.
Learn more about working with me here.
Flirt with waiter (subscribe) 💌
Want to know now when this recipe gets updated, or when future recipes are added? Subscribe to the RSS feed, or leave your email address below, and we'll keep you updated.