Persistent storage in Kubernetes with Rook Ceph / CephFS - Operator
Ceph is a highly-reliable, scalable network storage platform which uses individual disks across participating nodes to provide fault-tolerant storage.
Rook provides an operator for Ceph, decomposing the 10-year-old, at-times-arcane platform into cloud-native components which are created declaratively, and whose lifecycle is managed by an operator.
To start off, we need to deploy the Ceph operator into the cluster, after which we'll be able to deploy the Ceph cluster itself.
Rook Ceph requirements
Preparation
Namespace
We need a namespace to deploy our HelmRelease and associated ConfigMaps into. Per the flux design, I create this example yaml in my flux repo at /bootstrap/namespaces/namespace-rook-ceph.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: rook-ceph
HelmRepository
We're going to install a helm chart from the Rook Ceph chart repository, so I create the following in my flux repo:
apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: rook-release
  namespace: flux-system
spec:
  interval: 15m
  url: https://charts.rook.io/release
Kustomization
Now that the "global" elements of this deployment (just the HelmRepository in this case) have been defined, we do some "flux-ception", and go one layer deeper, adding another Kustomization, telling flux to deploy any YAMLs found in the repo at /rook-ceph. I create this example Kustomization in my flux repo:
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: rook-ceph
  namespace: flux-system
spec:
  interval: 30m
  path: ./rook-ceph
  prune: true # remove any elements later removed from the above path
  timeout: 10m # if not set, this defaults to the interval duration, which is 30m here
  sourceRef:
    kind: GitRepository
    name: flux-system
  healthChecks:
    - apiVersion: apiextensions.k8s.io/v1
      kind: CustomResourceDefinition
      name: cephblockpools.ceph.rook.io
Fast-track your fluxing! 🚀
Is crafting all these YAMLs by hand too much of a PITA?
"Premix" is a git repository, which includes an ansible playbook to auto-create all the necessary files in your flux repository, for each chosen recipe!
Let the machines do the TOIL!
ConfigMap
Now we're into the app-specific YAMLs. First, we create a ConfigMap, containing the entire contents of the helm chart's values.yaml. Paste the values into a values.yaml key as illustrated below, indented 4 spaces (since they're "encapsulated" within the ConfigMap YAML). I create this example yaml in my flux repo:
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-helm-chart-value-overrides
  namespace: rook-ceph
data:
  values.yaml: |- # (1)!
    # <upstream values go here>
- Paste in the contents of the upstream values.yaml here, indented 4 spaces, and then change the values you need as illustrated below.
Values I change from the default are:
pspEnable: false # (1)!
- PSPs are deprecated, and are removed entirely in Kubernetes 1.25, at which point leaving this enabled will cause breakage.
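The wrapping step described above can also be scripted. The following is a minimal sketch and not part of the original recipe: `make_configmap` is a hypothetical helper name, and it assumes you have already saved the chart's values (with your changes) to a local file. It indents each non-empty line by 4 spaces and writes the resulting ConfigMap to stdout.

```shell
#!/bin/sh
# Sketch: wrap a chart values file in the ConfigMap used by this recipe.
# make_configmap is a hypothetical helper, not part of flux or helm.
make_configmap() {
  values_file="$1"
  # Emit the fixed ConfigMap header (names match the recipe above)
  cat <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-helm-chart-value-overrides
  namespace: rook-ceph
data:
  values.yaml: |-
EOF
  # Indent every non-empty line by 4 spaces so it nests under the values.yaml key
  sed 's/^./    &/' "$values_file"
}
```

For example, `make_configmap values.yaml > configmap.yaml`, where both filenames are placeholders for whatever you use in your own flux repo. The upstream defaults can be fetched with `helm show values` once the chart repository has been added to your local helm client.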
HelmRelease
Finally, having set the scene above, we define the HelmRelease which will actually deploy the rook-ceph operator into the cluster. I save this in my flux repo:
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  chart:
    spec:
      chart: rook-ceph
      version: 1.9.x
      sourceRef:
        kind: HelmRepository
        name: rook-release
        namespace: flux-system
  interval: 30m
  timeout: 10m
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: -1 # keep trying to remediate
    crds: CreateReplace # Upgrade CRDs on package update
  releaseName: rook-ceph
  valuesFrom:
    - kind: ConfigMap
      name: rook-ceph-helm-chart-value-overrides
      valuesKey: values.yaml # (1)!
- This is the default, but best to be explicit for clarity
Install Rook Ceph Operator!
Commit the changes to your flux repository, and either wait for the reconciliation interval, or force a reconciliation using flux reconcile source git flux-system. You should see the kustomization appear...
~ ❯ flux get kustomizations rook-ceph
NAME READY MESSAGE REVISION SUSPENDED
rook-ceph True Applied revision: main/70da637 main/70da637 False
~ ❯
The helmrelease should be reconciled...
~ ❯ flux get helmreleases -n rook-ceph rook-ceph
NAME READY MESSAGE REVISION SUSPENDED
rook-ceph True Release reconciliation succeeded v1.9.9 False
~ ❯
And you should have happy rook-ceph operator pods:
~ ❯ k get pods -n rook-ceph -l app=rook-ceph-operator
NAME READY STATUS RESTARTS AGE
rook-ceph-operator-7c94b7446d-nwsss 1/1 Running 0 5m14s
~ ❯
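If you'd rather script these checks (in a CI job, say) than eyeball them, something like the following could work. This is a minimal sketch: `check_ready` is a hypothetical helper, and it assumes the table layout shown in the output above, namely a header row containing a READY column, with no whitespace inside any column to its left.

```shell
#!/bin/sh
# Sketch: succeed only if every row of a `flux get ...` table reports READY=True.
# check_ready is a hypothetical helper; it reads the table from stdin.
check_ready() {
  awk '
    # Header row: find which whitespace-separated field is the READY column
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == "READY") col = i; next }
    # Data rows: flag any row whose READY field is not "True"
    NF > 0  { rows++; if (col == 0 || $col != "True") bad = 1 }
    # Fail if there were no rows at all, or any row was not ready
    END     { exit (rows > 0 && !bad) ? 0 : 1 }
  '
}
```

Usage would look like `flux get kustomizations rook-ceph | check_ready && echo "kustomization ready"`. For the operator pod itself, `kubectl wait` with `--for=condition=Ready` is the more conventional option.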
Summary
What have we achieved? We're half-way to getting a ceph cluster, having deployed the operator which will manage the lifecycle of the ceph cluster we're about to create!
Created:
- Rook ceph operator running and ready to deploy a cluster!
Next:
- Deploy the ceph cluster using a CR