Persistent storage in Kubernetes with Rook Ceph / CephFS - Cluster
Ceph is a highly reliable, scalable network storage platform which uses individual disks across participating nodes to provide fault-tolerant storage.
Rook provides an operator for Ceph, decomposing the 10-year-old, at-times-arcane platform into cloud-native components which are created declaratively and whose lifecycle is managed by the operator.
In the previous recipe, we deployed the operator. Now, to actually deploy a Ceph cluster, we need to create a custom resource (a "CephCluster"), which will instruct the operator on how we'd like our cluster to be deployed.
We'll end up with multiple storageClasses which we can use to allocate storage to pods, backed either by Ceph RBD (block storage) or by CephFS (a mounted filesystem). In many cases, CephFS is a useful choice, because it can be mounted by more than one pod at the same time (ReadWriteMany), which makes it suitable for apps which need to share access to the same data (NZBGet, Sonarr, and Plex, for example).
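To illustrate what that looks like in practice, here's a minimal sketch of a PersistentVolumeClaim (and a pod consuming it) against the chart's default `ceph-filesystem` storageClass. The storageClass name is the chart default, and the PVC/pod names and namespace are purely illustrative, so adjust to suit your own cluster:

```yaml
# Hypothetical example: a ReadWriteMany claim backed by CephFS, shareable by any
# pods which mount it. "ceph-filesystem" is the rook-ceph-cluster chart's
# default storageClass name; change it if you've customised your values.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-media # illustrative name
  namespace: default
spec:
  accessModes:
    - ReadWriteMany # CephFS allows multiple pods to mount the same volume
  resources:
    requests:
      storage: 10Gi
  storageClassName: ceph-filesystem
---
apiVersion: v1
kind: Pod
metadata:
  name: media-consumer # illustrative name
  namespace: default
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "infinity"]
      volumeMounts:
        - name: media
          mountPath: /media
  volumes:
    - name: media
      persistentVolumeClaim:
        claimName: shared-media
```

A Ceph RBD-backed (block) claim looks much the same, but would typically use the chart's default `ceph-block` storageClass with ReadWriteOnce access mode.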
Namespace
We already deployed a rook-ceph namespace when deploying the Rook Ceph Operator, so we don't need to create this again.[^1]
HelmRepository
Likewise, we'll install the rook-ceph-cluster helm chart from the same Rook-managed repository as we did the rook-ceph (operator) chart, so we don't need to create a new HelmRepository.
Kustomization
We do, however, need a separate Kustomization for rook-ceph-cluster, telling flux to deploy any YAMLs found in the repo at /rook-ceph-cluster. I create this example Kustomization in my flux repo:
Why a separate Kustomization if both are needed for rook-ceph?
While technically we could use the same Kustomization to deploy both rook-ceph and rook-ceph-cluster, we'd run into dependency issues. It's simpler and cleaner to deploy rook-ceph first, and then list it as a dependency for rook-ceph-cluster.
```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
  name: rook-ceph-cluster--rook-ceph
  namespace: flux-system
spec:
  dependsOn:
    - name: "rook-ceph"
  interval: 30m
  path: ./rook-ceph-cluster
  prune: true # remove any elements later removed from the above path
  timeout: 10m # if not set, this defaults to the interval duration
  sourceRef:
    kind: GitRepository
    name: flux-system
```
Fast-track your fluxing! 🚀
Is crafting all these YAMLs by hand too much of a PITA?
"Premix" is a git repository, which includes an ansible playbook to auto-create all the necessary files in your flux repository, for each chosen recipe!
Let the machines do the TOIL!
ConfigMap
Now we're into the app-specific YAMLs. First, we create a ConfigMap containing the entire contents of the helm chart's values.yaml. Paste the values into a values.yaml key as illustrated below, indented 4 spaces (since they're "encapsulated" within the ConfigMap YAML). I create this example YAML in my flux repo:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-cluster-helm-chart-value-overrides
  namespace: rook-ceph
data:
  values.yaml: |-
    # <upstream values go here>
```
Here are some suggested changes to the defaults which you should consider:
```yaml
toolbox:
  enabled: true
monitoring:
  # enabling will also create RBAC rules to allow Operator to create ServiceMonitors
  enabled: true
  # whether to create the prometheus rules
  createPrometheusRules: true
pspEnable: false
ingress:
  dashboard: {}
```
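If you'd like the chart to create the dashboard ingress for you (rather than leaving `ingress.dashboard` empty, as above), you can populate it roughly like this. Treat this as a sketch only: the hostname, ingress class and TLS secret below are placeholders, and the exact keys supported vary by chart version, so cross-check against the commented examples in the upstream values.yaml:

```yaml
ingress:
  dashboard:
    ingressClassName: nginx # placeholder - use your own IngressClass
    host:
      name: ceph.example.com # placeholder hostname
      path: /
    tls:
      - hosts:
          - ceph.example.com
        secretName: ceph-dashboard-tls # placeholder secret name
```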
Further to the above, decide which disks you want to dedicate to Ceph, and add them to the cephClusterSpec section.
The default configuration (below) will cause the operator to use any un-formatted disks found on any of your nodes. If this is what you want to happen, then you don't need to change anything.
```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: true
```
If you'd rather be a little more selective / declarative about which disks are used in a homogeneous cluster, you could consider using deviceFilter (a regular expression matched against device names), like this:
```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: true
    useAllDevices: false
    deviceFilter: sdc
```
If your cluster nodes are a little more snowflakey, here's a more complex example with per-node storage configuration:
```yaml
cephClusterSpec:
  storage: # cluster level storage configuration and selection
    useAllNodes: false
    useAllDevices: false
    nodes:
      - name: "teeny-tiny-node"
        deviceFilter: "."
      - name: "bigass-node"
        devices:
          - name: "/dev/disk/by-path/pci-0000:01:00.0-sas-exp0x500404201f43b83f-phy11-lun-0"
            config:
              metadataDevice: "/dev/osd-metadata/11"
          - name: "nvme0n1"
          - name: "nvme1n1"
```
HelmRelease
Finally, having set the scene above, we define the HelmRelease which will actually deploy the rook-ceph-cluster chart (and with it, our Ceph cluster). I save this in my flux repo:
```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: rook-ceph-cluster
  namespace: rook-ceph
spec:
  chart:
    spec:
      chart: rook-ceph-cluster
      version: 1.9.x
      sourceRef:
        kind: HelmRepository
        name: rook-release
        namespace: flux-system
  interval: 30m
  timeout: 10m
  install:
    remediation:
      retries: 3
  upgrade:
    remediation:
      retries: -1 # keep trying to remediate
    crds: CreateReplace # Upgrade CRDs on package update
  releaseName: rook-ceph-cluster
  valuesFrom:
    - kind: ConfigMap
      name: rook-ceph-cluster-helm-chart-value-overrides
      valuesKey: values.yaml
```
Install Rook Ceph Cluster!
Commit the changes to your flux repository, and either wait for the reconciliation interval, or force a reconciliation using flux reconcile source git flux-system. You should see the kustomization appear...
Assuming you have an Ingress Controller set up, and you've either picked a default IngressClass or defined the dashboard ingress appropriately, you should be able to access your Ceph Dashboard at the URL identified by the ingress (this is a good opportunity to check that the ingress deployed correctly):
What have we achieved? We now have a complete Ceph cluster, deployed declaratively on top of the operator from the previous recipe, and ready to provide persistent storage to our workloads!
Summary
Created:
Ceph cluster has been deployed
StorageClasses are available so that the cluster storage can be consumed by your pods
Pretty graphs are viewable in the Ceph Dashboard
Chef's notes 📓
[^1]: Unless you wanted to deploy your cluster components in a separate namespace to the operator, of course!
Tip your waiter (sponsor) 👏
Did you receive excellent service? Want to compliment the chef? (..and support development of current and future recipes!) Sponsor me on Github / Ko-Fi / Patreon, or see the contribute page for more (free or paid) ways to say thank you! 👏
Employ your chef (engage) 🤝
Is this too much of a geeky PITA? Do you just want results, stat? I do this for a living - I'm a full-time Kubernetes contractor, providing consulting and engineering expertise to businesses needing short-term, short-notice support in the cloud-native space, including AWS/Azure/GKE, Kubernetes, CI/CD and automation.
Want to know now when this recipe gets updated, or when future recipes are added? Subscribe to the RSS feed, or leave your email address below, and we'll keep you updated.