My first introduction to Kubernetes was a children's story.
Why would you want to use Kubernetes for your self-hosted recipes over simple Docker Swarm? Here's my personal take.
I use Docker Swarm both at home (on a single-node swarm), and on a trio of Ubuntu 16.04 VPSs in a shared lab OpenStack environment.
In both cases above, I'm responsible for maintaining the infrastructure supporting Docker - either the physical host, or the VPS operating systems.
I started experimenting with Kubernetes as a plan to improve the reliability of my cryptocurrency mining pools (the contended lab VPSs negatively impacted the likelihood of finding a block), and as a long-term replacement for my aging home server.
What I enjoy about building recipes and self-hosting is not the operating system maintenance; it's the tools and applications that I can quickly launch in my swarms. If I could only play with the applications, and not bother with the maintenance, I totally would.
Kubernetes (on a cloud provider, mind you!) does this for me. I feed Kubernetes a series of YAML files, and it takes care of all the rest, including version upgrades, node failures/replacements, disk attach/detachments, etc.
Uggh, it's so complicated!
Yes, but that's a necessary sacrifice for the maturity, power and flexibility it offers. Like docker-compose syntax, Kubernetes uses YAML to define its various, interworking components.
Let's talk some definitions. Kubernetes.io provides a glossary. My definitions are below:
Node : A compute instance which runs Docker containers, managed by a cluster master.
Cluster : One or more "worker nodes" which run containers. Very similar to a Docker Swarm. In many cloud provider deployments, the master node for your cluster is provided free of charge, but you don't get to access it.
Pod : A collection of one or more containers. If a pod runs multiple containers, these containers always run on the same node.
Deployment : A definition of a desired state. I.e., "I want a pod with containers A and B running". The Kubernetes master then makes whatever changes are necessary to maintain that state. (I.e., if a pod crashes, but is supposed to be running, a new pod will be started.)
Service : Unlike in Docker Swarm, pods in Kubernetes can't find each other by name out of the box. For your pods to discover each other (say, to have "webserver" talk to "database"), you create a service for each pod, and refer to these services when you want your containers (in pods) to talk to each other. Complicated, yes, but the abstraction allows you to do powerful things, like auto-scale-up a bunch of database "pods" behind a service called "database", or perform a rolling container image upgrade with zero impact.
External access : Services not only allow pods to discover each other, but they're also the mechanism through which the outside world can talk to a container. At the simplest level, this is akin to exposing a container port on a Docker host.
Ingress : When mapping ports to applications is inadequate (think virtual web hosts), an ingress is a sort of "inbound router" which can receive requests on one port (e.g., HTTPS), and forward them to a variety of internal pods, based on things like VHOST, etc. For us, this is the functional equivalent of what Traefik does in Docker Swarm. In fact, we use a Traefik Ingress in Kubernetes to accomplish the same.
Persistent Volume : A virtual disk which is attached to a pod, storing persistent data. Meets the requirement for shared storage from Docker Swarm. I.e., if a persistent volume (PV) is bound to a pod, and the pod dies and is recreated, or gets upgraded to a new image, the PV (and the data on it) is bound to the new pod. PVs can be "claimed" in a YAML definition, so that your Kubernetes provider will auto-create a PV when you launch your pod. PVs can be snapshotted.
Namespace : An abstraction to separate a collection of pods, services, ingresses, etc. A "virtual cluster within a cluster". Can be used for security, or simplicity. For example, since we don't have individual Docker stacks anymore, if you commonly name your database container "db", and you want to deploy two applications which both use a database container, how will you name your services? Use namespaces to keep each application ("nextcloud" vs "kanboard") separate. Namespaces also allow you to allocate resource limits to the aggregate of containers in a namespace, so you could, for example, limit the "nextcloud" namespace to 2.3 CPUs and 1200MB RAM.
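To make the definitions above a little more concrete, here's a rough sketch of how several of them might look as YAML. Treat it as an illustration, not a tested recipe: the "nextcloud" namespace and the "db" name come from the examples above, but the object names, the `mariadb:10.11` image, the placeholder password, the per-container limits and the 1Gi volume size are all my own assumptions.

```yaml
# Namespace: keeps this application's objects apart from, say, "kanboard"
apiVersion: v1
kind: Namespace
metadata:
  name: nextcloud
---
# ResourceQuota: caps the aggregate of all containers in the namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: nextcloud-quota        # hypothetical name
  namespace: nextcloud
spec:
  hard:
    limits.cpu: "2300m"        # 2.3 CPUs
    limits.memory: 1200M       # 1200MB RAM
---
# PersistentVolumeClaim: asks the provider to create (and bind) a PV
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data                # hypothetical name
  namespace: nextcloud
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi             # assumed size
---
# Deployment: the desired state, "one pod running the db container"
apiVersion: apps/v1
kind: Deployment
metadata:
  name: db
  namespace: nextcloud
spec:
  replicas: 1
  strategy:
    type: Recreate             # stop the old pod before starting a new one,
                               # since the volume is ReadWriteOnce
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: mariadb:10.11             # assumed image
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: change-me             # placeholder; use a Secret for real
          resources:
            limits:                        # required once the quota above exists
              cpu: 500m
              memory: 512M
          volumeMounts:
            - name: data
              mountPath: /var/lib/mysql
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: db-data
---
# Service: gives the pod(s) labelled app=db a stable name and port
apiVersion: v1
kind: Service
metadata:
  name: db
  namespace: nextcloud
spec:
  selector:
    app: db
  ports:
    - port: 3306
      targetPort: 3306
```

Assuming your cluster runs the usual DNS add-on, other pods in the "nextcloud" namespace could then reach the database as `db:3306`, and a second application in a "kanboard" namespace could have its own "db" service without any clash.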
Mm.. maaaaybe, how do I start?
If you're like me, and you learn by doing, either play with the examples at https://labs.play-with-k8s.com/, or jump right in by setting up a Google Cloud trial (you get $300 credit for 12 months), or a small cluster on Digital Ocean.
If you're the learn-by-watching type, just search for "Kubernetes introduction video". There's a lot of great content available.
I'm ready, gimme some recipes!
As of Jan 2019, our first (and only!) Kubernetes recipe is a WIP for the Mosquitto MQTT broker. It's a good, simple starter if you're into home automation (shoutout to Home Assistant!), since it only requires a single container, and a simple NodePort service.
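To give a feel for what "a single container, and a simple NodePort service" means, here's a minimal sketch. It is not the actual recipe: the object names, the `eclipse-mosquitto:1.6` image tag and the node port 31883 are assumptions for illustration. (I've pinned 1.6 because 2.x images only listen on localhost until you supply a config file with a listener.)

```yaml
# Deployment: one pod running a single Mosquitto container
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mosquitto
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mosquitto
  template:
    metadata:
      labels:
        app: mosquitto
    spec:
      containers:
        - name: mosquitto
          image: eclipse-mosquitto:1.6   # assumed image tag
          ports:
            - containerPort: 1883        # standard MQTT port
---
# Service: exposes the pod on the same port of every node in the cluster
apiVersion: v1
kind: Service
metadata:
  name: mosquitto
spec:
  type: NodePort
  selector:
    app: mosquitto
  ports:
    - name: mqtt
      protocol: TCP
      port: 1883
      targetPort: 1883
      nodePort: 31883   # assumed; must fall in the default 30000-32767 range
```

With something like this applied, an MQTT client outside the cluster would connect to any node's IP on port 31883, while pods inside the cluster would use `mosquitto:1883`.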
I'd love your feedback on the Kubernetes recipes, as well as suggestions for what to add next. The current rough plan is to replicate the Chef's Favorites recipes (see the left-hand panel) into Kubernetes first.
Still with me? Good. Move on to reviewing the design elements.
- Start (this page) - Why Kubernetes?
- Design - How does it fit together?
- Cluster - Set up a basic cluster
- Load Balancer - Set up inbound access
- Snapshots - Automatically back up your persistent data
- Helm - Uber-recipes from fellow geeks
- Traefik - Traefik Ingress via Helm
Tip your waiter (support me) 👏
Did you receive excellent service? Want to make your waiter happy? (..and support development of current and future recipes!) See the support page for (free or paid) ways to say thank you! 👏