I got 4 20TB drives from Amazon around Black Friday that I want to get setup for network storage. I’ve got 3 descent Ryzen 5000 series desktops that I was thinking about setting up so that I could build my own mini-Kubernetes cluster, but I don’t know if I have enough motivation. I’m pretty OCD so small projects often turn into big projects.
I don’t have an ECC motherboard though, so I want to get some input if BTRFS, ZFS, TrueNAS, or some other solution should be relatively safe without it? I guess it is a risk-factor but I haven’t had any issues yet (fingers crossed). I’ve been out of the CNCF space for a while but Rook used to be the way to go for Ceph on Kubernetes. Has there been any new projects worth checking out or should I just do RAID and get it over with? Does Ceph offer the same level of redundancy or performance? The boards have a single M.2 slot so I could add in some SSD caching.
If I go with RAID, should I do RAID 5 or 6? I’m also a bit worried because the drives are all the same so if there is an issue it could hit multiple drives at once, but I plan to try to have an online backup somewhere and if I order more drives I’ll balance it out with a different manufacturer.
If you’ve got >=3 machines and >=3 devices, I’d suggest at least strongly considering Rook. It should allow for future growth and will let you tolerate the loss of one node at the storage level too, assuming you have replication configured. Which (replication params) you can set per StorageClass in case you want to squeeze every last byte out for cases where you don’t need storage-level replication.
I’ve run my own k8s cluster for years now, and solid storage from rook really made it take off with respect to how many applications I can build and/or run on it.
As for backup, there’s velero. Though I haven’t gotten it to work on bare metal. My ideal would be to just use it to store backups in Backblaze B2 given the ridiculously low cost. Presumably I could get there with restic, since that’s my outside-k8s backup solution, but I still haven’t gotten that set up since it’s much more cloud-provider friendly.
Thanks. That is what I’m leaning towards. Do you have any suggestions for a particular distro for your K8S nodes? I’m running Arch on my desktop.
The idea of being able to setup different storage classes is very appealing, as well as learning how to build my on K8S cloud.
I’m also running arch. Unfortunately I’ve been running mine long enough that it’s just my own bespoke Ansible playbooks for configs that have morphed only as required by breaking changes or features/security I want to add. I think the best way to start from scratch these days is kubeadm, and I think it should be fairly straightforward on arch or whatever distro you like.
Fundamentally my setup is just kubelet and kubeproxy on every node, the oci runtime (CRIO for me), etcd (set up manually but certs are automated now) and then some k8s manifests templated and dropped into the k8s manifest folder for the control plane on 3 nodes for HA. The more I think about it, the more I remember how complicated it is unless you want a private CA. Which I have and love the convenience and privacy it affords me (no CTL exposing domain names unless I need public certs and they’re public anyway).
I have expanded to 6 nodes (5 of which remain, RIP laptop SSD) and just run arch on all of them because it kinda just works and I like the consistency. I also got quite good at the arch install in the process.
That’s rad… I have a set of Ansible playbooks/roles/collections already for most system-wide settings. I have a love-hate relationship with Ansible though, but it gets the job done. I may try for cloud-init first until I reach its limitations. I’ve gotten pretty good at the Arch install too, although setting up the disks with LUKS was the most challenging part. Fortunately, the few times I’ve broke things I’ve been able to boot the installer ISO and mount my LUKS volumes from memory, but I couldn’t tell you how I set them up in the first place. 🤣 However I do it, I really just want to automate the process so that I can add new nodes and expand should I decide to rent out colocation space someday.