
Set up a Docker Swarm cluster for less than $30 / month

Build your own cheap but powerful self-hosted cluster and be free from any SaaS solutions by following this opinionated guide 🎉

Why Docker Swarm 🧐 ? #

Because Docker Swarm Rocks !

Yeah, for some people it may seem a little outdated in 2022, a time when Kubernetes is everywhere, but I’m personally convinced that it’s really underrated. Unless you do it for training, you really don’t have to throw yourself into all the fuzzy, complicated Kubernetes concepts, at least from a personal homelab perspective.

Of course, with Docker Swarm you’ll be limited to what the Docker API has to offer, without any extra abstraction, contrary to K8S, which built its community around new orchestration abstractions such as StatefulSets, operators, Helm, etc. But that’s the intended purpose of Swarm ! There is not much new to learn once you master Docker.

The 2022 Docker Swarm guide 🚀 #

I’ll try to show you step by step how to install your own serious containerized cluster for less than $30 a month by using Hetzner, one of the best cloud providers on the European market, with cheap yet really powerful VPS. Besides, they recently opened new data centers in America !

This tutorial is a sort of massive 2022 update of the well-known dockerswarm.rocks, with a deeper look under the hood. It’s NOT a quick-and-done tutorial, as we’ll go very deep, but at least you will understand everything that’s going on. It’s divided into 8 parts, so be prepared ! The prerequisites before continuing :

  • Have some Docker fundamentals
  • Be comfortable with an SSH terminal
  • Have a Hetzner Cloud account, at least for part 2, or feel free to adapt to any other VPS provider
  • A custom domain; I’ll use dockerswarm.rocks here as an example
  • An account with a transactional mail provider such as Mailgun, SendGrid, Sendinblue, etc. as a bonus.

Final goal 🎯 #

At the very end of this multi-step guide, you will have a complete, working, production-grade secured cluster, backup included, with optional monitoring and a complete development CI/CD workflow.

1. Cluster initialization 🌍 #

  • Hetzner VPS setup under Ubuntu 20.04 with proper firewall configuration
  • SaltStack for efficient node management
  • Docker Swarm installation, with 1 manager and 2 workers
  • Traefik, a cloud native reverse proxy with automatic service discovery and SSL configuration
  • Portainer as a simple GUI for container management

2. The stateful part 💾 #

Because Docker Swarm is not really suited for managing stateful containers (an area where K8S can shine thanks to operators), I chose to use 1 dedicated VPS for all the data-critical parts. We will install :

  • GlusterFS as network filesystem, configured for cluster nodes
  • PostgreSQL as main production database
  • MySQL as additional secondary database (optional)
  • Redis as fast database cache (optional)
  • Elasticsearch as database for indexes
  • Restic as S3 backup solution

Note that I will not set up this data server for HA (High Availability) here, as it’s a completely different topic. But note that every tool chosen here can be clustered.

There are many debates about running databases as Docker containers, but I personally prefer to use a managed server for better control, local on-disk performance, central backup management and easier database clustering.
Note that in the Kubernetes world, running containerized AND clustered databases has become a reality thanks to powerful operators that provide clustering. There is obviously no such thing on Docker Swarm 🙈.

3. Testing the cluster ✅ #

We will use the main Portainer GUI to install the following tools :

  • Diun (optional), very useful for being notified of updates to any image used inside your Swarm cluster
  • Cron for distributed cron jobs across the whole cluster
  • pgAdmin and phpMyAdmin as web database managers (optional)
  • Some containerized sample apps such as MinIO, Matomo, Redmine and n8n, which will show you how simple it is to install self-hosted web apps thanks to your shiny new cluster !

4. Monitoring 📈 #

This is an optional part, feel free to skip. We’ll set up production grade monitoring and tracing with complete dashboards.

  • Prometheus as time series DB for monitoring
    • We will configure a metrics exporter for each critical part (data node, PostgreSQL, MySQL, container details thanks to cAdvisor)
    • Basic usage of PromQL
  • Loki with Promtail for centralized logs, fetched from data node and docker containers
  • Jaeger as main tracing tool, with Elasticsearch as main data storage
  • Configure Traefik for metrics, logs and tracing as a perfect sample
  • Grafana as GUI dashboard builder with many batteries-included dashboards
    • Monitoring of the whole cluster
    • Node, PostgreSQL and MySQL metrics
    • Navigate through the log history of all containers and the data server node with LogQL thanks to Loki, as you would with ELK

5. CI/CD setup 💻 #

  • Gitea as lightweight centralized version control, in case you want to get out of GitHub / GitLab Cloud
  • Private Docker registry with a minimal UI for all your custom app images, which will be built during your development process and used as base images for your production containers on the cluster
  • Drone CI as self-hosted CI/CD solution
  • SonarQube as self-hosted code quality control
  • Get a perfect load testing environment with the k6 + InfluxDB + Grafana combo

We’ll fully test the above configuration with the basic .NET weather API.

Cluster Architecture 🏘️ #

Note that this cluster is intended for developers, with a complete self-hosted CI/CD solution. So as a good starting point for the cluster architecture, we can imagine the following nodes :

| Server | Description |
| --- | --- |
| manager-01 | The frontal manager node, with proper reverse proxy and some management tools |
| worker-01 | A worker for your production/staging apps |
| runner-01 | An additional worker dedicated to CI/CD pipelines execution |
| data-01 | The critical data node, with attached and resizable volume for better flexibility |

flowchart TD
  subgraph manager-01
    traefik((Traefik))<-- Container Discovery -->docker[Docker API]
  end
  subgraph worker-01
    my-app-01((My App 01))
    my-app-02((My App 02))
  end
  subgraph runner-01
    runner((Drone CI runner))
  end
  subgraph data-01
    logs[Loki]
    postgresql[(PostgreSQL)]
    files[/GlusterFS/]
    mysql[(MySQL)]
  end
  manager-01 == As Worker Node ==> worker-01
  manager-01 == As Worker Node ==> runner-01
  traefik -. reverse proxy .-> my-app-01
  traefik -. reverse proxy .-> my-app-02
  my-app-01 -.-> postgresql
  my-app-02 -.-> mysql
  my-app-01 -.-> files
  my-app-02 -.-> files

Note that each hostname corresponds to a particular type of server, dedicated to one specific task. Each type of node can be scaled as you wish :

| Replica | Description |
| --- | --- |
| manager-0x | For advanced resilient Swarm quorum |
| worker-0x | For better scaling production apps, the easiest to set up |
| runner-0x | More power for pipeline execution |
| data-0x | The hard part for data HA, with GlusterFS replications, DB clustering for PostgreSQL and MySQL, etc. |

For a simple production cluster, you can start with only manager-01 and data-01 as a minimal setup.
For development purposes, you can skip worker-01 and use manager-01 to run production apps.
You have plenty of choices here according to your budget !
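
The "resilient Swarm quorum" mentioned for manager-0x comes down to a majority rule: Swarm managers rely on Raft, which needs more than half of the managers to be reachable. Here is a minimal Python sketch of that rule (the function names are mine and purely illustrative), showing how many manager failures a given manager count can tolerate:

```python
def quorum(managers: int) -> int:
    """Smallest number of managers that forms a majority."""
    return managers // 2 + 1


def tolerated_failures(managers: int) -> int:
    """Managers that can be lost while a majority is still available."""
    return managers - quorum(managers)


if __name__ == "__main__":
    for count in (1, 2, 3, 5, 7):
        print(
            f"{count} manager(s): quorum={quorum(count)}, "
            f"tolerated failures={tolerated_failures(count)}"
        )
```

In practice this is why an even number of managers brings nothing: 2 managers tolerate no failure at all, just like 1, and 4 tolerate a single failure, just like 3.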

Cheap solution with Hetzner VPS 🖥️ #

Here are some of the cheapest VPS options available at the time of writing (02/2022) :

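| Server Type | Spec (vCPU / RAM / disk) | Price |
| --- | --- | --- |
| CPX11 (AMD) | 2C / 2 GB / 40 GB | €4.79 |
| CX21 (Intel) | 3C / 4 GB / 80 GB | €5.88 |
| CPX21 (AMD) | 3C / 4 GB / 80 GB | €8.28 |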

My personal choice for a cheap yet well-balanced cluster :

| Server Name | Type | Why |
| --- | --- | --- |
| manager-01 | CX21 | I favor RAM here for running many container-based management tools |
| worker-01 | CX21 or CPX21 | Simply a matter of how much power your apps need |
| runner-01 | CPX11 | 2 powerful EPYC cores are better for fast app builds |
| data-01 | CX21 or CPX21 | Simply a matter of how much power your databases need |

We’ll add a 20 GB volume to data-01 for €0.96. So we finally arrive at the following respectable budget range : €23.39 - €28.19.
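
To check a budget for your own node mix, you can simply sum the prices listed above. This is a small Python sketch under the assumption that the 02/2022 prices from the table still apply; `monthly_cost` is a hypothetical helper, not part of any tool used in this guide:

```python
from decimal import Decimal

# Monthly prices in euros, copied from the server table above (02/2022).
PRICES = {
    "CPX11": Decimal("4.79"),
    "CX21": Decimal("5.88"),
    "CPX21": Decimal("8.28"),
}
VOLUME_20GB = Decimal("0.96")  # additional 20 GB volume attached to data-01


def monthly_cost(server_types, volume=VOLUME_20GB):
    """Sum the monthly price of every server type plus the data volume."""
    return sum((PRICES[name] for name in server_types), volume)


# Order: manager-01, worker-01, runner-01, data-01
cheapest = monthly_cost(["CX21", "CX21", "CPX11", "CX21"])
priciest = monthly_cost(["CX21", "CPX21", "CPX11", "CPX21"])

if __name__ == "__main__":
    print(f"€{cheapest} - €{priciest}")
```

Using `Decimal` instead of floats keeps the cents exact, which is handy when comparing several configurations.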

The main difference is the choice between Xeon and EPYC CPUs for the worker-01 and data-01 nodes, which will be our main critical production application nodes. A quick sysbench run indicates around 70-80% more power for AMD (tested in 2022-02). Choose wisely according to your needs.

Note that a Swarm node with the manager role can act as a worker as well. So if you’re on a very limited budget, you can skip the worker-01 and runner-01 nodes and keep only one simple standalone Docker host (no Swarm mode) next to data-01. You can then go down to €12.72 with only 2 CX21 in addition to the volume.

If you intend to have your own self-hosted GitLab for an enterprise-grade CI/CD workflow, you should run it on a node with 8 GB of RAM.
4 GB is doable if you run just a single GitLab container on it, with Prometheus disabled and an external PostgreSQL.

Let’s party 🎉 #

That’s it for the presentation, go to the next part to get started !