
Creating a cluster from zero

You run a single command. A few minutes later you have a working cluster, a dashboard open in your browser, an API answering on a real URL, an identity provider managing the login, a database, a metrics and log stack, ingress, certificates — already in place, already wired together, already reachable. There is no second step where you install the platform on top of the cluster. The platform is what the command installs.

That is the shape of the operation this chapter is about. The interesting part is not the command itself, but the small set of choices that decide what comes out the other side: where the cluster lives, how it is addressed, what kind of certificate it presents on day one, what storage it carries, who is allowed to reach it.

The chapter is written around the control cluster — the first cluster of any Flui installation, the one that hosts the control plane. The same shape applies to a workload cluster, with the control-plane parts taken out.

The choices that shape a cluster

Every cluster is described by a handful of inputs. Most of them have sensible defaults; the ones you change are usually the ones that match a concrete constraint (a region close to your users, a server shape that fits your workload, a stricter firewall).

| Choice | What it controls |
| --- | --- |
| Provider | Which cloud the cluster runs on. |
| Region | Which provider region the servers sit in. Defaults to the provider’s default region. |
| Server type | The shape (CPU / RAM / disk) of every node. Defaults to the recommended size for the chosen provider. |
| Firewall | Whether Flui installs and locks down a default firewall, and which source IPs may reach SSH and the Kubernetes API. |
| Certificate mode | Whether the cluster issues a real Let’s Encrypt certificate or a staging one (see below). |
| Shared storage | Whether the cluster includes a master-hosted shared volume (the default) or falls back to per-node local storage. |
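
To make the defaults in the table concrete, here is a minimal sketch of the same inputs as a plain data structure. Every name in it (ClusterChoices, its fields, the "example-cloud" provider) is hypothetical and chosen for illustration; it is not Flui's API or configuration format. Enabling the firewall by default is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ClusterChoices:
    """Hypothetical model of the creation-time inputs (not Flui's real schema)."""

    provider: str
    # None stands for "use the provider's default region".
    region: Optional[str] = None
    # None stands for "use the recommended size for the provider".
    server_type: Optional[str] = None
    # Assumption: the default firewall is installed unless turned off.
    firewall_enabled: bool = True
    # Empty means "auto-detect the caller's current IP".
    allowed_sources: Tuple[str, ...] = ()
    # Production is the documented default certificate mode.
    certificate_mode: str = "production"
    # Shared storage is the documented default.
    shared_storage: bool = True

    def __post_init__(self) -> None:
        if self.certificate_mode not in ("production", "staging"):
            raise ValueError(f"unknown certificate mode: {self.certificate_mode!r}")


# Only the provider is mandatory; everything else falls back to a default.
choices = ClusterChoices(provider="example-cloud")
```

Modelling "use the default" as None keeps the difference between an explicit choice and an inherited one visible, which is useful when the defaults vary by provider.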

A cluster is created master-only by default. Adding workers is a separate operation you run later (covered in Nodes and cluster topology); the same is true of growing the root disk on a server type that allows it — both belong to scaling, not creation.

The control cluster is always created from the CLI. There is no dashboard yet: the dashboard is part of what this very operation brings up, together with the API and the identity provider. Workload clusters, which come later, can also be created from the dashboard once the control cluster is online.

From the moment the control cluster is up, every further operation — creating workload clusters, deploying apps, scaling nodes, managing firewalls, reading logs — is available from both surfaces. CLI or dashboard is a matter of taste, not capability.

Addressing: nip.io by default, your domain when you want one

A brand new cluster needs a stable URL the moment it is up — for the dashboard, for the API, for the identity provider. Buying a domain and pointing it at the cluster takes longer than creating the cluster itself, so Flui makes this part work without a domain at all.

By default the cluster is addressed through nip.io: the master’s public IP is encoded directly into the hostname (something like api.warm-finch-zs.162-55-56-10.nip.io), and any hostname that ends in that IP-encoded suffix resolves to the master. No DNS record to create, no propagation to wait for, no registrar to configure. The cluster is reachable as soon as it is up.
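
The encoding is simple enough to sketch. The helpers below build and decode a hostname in the dash-separated form shown above; the function names are hypothetical, and the sketch covers only that one notation, not every format nip.io accepts.

```python
import ipaddress


def nip_hostname(ip: str, *labels: str) -> str:
    """Build a nip.io hostname that embeds an IPv4 address in dash notation."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    encoded = str(addr).replace(".", "-")
    return ".".join([*labels, encoded, "nip.io"])


def ip_from_nip_hostname(hostname: str) -> str:
    """Recover the IPv4 address embedded in a dash-notation nip.io hostname."""
    parts = hostname.lower().rstrip(".").split(".")
    if len(parts) < 3 or parts[-2:] != ["nip", "io"]:
        raise ValueError(f"not a nip.io hostname: {hostname!r}")
    # The label immediately left of "nip.io" carries the address.
    return str(ipaddress.IPv4Address(parts[-3].replace("-", ".")))


host = nip_hostname("162.55.56.10", "api", "warm-finch-zs")
print(host)                        # api.warm-finch-zs.162-55-56-10.nip.io
print(ip_from_nip_hostname(host))  # 162.55.56.10
```

Because the address is carried by the hostname itself, any label placed in front of it (api, a dashboard name, an identity-provider name) lands on the same master without further DNS work.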

When you want to put the cluster behind a domain you own, Flui handles the swap for you — DNS records, certificates and the control-plane endpoints are rewired in place, without re-creating the cluster. The fact that this is a managed operation rather than manual work is one of Flui’s defining traits. The full story lives in DNS integration.

TLS from minute zero

Every Flui cluster is born with a TLS certificate. There is no “now go install cert-manager and configure ACME” step. The cluster comes up, cert-manager has already issued the certificates the control plane needs, and the dashboard is on HTTPS the first time you open it.

The choice you make at creation time is which Let’s Encrypt endpoint issues the certificate:

  • Production (default). A real, browser-trusted certificate. The cluster is immediately usable from any client without TLS warnings.
  • Staging. A Let’s Encrypt staging certificate, which is not trusted by browsers but does not count against ACME’s production rate limits. Useful when you expect to create and destroy the cluster several times in a short window — for example, while developing against Flui itself.
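
In ACME terms, the two modes differ only in the directory URL the client talks to. The sketch below maps each mode to the corresponding public Let’s Encrypt endpoint. The mode names and the helper function are hypothetical, and how Flui passes the URL to cert-manager is not covered in this chapter.

```python
# Public Let's Encrypt ACME v2 directory endpoints.
ACME_DIRECTORIES = {
    "production": "https://acme-v02.api.letsencrypt.org/directory",
    "staging": "https://acme-staging-v02.api.letsencrypt.org/directory",
}


def acme_directory(mode: str = "production") -> str:
    """Return the ACME directory URL for a certificate mode (hypothetical helper)."""
    try:
        return ACME_DIRECTORIES[mode]
    except KeyError:
        raise ValueError(f"unknown certificate mode: {mode!r}") from None


print(acme_directory("staging"))
```

In a cert-manager setup this URL is typically what goes into the ACME issuer's server field, so switching modes amounts to pointing the issuer at the other endpoint and re-issuing.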

The trade-off is concrete. Let’s Encrypt enforces several production rate limits over rolling weekly windows, and they cannot be reset on demand. The most easily tripped one is the limit on duplicate certificates for the same set of identifiers. To stay clear of it, Flui draws a random prefix from a pool of roughly a thousand combinations (something like warm-finch-zs.…) for every cluster, so two clusters asking for exactly the same names is unlikely, though not impossible. The other limits (per registered domain in particular) still apply, and they matter when you plan to re-create clusters repeatedly: a few times a week is comfortable, a tight test loop is not.
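
How unlikely a name collision is can be estimated with a standard birthday-problem calculation. The sketch below assumes prefixes are drawn uniformly and independently from a pool of exactly 1000, and that the clusters share the same IP-encoded suffix; both are simplifying assumptions rather than documented Flui behaviour, and the function name is hypothetical.

```python
def prefix_collision_probability(clusters: int, pool_size: int = 1000) -> float:
    """Probability that at least two of `clusters` uniform draws share a prefix."""
    if clusters < 0 or pool_size <= 0:
        raise ValueError("clusters must be >= 0 and pool_size must be > 0")
    if clusters > pool_size:
        return 1.0  # pigeonhole: more draws than distinct prefixes
    p_all_unique = 1.0
    for already_drawn in range(clusters):
        # The next draw must avoid every prefix that is already taken.
        p_all_unique *= (pool_size - already_drawn) / pool_size
    return 1.0 - p_all_unique


for n in (2, 10, 40):
    print(n, round(prefix_collision_probability(n), 4))
```

Under these assumptions two clusters collide about once in a thousand tries, and ten clusters collide roughly 4% of the time. A single collision does not exhaust the duplicate-certificate limit by itself, so the real exposure is lower than these numbers suggest.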

Staging raises those ceilings far enough that they stop mattering in practice, at the cost of an untrusted cert your CLI and browser will complain about until you explicitly allow it. If you expect to spin clusters up and down at any volume (automation, integration tests, demo environments), staging is the right choice.

A cluster started in staging can later be moved to production; the inverse — production to staging — is rarely useful and not a supported flow.

Storage: shared by default, local on request

When the provider supports attachable block storage, a fresh cluster comes with a shared storage volume attached to the master and exposed to every worker through a network filesystem, fronted by a local cache layer (fscache) on each node so that repeated reads do not pay the network cost every time. This volume backs the flui-shared storage class, which any workload requiring persistent state can use, including the platform components Flui installs. On providers without attachable block storage, this part of the cluster is simply absent and workloads fall back to per-node local storage.

Two things are worth knowing about the default:

  1. It is one provider-attached volume, sized at creation, and it can be grown later — never shrunk.
  2. It survives node scaling. Adding or replacing a worker does not move the volume; it stays attached to the master.

You can disable shared storage at creation time and fall back to each node’s local disk. That mode is lighter — no extra volume to pay for — but every workload that needs persistent state then has to deal with the fact that its storage is pinned to whichever node it landed on.

See Storage classes and dedicated placement for the longer story.

The firewall

When the provider supports a managed firewall, Flui installs one as part of cluster creation. The default rule set is small and predictable:

  • SSH is restricted to a source-IP list you provide at creation time (auto-detected from your current IP if you don’t pass one).
  • HTTP and HTTPS are open, so the cluster’s public endpoints work for everyone.
  • Every other port required for internal communication is reachable only on the cluster’s private VNet, never on the public interface.
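
The three rules can be read as a small decision function. The sketch below is an illustrative model, not the provider's firewall API: the function name, the private range 10.0.0.0/16 and the example addresses are hypothetical, and it simplifies the third rule by treating every port as reachable from the private network.

```python
import ipaddress
from typing import Iterable

SSH_PORT = 22
PUBLIC_OPEN_PORTS = {80, 443}  # HTTP and HTTPS are open to everyone


def is_allowed(port: int, source_ip: str,
               ssh_allowlist: Iterable[str],
               private_network: str = "10.0.0.0/16") -> bool:
    """Evaluate the default rule set for one inbound connection (simplified)."""
    src = ipaddress.ip_address(source_ip)
    # Simplification: traffic from the private network is always accepted.
    if src in ipaddress.ip_network(private_network):
        return True
    if port in PUBLIC_OPEN_PORTS:
        return True
    if port == SSH_PORT:
        # SSH is accepted only from the source list given at creation time.
        return any(src in ipaddress.ip_network(cidr) for cidr in ssh_allowlist)
    # Everything else is closed on the public interface.
    return False


allow = ["203.0.113.7/32"]
print(is_allowed(443, "198.51.100.9", allow))  # True: HTTPS is public
print(is_allowed(22, "198.51.100.9", allow))   # False: not on the SSH list
print(is_allowed(22, "203.0.113.7", allow))    # True: listed source
print(is_allowed(10250, "10.0.0.5", allow))    # True: private network
```

The ordering matters only for readability here, since the rules do not overlap: a public source reaches 80 and 443 always, 22 only when listed, and nothing else.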

What the cluster hands back

When creation finishes, two commands tell you everything you need to know about the new cluster:

  • flui env status — health, node list, service health, and the effective public endpoints for the API, the dashboard and the identity provider. This is the read you run first to confirm the cluster is alive.
  • flui env credentials — the first-access details: the bootstrap admin email and password for the dashboard, plus connection information for the in-cluster databases and the monitoring stack. Secrets are hidden by default; pass --show-secrets to print them. The bootstrap password is meant to be rotated the first time you log in.

From this point on the control cluster is operated like any other Flui cluster: it shows up in the cluster list and exposes the same surfaces — nodes, network, storage, firewall, monitoring — accepting the same operations (scaling, storage expansion, firewall updates) as a workload cluster. The fact that the control plane runs on top of it does not make it a black box.

This works thanks to a seeding step that runs the first time the API comes up: every resource the CLI created during the bootstrap — the cluster itself, its nodes, the VNet, the firewall, the SSH keys — is registered into Flui’s own database. From the dashboard’s point of view, the control cluster is just one more row, fully manageable like the workload clusters that come after it.

One operation stays exclusively in the CLI: destroying the control cluster. The reason is mechanical: the dashboard and the API live on that cluster, so they cannot be asked to delete the thing they are running on. Workload clusters carry no such constraint and can be destroyed from either surface.

Workload clusters

A workload cluster is created the same way conceptually, with one difference: the control plane is not re-installed on it. The identity provider, the API, the dashboard, the database — those live on the control cluster you already have. A workload cluster carries only the small set of components Flui adds on top of what K3s already ships: certificate lifecycle (cert-manager) and optional cluster-side authorization.

Each node also runs two host-level agents installed at bootstrap time — a metrics exporter and a log shipper — that push the node’s metrics and the cluster’s logs to the control cluster’s observability backend over the private network. They are part of the node, not of the cluster’s Kubernetes payload, and they make every workload cluster show up in the same monitoring views as the control cluster itself.

This is the moment the Multi-cluster topology chapter picks up.

Where this chapter goes from here