Anatomy of the control cluster

The control cluster is not “a cluster with monitoring on it”. It is the place where Flui itself runs, plus the observability backend that every other cluster pushes into, plus the few platform pieces that every Flui cluster needs to function. This chapter walks through that inventory by role.

The components below are reconciled by the platform-components catalog described in Creating a cluster from zero. Most of them are restricted to the control cluster; the baseline ones (certificates, ingress, in-cluster DNS) are the exception, because every cluster carries them.

The platform itself

These are the components that make this cluster the Flui control plane. None of them appear on workload clusters.

Flui API

The service every CLI command, every dashboard request and every third-party integration talks to. It runs on the cluster it controls: there is no cloud-hosted Flui API tier — the API that operates your installation is the one you installed yourself. Reached at a public hostname through the cluster’s ingress.

Not replaceable: it is the control plane.

Flui dashboard

The visual surface of the installation: cluster lists, application views, monitoring, deploy flows, identity. The dashboard and the CLI are equal citizens — they both call the Flui API, and every operation is available on both surfaces (with the lone exception of destroying the control cluster, which stays CLI-only).

Identity provider (Zitadel)

The OIDC authority for the installation. Issues the credentials the CLI uses, the tokens the dashboard uses, and federates with external identity providers when configured. Installed when the installation runs in OIDC mode; in local-auth mode Flui falls back to a database-backed admin user instead.

Replaceable in principle by any OIDC-compliant identity provider; in practice Zitadel is the one Flui ships and tests.

Database

The installation’s state lives here: clusters, nodes, applications, releases, builds, operations, encrypted credentials, DNS zones, certificate metadata. Logs and metrics do not, by design — they live in the observability backend below.

PostgreSQL is the engine Flui ships and tests against.

Cache and queue backend

Every operation Flui drives — provisioning, deploys, builds, lifecycle changes — flows through a job queue, and the same component also serves caches that the API needs to keep small pieces of state fast. Redis is the engine Flui ships and tests against.
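As a rough illustration of the queue-driven model, the sketch below uses an in-memory queue in place of Redis. The `Operation` type, the operation kinds and the `drain` helper are hypothetical names chosen for this example; they are not Flui's actual job schema.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Operation:
    # Hypothetical shape of a queued operation; not Flui's real schema.
    kind: str             # e.g. "provision", "deploy", "build"
    target: str           # the cluster or application the operation acts on
    status: str = "queued"


@dataclass
class JobQueue:
    # In-memory stand-in for the Redis-backed queue.
    _jobs: deque = field(default_factory=deque)

    def enqueue(self, op: Operation) -> None:
        self._jobs.append(op)

    def drain(self) -> list[Operation]:
        """Process jobs in FIFO order and return them once completed."""
        done = []
        while self._jobs:
            op = self._jobs.popleft()
            op.status = "done"   # a real worker would run the operation here
            done.append(op)
        return done
```

The point of the sketch is only the ordering guarantee: operations are picked up in the order they were submitted, and the caller that enqueues does not wait for the work itself.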

The observability backend

The components that make this cluster the place where every other cluster sends its telemetry. Most of them are present on the control cluster only, and workload clusters push into them from outside; the metrics scraper is the exception and runs on every cluster.

Metrics store

A time-series database that receives the metrics every node in every cluster pushes. It exposes a query API in a well-known dialect, which the CLI and the dashboard read from when they show app metrics. Today it runs as a single node; the clustered variant is a future scaling step that does not change the user-facing API.

Log store

The central destination for application logs from every cluster. Indexes them by label and serves a query interface that both the dashboard and the CLI use through the Flui API, and that Grafana queries directly when used as the deep-dive tool.
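To make "indexes them by label" concrete, here is a toy index, assuming each log line arrives with a set of key/value labels. The `LabelIndex` class and the label names in the usage note are illustrative only and say nothing about how the real log store is implemented.

```python
from collections import defaultdict


class LabelIndex:
    """Toy log index: maps each (label, value) pair to matching line ids."""

    def __init__(self) -> None:
        self._lines: list[str] = []
        self._index: dict[tuple[str, str], set[int]] = defaultdict(set)

    def append(self, line: str, labels: dict[str, str]) -> None:
        line_id = len(self._lines)
        self._lines.append(line)
        for pair in labels.items():
            self._index[pair].add(line_id)

    def query(self, **selector: str) -> list[str]:
        """Return lines carrying every label in the selector, in arrival order."""
        if not selector:
            return list(self._lines)
        ids = set.intersection(
            *(self._index.get(pair, set()) for pair in selector.items())
        )
        return [self._lines[i] for i in sorted(ids)]
```

A query such as `idx.query(cluster="prod", app="worker")` intersects the id sets of both labels, so only the labels are indexed and the line text is never scanned.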

Metrics scraper

The agent that collects metrics from cluster objects and applications and pushes them to the metrics store. It runs on every cluster — including the control cluster — so the control cluster monitors itself with the same tools it uses on every workload cluster.

Alert evaluator

The component that watches the metrics store continuously and turns raw data into useful signals. It fires alerts when a metric crosses a defined threshold for long enough, and it pre-computes heavier queries (rolling averages, per-app aggregates) so the dashboard can read them straight from the store instead of recalculating each time. A baseline set of alerts and aggregates ships pre-defined; operators can extend it.
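The two jobs described above can be sketched in a few lines. This is a minimal model under stated assumptions: samples are sorted by time, the `Sample` type and both function names are invented for the example, and the real evaluator's rule format is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float   # seconds
    value: float


def is_firing(samples: list[Sample], threshold: float, hold_seconds: float) -> bool:
    """True if the most recent samples stayed above threshold for hold_seconds."""
    breach_start = None
    for s in samples:                      # samples are assumed sorted by time
        if s.value > threshold:
            if breach_start is None:
                breach_start = s.timestamp
        else:
            breach_start = None            # any dip resets the clock
    if breach_start is None:
        return False
    return samples[-1].timestamp - breach_start >= hold_seconds


def rolling_average(samples: list[Sample], window: int) -> list[float]:
    """Pre-computed aggregate: mean of the last `window` values at each point."""
    out = []
    for i in range(len(samples)):
        chunk = samples[max(0, i - window + 1): i + 1]
        out.append(sum(s.value for s in chunk) / len(chunk))
    return out
```

The "for long enough" condition is what keeps a single spike from paging anyone: the breach has to persist across the whole hold period before the alert fires.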

Object metrics exporter

Surfaces the state of cluster objects — deployments, pods, services — as metrics that the scraper picks up. This is what powers the per-app baseline metrics (replica counts, restart counts, health) the dashboard shows for every application.
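A minimal sketch of the idea, assuming a deployment snapshot with replica and restart counts. The `DeploymentState` type and the metric names are hypothetical and were chosen for readability; they are not the names the exporter actually emits.

```python
from dataclasses import dataclass


@dataclass
class DeploymentState:
    # Hypothetical snapshot of one deployment's state.
    app: str
    desired_replicas: int
    ready_replicas: int
    restarts: int


def to_metrics(state: DeploymentState) -> dict[str, float]:
    """Flatten object state into gauge values keyed by metric name and app label."""
    label = f'{{app="{state.app}"}}'
    healthy = state.ready_replicas >= state.desired_replicas
    return {
        f"app_replicas_desired{label}": float(state.desired_replicas),
        f"app_replicas_ready{label}": float(state.ready_replicas),
        f"app_restarts_total{label}": float(state.restarts),
        # Healthy here simply means every desired replica is ready.
        f"app_healthy{label}": 1.0 if healthy else 0.0,
    }
```

Once object state is expressed as numbers like these, the scraper and the alert evaluator can treat it exactly like any other metric.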

Grafana

The advanced exploration tool. Pre-wired to the metrics store and the log store as data sources. Positioned as the deep-debug surface; the everyday path is the Flui dashboard and CLI.

The cluster baseline

These run on every Flui cluster — the control cluster and every workload cluster. They are listed here because the control cluster needs them too, but the full discussion lives with the cluster concept itself.

Ingress

The L7 entry point of the cluster: it terminates TLS, routes requests by hostname and path, and exposes every public surface the cluster offers — including, on the control cluster, the Flui API, the dashboard and the identity provider.

Certificate lifecycle

Issues and renews TLS certificates for every public hostname the cluster serves, against the configured ACME endpoint (production or staging; see Creating a cluster from zero). When the provider exposes DNS through an API, an extension lets the issuer solve DNS-based challenges through that API; support for further DNS providers is added in the same form as integrations deepen.

In-cluster DNS

Service discovery inside the cluster. Ships with the cluster runtime; Flui uses it as-is.

Short-lived metrics source

The per-node and per-pod CPU and memory readings that some in-cluster autoscaling decisions read from. Distinct from the observability backend above: it is the cluster runtime’s own short-window source, not the place Flui’s long-term metrics live.

Default storage

A baseline storage class for cluster workloads. On a cluster with shared storage enabled, applications can choose the shared volume or a node-local fallback; on a cluster without shared storage, the fallback is the only option.
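The choice described above reduces to a small decision rule. In the sketch below, `pick_storage` and the strings `"shared"` and `"node-local"` are placeholders for illustration, not real storage class names.

```python
def pick_storage(shared_enabled: bool, prefer_shared: bool = True) -> str:
    """Return the storage choice for an application volume.

    The return values are placeholder labels, not real storage class names.
    """
    if shared_enabled and prefer_shared:
        return "shared"
    return "node-local"   # the only option when shared storage is off
```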

Optional cluster-side authentication gate

A small in-cluster service that sits in front of the ingress and checks every incoming request against the installation’s identity provider before letting it through. With it enabled, any endpoint on the cluster can be put behind OIDC login without the application itself having to do anything about authentication.

Flui standardises this on a dedicated subdomain convention: an endpoint published under <app>.internal.<cluster-zone> is treated as gated and goes through the authentication check automatically, while the same app on its plain <app>.<cluster-zone> endpoint stays public. The full story lives in DNS integration.

It is off by default — not because it is rarely useful, but because it adds an extra service on every request path and the overhead is only worth paying on clusters that actually need gated endpoints. Installing it on demand keeps clusters that don’t need it lighter. Flui-specific, not a replaceable component.
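The subdomain convention can be expressed as a small hostname check. This sketch assumes the app name is a single DNS label and that matching is case-insensitive; `is_gated` is a hypothetical helper, and the zone used in the usage note is a reserved example domain.

```python
def is_gated(hostname: str, cluster_zone: str) -> bool:
    """True if hostname matches <app>.internal.<cluster-zone>."""
    suffix = f".internal.{cluster_zone}".lower()
    host = hostname.lower().rstrip(".")
    if not host.endswith(suffix):
        return False
    app = host[: -len(suffix)]
    # Exactly one non-empty label must precede ".internal.<cluster-zone>".
    return bool(app) and "." not in app
```

With a zone of `prod.example.com`, `grafana.internal.prod.example.com` is treated as gated, while `grafana.prod.example.com` stays public.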

What is user-facing, what is internal

The cluster exposes a small public surface and keeps everything else internal.

Public, through the cluster’s ingress: the Flui API, the dashboard and the identity provider. These are the addresses you put in a browser or call from the CLI.
Cluster-internal: the database, the cache/queue, the metrics store, the log store, their agents, and Grafana. Reachable only from inside the cluster — or, for development, through a temporary tunnel that the CLI can open against the master.

This split is deliberate. The public surface is the contract you sign with the platform; the internals are free to evolve underneath without changing what you can call from outside.
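The inventory and its public/internal split can be captured as plain data. The sketch below transcribes part of this chapter into a hypothetical `Component` record; the type names are invented for the example, and components whose placement the chapter does not state outright are left out.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    CONTROL_ONLY = "control cluster only"
    EVERY_CLUSTER = "every cluster"


@dataclass(frozen=True)
class Component:
    name: str
    scope: Scope
    public: bool   # reachable through the ingress from outside the cluster


# Control-cluster view of the inventory, transcribed from this chapter.
INVENTORY = [
    Component("Flui API", Scope.CONTROL_ONLY, public=True),
    Component("Flui dashboard", Scope.CONTROL_ONLY, public=True),
    Component("Identity provider", Scope.CONTROL_ONLY, public=True),
    Component("Database", Scope.CONTROL_ONLY, public=False),
    Component("Cache and queue backend", Scope.CONTROL_ONLY, public=False),
    Component("Metrics store", Scope.CONTROL_ONLY, public=False),
    Component("Log store", Scope.CONTROL_ONLY, public=False),
    Component("Grafana", Scope.CONTROL_ONLY, public=False),
    Component("Metrics scraper", Scope.EVERY_CLUSTER, public=False),
]


def public_surface(components: list[Component]) -> list[str]:
    """Names of the components that form the externally callable contract."""
    return [c.name for c in components if c.public]
```

Calling `public_surface(INVENTORY)` returns only the API, the dashboard and the identity provider, which is the whole contract an outside caller depends on.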

Where this chapter goes from here

Observability is not an afterthought: why this anatomy looks the way it does.
Creating a cluster from zero: when each of these components is installed.
Identity and OIDC: the identity provider’s role in detail.