Storage classes and dedicated placement

Most workloads do not need anything special from the disk under them. They write the occasional file, they read it back later, and they can move freely from one node to another whenever the cluster needs them to.

Some workloads, though, depend on the disk behaving in a precise way. They expect their writes to land in order, their locks to be honoured, their “I saved this” to be a promise the kernel actually keeps. Databases are the canonical example, but the property is what matters, not the workload’s name: anything that relies on real filesystem semantics belongs in this second group. Shared storage backed by a network filesystem cannot always guarantee those semantics, and the consequences of the mismatch range from “surprising performance” to “quietly losing the last seconds of writes”.

Flui ships two storage classes — one for each group — and lets the application pick the right one.

Two storage classes

Every persistent volume on a Flui cluster ends up in one of two classes:

Shared. The default, NFS-backed class. A single volume on the master is exposed to every node as a network filesystem, fronted by a local cache on each node. Any pod on any node can read and write through the same address. This is the right answer for stateless apps, log directories, and any workload that occasionally writes a file and re-reads it later.
Dedicated. The node-local flui-local class, backed by a node’s own disk. Writes hit that disk directly — there is no NFS hop in the path — and the pod that uses it is pinned to the node hosting the disk; once placed, it does not move. This is the right answer for databases and anything that depends on the disk behaving like a real disk.

The choice is on the application side: a catalog manifest declares whether it needs dedicated placement, and Flui takes care of the rest. An application that does not declare anything gets shared placement.

Where dedicated workloads live

When an application asks for dedicated placement, Flui has to decide which node hosts its disk. The default target is a worker node, not the master. At deploy time Flui looks at every worker’s free capacity and auto-pins the app to the one with the most room — recording that choice as the app’s dedicated node name, so subsequent operations know exactly where its disk lives. Keeping dedicated workloads off the master keeps the control plane free of stateful, node-locked tenants and lets their hosts grow independently.
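The selection rule can be sketched as a small ranking function. This is an illustration under stated assumptions, not Flui's implementation: the Worker type, its field names, and the way "most room" is ranked (free memory first, then free CPU) are all hypothetical, since the text does not say how free capacity is measured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Worker:
    """Hypothetical view of a worker node's free capacity."""
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


def pick_dedicated_node(workers: list[Worker]) -> Optional[str]:
    """Return the name of the worker with the most room, or None if there are no workers.

    Assumption: "most room" is ranked by free memory first, then free CPU.
    """
    if not workers:
        return None
    best = max(workers, key=lambda w: (w.free_memory_mib, w.free_cpu_millicores))
    return best.name


workers = [
    Worker("worker-a", free_cpu_millicores=1500, free_memory_mib=2048),
    Worker("worker-b", free_cpu_millicores=1000, free_memory_mib=6144),
]
# The result plays the role of the app's recorded dedicated node name.
dedicated_node_name = pick_dedicated_node(workers)
```

Returning None for an empty worker list keeps the ranking separate from the policy decision about what to do when no worker exists.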

There is one thing this default needs and cannot invent: a worker has to exist. If a cluster has no worker at all and an app asks for dedicated placement, the deploy does not silently fall back — it fails loudly with the code NO_WORKER_FOR_DEDICATED_APP, telling you to add a worker (or to use the escape hatch below).

The master escape hatch

For the genuinely small case (a single-node cluster, or a deliberate choice to keep one dedicated app on the control plane), an app carries an allowMasterPlacement flag (a boolean, default false). It exists both on the application itself and on the catalog-install input, so you can opt in either at install time or later. When it is set, a dedicated app is allowed to schedule on the master even though no worker is available; when it is not, the missing-worker case is the loud failure above. The default is deliberately the safe one: dedicated work lands on a worker unless you say otherwise.
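The interaction between the missing-worker failure and the escape hatch can be sketched as one decision function. Only the error code NO_WORKER_FOR_DEDICATED_APP and the allowMasterPlacement flag come from the text; the DeployError type, the function name, and the node names are hypothetical. The sketch also assumes that a worker is preferred whenever one exists, even if the flag is set.

```python
class DeployError(Exception):
    """Hypothetical error type carrying a machine-readable code."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def resolve_dedicated_target(
    worker_names: list[str],
    allow_master_placement: bool = False,
    master_name: str = "master",
) -> str:
    """Decide which node hosts a dedicated app's disk.

    Capacity ranking is left out: the first worker stands in for the chosen one.
    Assumption: workers win over the master whenever at least one exists.
    """
    if worker_names:
        return worker_names[0]
    if allow_master_placement:
        return master_name
    # No worker and no opt-in: fail loudly instead of falling back silently.
    raise DeployError(
        "NO_WORKER_FOR_DEDICATED_APP",
        "no worker available: add a worker or set allowMasterPlacement",
    )
```

With the default flag value, an empty worker list always raises, which matches the safe default described above.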

A locked node

A node hosting at least one dedicated workload becomes a locked node from Flui’s point of view: the node-removal flow refuses to delete it as long as those workloads live there, and explains exactly which applications would lose their disk if it went away.

Unlocking the node is a deliberate operation: move the dedicated workloads off it (to another worker, or to the master via the allowMasterPlacement escape hatch), after which the node-removal flow accepts the request. The platform does not migrate dedicated data between nodes automatically. That step belongs to the operator, who knows whether a downtime window is acceptable and whether a backup-and-restore is the right way to move the data.
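The locking rule amounts to a guard in front of node removal. The following is a minimal sketch of that guard; the App type, its fields, and the message wording are hypothetical and only illustrate the behaviour described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    """Hypothetical application record."""
    name: str
    dedicated: bool
    node: str  # recorded dedicated node name; unused for shared apps


def removal_blockers(node: str, apps: list[App]) -> list[str]:
    """Names of dedicated apps whose disk lives on the given node."""
    return sorted(a.name for a in apps if a.dedicated and a.node == node)


def check_node_removal(node: str, apps: list[App]) -> tuple[bool, str]:
    """Return (allowed, explanation) for a node-removal request."""
    blockers = removal_blockers(node, apps)
    if blockers:
        names = ", ".join(blockers)
        return False, f"node {node} is locked; these apps would lose their disk: {names}"
    return True, f"node {node} can be removed"
```

Shared apps never appear among the blockers, because their data does not live on any single worker.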

A pre-flight check before deploy

Before a dedicated workload is deployed (or re-deployed after a configuration change), Flui sanity-checks the target node: it asks the cluster how much CPU and memory the node has free, and compares that against what the workload needs. If the node cannot fit the workload, the deploy fails up front with a clear error pointing at the gap — and at the operations that can close it: resize the node, or move the workload to one that has the room.

This check is run at deploy time, not just at install time, so adding a new replica or growing the resource request of a dedicated workload either succeeds because the room is there, or fails with a useful diagnosis instead of leaving the workload stuck in a degraded state.
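The comparison behind the pre-flight check can be sketched as follows. The Resources type, the units (millicores and MiB), and the message format are assumptions made for illustration; the text only states that free CPU and memory are compared against what the workload needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource figures, in millicores and MiB."""
    cpu_millicores: int
    memory_mib: int


def preflight(node_free: Resources, requested: Resources) -> list[str]:
    """Return a list of human-readable gaps; an empty list means the workload fits."""
    gaps = []
    if requested.cpu_millicores > node_free.cpu_millicores:
        short = requested.cpu_millicores - node_free.cpu_millicores
        gaps.append(f"CPU short by {short}m")
    if requested.memory_mib > node_free.memory_mib:
        short = requested.memory_mib - node_free.memory_mib
        gaps.append(f"memory short by {short}MiB")
    return gaps
```

Reporting every gap at once, instead of stopping at the first, lets the operator size a resize in one step.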

Vertical scaling: giving the host more

Once a workload is pinned, the only way to give it more CPU or memory is to give the host node more CPU or memory. There is no horizontal “add another replica” path for a dedicated workload — there is only one disk it can speak to, so the answer is always to grow the machine underneath it.

Resizing a node is a planned-maintenance operation. The cluster powers the node down, asks the provider to change the server type, powers it back up, and waits for the cluster to report it healthy before declaring the operation done. The workload is unavailable for a few minutes — this is unavoidable on the providers Flui integrates with — and the platform makes that window explicit in the operation’s plan before doing anything. The command-level reference for resize lives in CLI: Environments.
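The order of the resize steps can be sketched against a stand-in provider. Everything here is hypothetical: the method names, the RecordingProvider fake, and the health callback are not Flui or provider APIs, and a real implementation would wait between health checks instead of polling in a tight loop.

```python
from typing import Callable


class RecordingProvider:
    """Fake provider that records calls instead of touching real servers."""

    def __init__(self) -> None:
        self.calls: list[tuple[str, ...]] = []

    def power_off(self, node: str) -> None:
        self.calls.append(("power_off", node))

    def change_server_type(self, node: str, server_type: str) -> None:
        self.calls.append(("change_server_type", node, server_type))

    def power_on(self, node: str) -> None:
        self.calls.append(("power_on", node))


def resize_node(
    provider: RecordingProvider,
    node: str,
    server_type: str,
    is_healthy: Callable[[str], bool],
    max_checks: int = 30,
) -> bool:
    """Run the resize steps in order; return True once the node reports healthy."""
    provider.power_off(node)  # workloads on the node are unavailable from here
    provider.change_server_type(node, server_type)
    provider.power_on(node)
    for _ in range(max_checks):
        if is_healthy(node):
            return True
    return False  # the node never reported healthy within the allowed checks
```

The operation only counts as done when the health check passes, which mirrors the "wait for healthy before declaring done" rule in the text.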

Growing the shared storage

The shared storage layer also runs out of room over time, as the apps using it accumulate state. Flui exposes a direct operation to grow it: the underlying volume on the provider is resized, the filesystem on top is extended, and the new space is immediately visible to every node that mounts it. The operation is online (no downtime, no pod restarts) but one-way: the providers Flui integrates with let volumes grow, not shrink. Plan the target size with care, because the only way back to a smaller volume is a fresh cluster.
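Because the operation is grow-only, a request can be validated before anything is resized. The following is a minimal sketch of such a check; the function name and the use of whole GiB are assumptions, not part of Flui's interface.

```python
def validate_grow(current_gib: int, target_gib: int) -> int:
    """Return the number of GiB to add, rejecting anything that is not growth."""
    if target_gib <= current_gib:
        raise ValueError(
            f"target {target_gib} GiB must exceed current {current_gib} GiB: "
            "the shared volume can grow but not shrink"
        )
    return target_gib - current_gib
```

Rejecting an equal target as well as a smaller one keeps a no-op request from being reported as a successful resize.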

What is not in this story today

Moving a dedicated workload from one host to another is not an automatic data migration: Flui re-pins the placement, but the underlying disk does not follow. The workload has to be stopped, its data copied to the new host explicitly (typically a backup followed by a restore on the target), and then re-deployed with the new placement. This is the main caveat to know about; it does not break the common path of “install a database, eventually resize the host, optionally grow the shared volume”.

Where this chapter goes from here

The app concept — the application surface that declares storage, exposure and placement, and the home of the install-time capacity gate.
Nodes and cluster topology — the nodes a workload can be pinned to.
The database console — the first-party tooling for the databases that most often want dedicated, node-local storage.
CLI: Environments — the command-by-command reference for resizing nodes and growing the shared volume.