Clusters are an experimental feature — see Experimental Features for important caveats. Enable with features: { experimental.clusters: true } in your configuration file.
Scaling the Orchestrator horizontally today requires sticky sessions on your load balancer so each user’s requests always reach the same node. Many deployments also need an external cache so that shared state is available across nodes. These are proven, production-supported approaches that work extremely well.

However, some deployments need to minimize external infrastructure — whether for operational simplicity, edge and remote environments, or reducing the surface area that teams need to manage. Clustering addresses this by letting Orchestrator nodes talk directly to each other over an encrypted peer-to-peer mesh. Sessions, cache data, and routing decisions are replicated across every node automatically — no load balancer affinity rules, no external cache to provision.

Beyond simplifying infrastructure, clustering also unlocks capabilities that aren’t possible with the traditional approach: multi-cluster topologies for global scale, built-in request routing across nodes, and a substrate for real-time event propagation.
Console deployments: Clustering is not available natively in the Maverics Console. To enable clustering for Console-deployed Orchestrators, use the config override feature. Config override requires enablement for your organization — contact your Strata account team or Strata support to enable it.

Why Clustering

Any node handles any request

Sessions and cache data are replicated across every node in the cluster. A user can authenticate on node A and have their next request served by node C — the session is already there. No sticky sessions on your load balancer, no external session store, no single point of failure for user state.

A substrate for global enforcement

The clustering substrate is built to carry more than just session and cache data. Its pub/sub and broadcast primitives are designed to propagate security events — session revocations, token invalidations, policy changes — to every node in the cluster within seconds. As these capabilities mature, the Orchestrator will be able to receive external security signals (such as a compromised credential notification from an IdP or security tool) and act on them globally, enforcing revocations across every node in the cluster from a single event.

Intelligent request routing

State replication is eventually consistent — a write on one node takes a brief moment to reach the others. HTTP routing bridges that gap. If a request lands on a node that doesn’t yet have the latest state for that user, the cluster forwards it internally to the node that does. This is what makes sticky sessions unnecessary: your load balancer can round-robin freely because the cluster itself ensures each request reaches the right place, even during the replication window.

No more sticky sessions or external caches

Clustering replaces the two things most multi-node deployments depend on today. Sticky sessions on the load balancer become unnecessary because every node already has every session. An external cache becomes optional because the cluster itself is the distributed cache. Fewer moving parts means fewer things to break and simpler operations overall.

How It Works

Clustering uses two encrypted network channels:
  1. Membership channel — Nodes discover and monitor each other using a gossip protocol. Each node exchanges lightweight heartbeats with its peers, and membership changes propagate organically through the cluster. There is no central coordinator — if a node goes down, the remaining nodes detect the failure and continue operating.
  2. Data channel — A dedicated plane for the actual workload: session replication, cache synchronization, and HTTP routing decisions. The data channel is also designed to carry event broadcasts as clustering capabilities expand.
Both channels are encrypted using a pre-shared key (encryption.psk). All nodes in a cluster must share the same PSK. The optional nodeKey.file gives a node a persistent identity across restarts, so other members recognize it even after a reboot.
The experimental.clusters feature flag must be set to true in the features map. Without the flag, the Orchestrator ignores cluster configuration and runs as a standalone instance. See Feature Flags for details.
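As a minimal sketch, enabling the flag in a configuration file looks like this (using the features map described above):

```yaml
# Enable the experimental clusters feature. Without this flag, the
# Orchestrator ignores any cluster configuration and runs standalone.
features:
  experimental.clusters: true
```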
Pre-shared key security: The PSK protects all cluster communication — treat it with the same care as a TLS private key.
  • Generate securely: Use a cryptographically random value, for example: openssl rand -hex 32. Any secure generation method producing exactly 32 bytes of entropy works.
  • Store in a secret provider: Always reference the PSK via a secret provider (<secrets.cluster-psk>). Never hardcode the PSK in configuration files, environment variables, or source code.
  • Rotate periodically: Establish a rotation schedule. To rotate, update the PSK value in your secret provider and perform a rolling restart of cluster nodes — nodes will re-establish trust using the new PSK.
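Put together, a cluster's encryption block might look like the following sketch; the secret name and key file path are illustrative:

```yaml
encryption:
  # Reference the PSK from a secret provider; never hardcode it.
  psk: <secrets.cluster-psk>
  # Optional: persist this node's identity across restarts so peers
  # recognize it after a reboot.
  nodeKey:
    file: /var/maverics/cluster-node.key  # illustrative path
```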

Multi-Cluster Topologies

A single Orchestrator node can join multiple clusters simultaneously. Each cluster is independent — its own membership, its own encryption, its own data channel — so you can assign different clusters to different jobs and layer them on the same node. This is where clustering moves beyond simple horizontal scaling and into global architecture. By separating concerns across cluster boundaries, you control exactly what data flows where.
Networking requirements: Cluster nodes must be able to reach each other on the configured membership and data ports. For regional clusters, this is straightforward. For cross-region or global clusters, your infrastructure and networking teams are responsible for establishing connectivity between regions — through VPN tunnels, peered VPCs, private interconnects, or other network-layer solutions. The Orchestrator does not handle network traversal across region boundaries on its own.

Regional data with global reach

The most common multi-cluster pattern: keep data close to where it is needed, but route requests globally. Each region runs its own cluster for cache data — keeping cached entries fast and local. All nodes also join a global cluster for HTTP routing so a request arriving in the US can be served locally or forwarded to Europe if needed. As event broadcast capabilities mature, this same global cluster will also carry security signals across regions.

Example scenarios

  • Regional caching with global routing. Regional clusters keep cached tokens and provider data close to users. A global routing cluster lets any node forward requests to the right place, regardless of region.
  • Global events with regional sessions. As event broadcast capabilities expand, a global cluster will propagate security events — session revocations, token invalidations — to every node everywhere. Regional clusters keep session data local for low-latency access and data residency compliance.
  • Isolated workload clusters. Internal-facing and external-facing application tiers use separate clusters for session and cache isolation, while sharing a common routing cluster so traffic flows between tiers when needed.

Configuration example

The following configuration joins a node to two clusters — one regional cache cluster and one global routing cluster:
clusters:
  - name: regional-cache
    addresses:
      membership: /ip4/0.0.0.0/tcp/9450
      data: /ip4/0.0.0.0/tcp/9451
    discovery:
      method: static
      static:
        nodes:
          - endpoints:
              membership: /ip4/10.0.1.1/tcp/9450
          - endpoints:
              membership: /ip4/10.0.1.2/tcp/9450
    encryption:
      psk: <secrets.regional-psk>

  - name: global-routing
    addresses:
      membership: /ip4/0.0.0.0/tcp/9460
      data: /ip4/0.0.0.0/tcp/9461
    discovery:
      method: srvdns
      srvDNS:
        dnsAddress: "10.0.0.53:53"
        pollInterval: "30s"
        names:
          - "_maverics-global._tcp.orchestrator.internal"
    encryption:
      psk: <secrets.global-psk>

caches:
  - name: regional-data
    type: cluster
    cluster:
      name: regional-cache

http:
  routing:
    enabled: true
    type: cluster
    cluster:
      name: global-routing
Each cluster uses its own ports, discovery method, and PSK. The cache references the regional cluster while HTTP routing references the global cluster. You can mix and match any combination of services across any number of clusters.

Discovery Methods

The Orchestrator supports two methods for cluster members to find each other.

Static Discovery

Static discovery lists known node addresses directly. Use this when cluster members have stable, known IP addresses or hostnames.
clusters:
  - name: my-cluster
    addresses:
      membership: /ip4/0.0.0.0/tcp/9450
      data: /ip4/0.0.0.0/tcp/9451
    discovery:
      method: static
      static:
        nodes:
          - endpoints:
              membership: /ip4/10.0.0.1/tcp/9450
          - endpoints:
              membership: /ip4/10.0.0.2/tcp/9450
          - endpoints:
              membership: /ip4/10.0.0.3/tcp/9450
    encryption:
      psk: <secrets.cluster-psk>
Each node in the static.nodes array provides the membership endpoint of a peer. The current node’s own address can be included — the gossip protocol ignores self-connections.

DNS SRV Discovery

DNS SRV discovery queries DNS SRV records to find cluster members dynamically. Use this in container orchestration environments (Kubernetes, ECS) where node addresses change.
clusters:
  - name: my-cluster
    addresses:
      membership: /ip4/0.0.0.0/tcp/9450
      data: /ip4/0.0.0.0/tcp/9451
    discovery:
      method: srvdns
      srvDNS:
        dnsAddress: "10.0.0.53:53"
        pollInterval: "30s"
        names:
          - "_maverics-membership._tcp.orchestrator.internal"
    encryption:
      psk: <secrets.cluster-psk>
The pollInterval controls how frequently the Orchestrator queries DNS for updated membership records. Shorter intervals detect new nodes faster but increase DNS load.

Cluster-Backed Services

Once a cluster is defined, other configuration sections reference it by name. Each service can point to a different cluster, giving you fine-grained control over what data flows where.

Session Store

Share user sessions across all nodes so any node can handle any user — no sticky sessions or load balancer affinity required:
session:
  store:
    type: cluster
    cluster:
      name: my-cluster
See Cluster Sessions for the full session configuration reference.

Cache

Use a cluster as a distributed key-value store, where cached data propagates automatically across nodes — no external cache infrastructure needed:
caches:
  - name: my-cache
    type: cluster
    cluster:
      name: my-cluster
See Cluster Cache for configuration details.

HTTP Routing

Route requests internally across cluster nodes, eliminating the need for sticky sessions on your load balancer:
http:
  routing:
    enabled: true
    type: cluster
    cluster:
      name: my-cluster
See HTTP Server Configuration for the full HTTP server configuration reference.

Event Broadcast

The clustering substrate includes built-in pub/sub and broadcast primitives designed for propagating events across nodes in real time. As these capabilities are integrated into more Orchestrator workflows, they will enable scenarios like global session revocation and real-time policy enforcement — where an action taken on one node is enforced across the entire cluster within seconds.

Field Reference

| Key | Type | Default | Required | Description |
| --- | --- | --- | --- | --- |
| clusters[].name | string | — | Yes | Cluster name, referenced by session store, cache, and routing configuration |
| clusters[].disabled | boolean | false | No | Disable the cluster without removing its configuration |
| clusters[].addresses.membership | string (multiaddr) | — | Yes | Gossip protocol bind address for cluster membership |
| clusters[].addresses.data | string (multiaddr) | — | Yes | Data plane bind address for session, cache, and routing traffic |
| clusters[].discovery.method | string | — | Yes | Discovery method: "static" or "srvdns" |
| clusters[].discovery.static.nodes[].endpoints.membership | string (multiaddr) | — | Yes (static) | Peer node membership address |
| clusters[].discovery.srvDNS.dnsAddress | string | — | No (srvdns) | DNS server address and port for SRV lookups |
| clusters[].discovery.srvDNS.pollInterval | duration string | — | No (srvdns) | How often to poll DNS for membership changes |
| clusters[].discovery.srvDNS.names | array of strings | — | Yes (srvdns) | SRV record names to query for peer discovery |
| clusters[].encryption.psk | string | — | Yes | Pre-shared key for establishing trust between cluster nodes (use a secret reference) |
| clusters[].encryption.nodeKey.file | string | — | No | File path for a persistent node key pair |
Addresses use multiaddr format. Both IPv4 and IPv6 are supported:
  • IPv4: /ip4/<address>/tcp/<port> (e.g., /ip4/10.0.0.1/tcp/9450)
  • IPv6: /ip6/<address>/tcp/<port> (e.g., /ip6/::1/tcp/9450 or /ip6/2001:db8::1/tcp/9450)