Clustering
Clustering enables a multi-node deployment against a shared repository. The content (database) and blob storage are shared, while each node keeps its own search index and temporary area. Nodes coordinate through database leases (locks) and a journal that replays transactions across nodes.
Configuration
Repository
    # <repository>/etc/repository.yml
    cluster:
      enabled: true
      nodeId: node-1 # optional; unique per node
Override these settings per node with the environment variable CMS_CLUSTER_NODE_ID, or with the framework properties org.mintjams.jcr.cluster.nodeId and org.mintjams.jcr.cluster.enabled. If nodeId is omitted, the host name is used (or a random id).
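The fallback chain for the node id can be pictured as a first-non-blank lookup. The following is a minimal sketch, not the repository's actual resolver: the class and method names are hypothetical, and the precedence between the environment variable and the framework property is an assumption, since the text above only says that both override the YAML value.

```java
import java.util.UUID;

// Hypothetical helper: illustrates one plausible resolution order for the node id.
final class NodeIdSketch {

    /**
     * Returns the first non-blank candidate, in the ASSUMED order:
     * environment variable, framework property, nodeId from repository.yml,
     * host name. Falls back to a random id when none is usable.
     */
    static String resolveNodeId(String envValue, String frameworkProperty,
                                String configuredNodeId, String hostName) {
        String[] candidates = { envValue, frameworkProperty, configuredNodeId, hostName };
        for (String candidate : candidates) {
            if (candidate != null && !candidate.trim().isEmpty()) {
                return candidate.trim();
            }
        }
        // No usable value anywhere: generate a random id (the "node-" prefix is illustrative).
        return "node-" + UUID.randomUUID();
    }
}
```

A caller might pass System.getenv("CMS_CLUSTER_NODE_ID") as the first argument and InetAddress.getLocalHost().getHostName() as the last.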
Workspace (shared storage)
    # <workspace>/etc/jcr/jcr.yml
    datasource:
      jdbcURL: jdbc:postgresql://db:5432/jcr_${workspace.name}
      username: jcr
      password: secret
      driverClassName: org.postgresql.Driver
    blobstore:
      type: fs
      directory: /mnt/shared/cms/blobs/${workspace.name}
    search:
      indexPath: /var/lib/cms/search/${workspace.name} # node-local fast storage
Variables such as ${repository.home}, ${workspace.name} and ${cluster.nodeId} are substituted. The search index is kept per node and rebuilt automatically from content if empty.
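To make the substitution concrete, here is a minimal sketch of how placeholders of this shape could be expanded. It is an illustration only: the class name is hypothetical, the values are supplied through a plain map, and leaving unknown placeholders untouched is an assumption rather than documented behavior.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: expands ${name} placeholders from a map of known values.
final class PlaceholderSketch {

    // Matches ${...} and captures the name between the braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    static String substitute(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            // ASSUMPTION: an unknown name is left in place instead of raising an error.
            String replacement = (value != null) ? value : matcher.group(0);
            // quoteReplacement keeps '$' and '\' in the value from being treated as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

With workspace.name set to web, the blob directory from the example configuration would expand to /mnt/shared/cms/blobs/web.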
Where persistent state lives
| State | Standalone (default) | Clustered |
|---|---|---|
| Content, ACLs, journal | embedded H2 | shared DB (e.g. PostgreSQL), one DB per workspace |
| Blobs (binaries) | local files | shared storage (NFS, etc.) |
| Full-text search index | local | node-local |
Files that must be identical on every node
The following "identity files" must be identical across all nodes (auto-generated on first boot; do not regenerate on the second and later nodes — copy them from the first):
- secrets/secret-key.yml (encryption key for stored secrets)
- etc/boot.id (repository identifier; used to derive keys for masked values)
- etc/idp-keystore.p12 / etc/sp-keystore.p12 (SAML keys)
- etc/idp.yml / etc/saml2.yml
The recommended approach is to put the repository directory on shared storage (so etc/ and secrets/ are shared automatically). The temporary directory (tmp/) is wiped at startup, so in a cluster it automatically uses tmp/nodes/<nodeId> and must not be shared.
Journal & coordination
Every transaction is recorded in a journal, and each node's poller (every 2 seconds) replays transactions from other nodes. This makes cache invalidation, index updates and OSGi events (Camel route redeployment, CMS events, SSE/GraphQL subscriptions) cluster-aware.
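The replay loop can be modeled in a few lines. This is a simplified in-memory sketch, not the actual implementation: the real journal lives in the shared database and its schema is not shown here, so the Entry fields, the class name, and the revision-based cursor are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one node's journal poller.
final class JournalPollerSketch {

    // ASSUMED shape of a journal record: a monotonically increasing revision plus the writing node.
    static final class Entry {
        final long revision;
        final String originNodeId;
        final String description;

        Entry(long revision, String originNodeId, String description) {
            this.revision = revision;
            this.originNodeId = originNodeId;
            this.description = description;
        }
    }

    private final String localNodeId;
    private long lastSeenRevision = 0;

    JournalPollerSketch(String localNodeId) {
        this.localNodeId = localNodeId;
    }

    /** One polling pass: returns the entries to replay locally and advances the cursor. */
    List<Entry> poll(List<Entry> journal) {
        long cursor = lastSeenRevision;
        List<Entry> toReplay = new ArrayList<>();
        for (Entry entry : journal) {
            if (entry.revision <= cursor) {
                continue; // already handled in an earlier pass
            }
            lastSeenRevision = Math.max(lastSeenRevision, entry.revision);
            // Transactions written by this node were applied locally when they committed.
            if (!entry.originNodeId.equals(localNodeId)) {
                toReplay.add(entry);
            }
        }
        return toReplay;
    }
}
```

In a running node, a scheduler would call poll on the 2-second interval mentioned above and hand each returned entry to cache invalidation, index updates and event dispatch.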
Coordination tables are created automatically:
- jcr_cluster_nodes: node registry; refreshes last_heartbeat every 30s
- jcr_cluster_locks: lease locks (with TTL, so a crash never blocks indefinitely)
- jcr_cluster_signals: a signal bus for short-lived control notifications
Work that must run on a single node (workspace startup, blob cleanup, content deployment) is serialized with leases.
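The reason a TTL keeps a crashed node from blocking others can be seen in a small in-memory model of a lease table. This is a sketch under stated assumptions, not the schema or logic of jcr_cluster_locks: the class, its methods, and the rule that the current owner may renew its own lease are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a lease table with expiry.
final class LeaseTableSketch {

    private static final class Lease {
        final String owner;
        final long expiresAtMillis;

        Lease(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Lease> leases = new HashMap<>();

    /** Grants the lease if it is free, expired, or already held by the same owner (renewal). */
    synchronized boolean tryAcquire(String name, String owner, long ttlMillis, long nowMillis) {
        Lease current = leases.get(name);
        boolean free = current == null || current.expiresAtMillis <= nowMillis;
        if (free || current.owner.equals(owner)) {
            leases.put(name, new Lease(owner, nowMillis + ttlMillis));
            return true;
        }
        return false;
    }

    /** Releases the lease early; only the current owner can do so. */
    synchronized void release(String name, String owner) {
        Lease current = leases.get(name);
        if (current != null && current.owner.equals(owner)) {
            leases.remove(name);
        }
    }
}
```

If node-1 takes a lease and then crashes without releasing it, node-2 is refused only until the TTL elapses, after which its next tryAcquire succeeds.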
Procedure (overview)
- Provision a PostgreSQL database per workspace for JCR (and one for BPM if used)
- Install the PostgreSQL JDBC driver bundle into Felix
- Put the repository directory on shared storage (at minimum, share blobstore.directory across nodes)
- Configure each workspace's jcr.yml#datasource (and bpm.yml#jdbcURL if needed) identically on all nodes
- Share the identity files across nodes (on first boot, start a single node alone)
- Enable cluster.enabled and give each node a unique nodeId. Keep node clocks NTP-synchronized
- Place the nodes behind a load balancer (sticky sessions recommended)
Coordination from application code
The script API can run a piece of work on exactly one node in the cluster.
    def lease = cluster.tryLock("nightly-report", 600000)
    if (lease != null) {
        try {
            // ... runs on exactly one node ...
        } finally {
            lease.close()
        }
    }
cluster.isClusterEnabled(), cluster.nodeId and cluster.listMembers() are also available. In standalone mode the lock is granted immediately and the same code runs unchanged.
Monitoring
Use the GraphQL cluster query (admin), or the Cluster card in the Dashboard Operations section, to review each node's heartbeat (liveness). A node silent for three intervals (~90s) is logged as a warning.
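The liveness rule amounts to comparing the age of the last heartbeat with three heartbeat intervals. The following minimal sketch shows that arithmetic; the class and method names are hypothetical, and treating exactly 90 seconds as already silent is an assumption.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical check mirroring the "silent for three intervals" rule.
final class HeartbeatSketch {

    static final Duration HEARTBEAT_INTERVAL = Duration.ofSeconds(30);
    static final int MISSED_INTERVALS = 3;

    /** True when the last heartbeat is at least three intervals (90s) old. */
    static boolean isSilent(Instant lastHeartbeat, Instant now) {
        Duration threshold = HEARTBEAT_INTERVAL.multipliedBy(MISSED_INTERVALS);
        return Duration.between(lastHeartbeat, now).compareTo(threshold) >= 0;
    }
}
```

A monitoring job could apply this check to each row's last_heartbeat value and log a warning for any node it flags.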
Cautions
- Clock skew breaks the stability window (10s). NTP synchronization is required.
- External databases and blob storage are not auto-managed. Cleaning up the DB/blobs after deleting a workspace, and clearing the DB before recreating one of the same name, are manual steps.
- The search index is per-node and not replicated (it rebuilds automatically when empty).