Almost all aspects of the Enterprise Nucleus Server stack are documented via the `.env` file included in the stack tarball we provide. We try to keep our documentation on settings and options as close to the "code" as possible. That file should be self-explanatory, with comments describing what each setting does. This document should be considered an addendum to the information in the `.env` file.
Monitoring your instances is imperative to understanding the general health of the system and whether more resources are necessary. At a minimum, one should monitor CPU utilization and load average (LA).
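As a quick illustration, the load averages on a Linux host can be read directly from `/proc/loadavg` (Linux-specific; most monitoring agents ultimately read the same source):

```shell
#!/bin/sh
# Read the 1-, 5-, and 15-minute load averages on Linux.
# /proc/loadavg fields: 1min 5min 15min running/total last_pid
read -r la1 la5 la15 _ < /proc/loadavg
echo "load averages: 1m=$la1 5m=$la5 15m=$la15"
```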
Additionally, the Nucleus stack itself exposes quite a few metrics about its load characteristics (such as the number of requests per user, per request type, etc.). We recommend taking advantage of these metrics. We expose them in a Prometheus-consumable format; as usual, the port for scraping metrics can be found in the stack's `.env` file.
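For example, a minimal Prometheus scrape job for a Nucleus host might look like the following fragment. The hostname and port here are placeholders; substitute the metrics port configured in your stack's `.env` file:

```yaml
# prometheus.yml (fragment) - hypothetical target, substitute your own
scrape_configs:
  - job_name: "nucleus"
    static_configs:
      - targets: ["nucleus-host-1:9090"]   # metrics port from the stack's .env
```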
Data Management and Backups
The Nucleus data directory contains multiple subdirectories utilized by various Nucleus components:

- `data` subdir contains core data: the actual data residing in Nucleus (elements of Nucleus's file tree: their content and metadata such as ACLs and timestamps) uploaded by its users
- `accounts-db` and `tags-db` are the Authentication and Tagging services' databases, respectively
- `log` subdir contains log files
- `tmp` contains internal cache and scratch space
Core Nucleus Data
The `data` directory with the actual data hosted by Nucleus is opaque and must not be changed or modified externally.

If making a copy of this directory (for migration to another machine, for example), the Nucleus stack must be stopped. This bears repeating: copying the core data directory "hot" cannot be done safely.
If backups of core data are desired, the nucleus-tools package contains the necessary tooling to create and restore copies of core data.
Services Data Dirs
These include the accounts and tags databases. They can be safely copied and backed up "hot" (while the respective services are running).
Logs

Logs are text files and should be self-evident, with one major note: live logs (files that are being appended to by services) are generally not externally rotatable. However, our stack includes rotation and archival sidecars, and you can certainly delete archives with no ill effects (aside from losing that log data, of course). Logs can be copied without stopping services.
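Because logs can be copied hot, a periodic copy is a simple way to preserve them. A minimal sketch, where the paths are placeholders standing in for the stack's log subdirectory and a backup target:

```shell
#!/bin/sh
# Sketch: copy log files while services keep running.
# SRC stands in for the Nucleus log subdir; DEST is a backup target.
SRC=${SRC:-./demo-log}
DEST=${DEST:-./log-backup}
mkdir -p "$SRC" "$DEST"
echo "demo entry" > "$SRC/service.log"   # stand-in for a live log file
cp -a "$SRC/." "$DEST/"                  # safe while files are being appended to
ls "$DEST"
```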
Scratch, Temp, Cache Data
Data located in the internal caches and scratch spaces of Nucleus does not require backups and can be deleted without ill effects, but only when the stack is not running.
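The cleanup itself is a plain recursive delete of the cache and scratch contents; the critical part is performing it only while the stack is down. A sketch using a placeholder directory in place of the Nucleus `tmp` subdir:

```shell
#!/bin/sh
# Sketch: clear scratch/cache contents (ONLY with the stack stopped).
TMPDIR_DEMO=${TMPDIR_DEMO:-./demo-tmp}   # stands in for the Nucleus tmp subdir
mkdir -p "$TMPDIR_DEMO/cache"
echo scratch > "$TMPDIR_DEMO/cache/blob"
# In production: stop the stack first, then delete, then start it again.
rm -rf "$TMPDIR_DEMO"/*                  # delete cache and scratch contents
ls -A "$TMPDIR_DEMO"
```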
Migration and Upgrades: Methodology
We do not support data migration from pre-2021.2.0 versions to 2021.2.0.
When moving between servers, a method we find convenient is the so-called blue-green approach, where a new instance is brought up alongside the old one and validated prior to switching users over to it.

Here's a helpful recipe for server migration with as little downtime as possible (we use short hostnames here for clarity; in practice, we recommend using FQDNs when deploying Nucleus):
Suppose Nucleus is deployed on `nucleus-host-1`, and we desire to migrate to `nucleus-host-2`. The DNS CNAME users utilize when accessing this instance is `my-nucleus`, and currently it resolves to `nucleus-host-1`.

- `nucleus-host-2` is brought up and the entire data directory is rsync'd over from `nucleus-host-1`. Note that while hot copies are not supported (see above), we can do this to transfer the bulk of the data before shutting down the source instance. Also, there's a high degree of probability that this copy will not be corrupt enough to preclude Nucleus from starting.
- The Nucleus stack is configured and launched on `nucleus-host-2`. Data upgrade is performed, if required. The instance is then validated to be operational.
- Downtime of `my-nucleus` is scheduled, and users are notified.
- Before the scheduled downtime window, `nucleus-host-1`'s data is rsync'd to `nucleus-host-2` again. This will not be the final sync, but it will bring the two servers' data very close in state.
- During the downtime:
  - `nucleus-host-1` is shut down
  - `nucleus-host-2` is shut down
  - Data is rsync'd again. This will be very quick (seconds, even for terabytes of data)
  - Data upgrade is performed (because rsync will "revert" the upgrade that was made during the initial test deployment)
  - `nucleus-host-2` is brought up and quickly validated
  - `my-nucleus` is updated to resolve to `nucleus-host-2`