Administration
General Notes
Almost all aspects of the Enterprise Nucleus Server stack are documented via its settings (.env) file, included in the stack tarball we provide.
We try to keep our documentation on settings and options as close to the “code” as possible here.
That file should be self-explanatory, with each setting accompanied by comments describing what it does.
This document should be considered an addendum to the information in the .env file.
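As a rough illustration of reading such settings programmatically, here is a minimal sketch. It assumes the common KEY=VALUE layout with # comment lines and does not handle trailing inline comments. The key names in the sample (DATA_ROOT, METRICS_PORT) are hypothetical placeholders, not names taken from the actual .env file.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comment lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; skip it
        # Drop surrounding whitespace and simple quoting around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


# Hypothetical sample; the real key names are documented in the file itself.
sample = """
# Where Nucleus keeps its data (hypothetical key)
DATA_ROOT=/path/to/nucleus-data
METRICS_PORT=9999
"""
print(parse_env(sample))
```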
Monitoring
Monitoring your instances is imperative to understanding the general health of the system and whether more resources are necessary. At a minimum, one should monitor:
Disk space
CPU and load average (LA)
Memory
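The three items above can be sampled with the Python standard library alone. The following is a minimal sketch, assuming a Linux host (it reads /proc/meminfo and uses os.getloadavg). The function names are illustrative, and a real deployment would more likely rely on an existing monitoring agent.

```python
import os
import shutil


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def host_snapshot(data_dir, meminfo_path="/proc/meminfo"):
    """Collect disk, load, and memory figures for the host (Linux only)."""
    disk = shutil.disk_usage(data_dir)
    _, load5, _ = os.getloadavg()
    with open(meminfo_path) as fh:
        mem = parse_meminfo(fh.read())
    return {
        "disk_used_fraction": disk.used / disk.total,
        # Load average normalized by CPU count; sustained values above 1.0
        # suggest the host is saturated.
        "load_per_cpu_5min": load5 / (os.cpu_count() or 1),
        "mem_available_fraction": mem["MemAvailable"] / mem["MemTotal"],
    }
```

Calling host_snapshot with the path of the Nucleus Data Directory returns a dictionary that can be compared against whatever thresholds suit the deployment.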
Additionally, the Nucleus stack itself exposes a number of metrics about its load characteristics (such as the number of requests per user, per request type, etc.).
We recommend taking advantage of these metrics. They are exposed in a form consumable by Prometheus. As usual, the port for scraping metrics can be found in the stack’s .env file.
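For a quick look at these metrics without a full Prometheus setup, the endpoint can be fetched and parsed directly. The sketch below handles the Prometheus text exposition format in a simplified way (one sample per line, optional labels, optional timestamp). The URL passed to scrape is an assumption (host, port, and path all depend on the deployment), and the metric names in the sample are made up for illustration.

```python
import urllib.request


def parse_samples(text):
    """Return (series, value) pairs from Prometheus text exposition output."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        # The series (name plus optional {labels}) ends at the closing brace,
        # or at the first space when there are no labels.
        if "}" in line:
            cut = line.rfind("}") + 1
        else:
            cut = line.find(" ")
        if cut <= 0:
            continue  # malformed line
        series, rest = line[:cut], line[cut:].split()
        if rest:
            # rest[1], when present, is a timestamp and is ignored here.
            samples.append((series, float(rest[0])))
    return samples


def scrape(url, timeout=5.0):
    """Fetch a metrics endpoint and parse it; the URL is deployment-specific."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_samples(resp.read().decode("utf-8"))


# Made-up sample output, for illustration only.
sample = '# HELP requests_total Example counter\nrequests_total{user="alice",type="read"} 42\nup 1 1700000000000\n'
print(parse_samples(sample))
```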
Data Management and Backups
The Nucleus Data Directory contains multiple sub-directories used by various Nucleus components:
data subdir contains core data: the actual data uploaded to Nucleus by its users (elements of Nucleus’s file tree, with their content and metadata such as ACLs and timestamps)
local-accounts-db and tags-db are the Authentication and Tagging services’ databases, respectively
log subdir contains log files
scratch, resolver-cache, and tmp contain internal cache and scratch spaces
Core Nucleus Data
The data directory, which holds the actual data hosted by Nucleus, is opaque and should not be changed or modified externally.
If making a copy of this directory (for migration to another machine, for example), the Nucleus stack must be stopped. This bears repeating: copying the Core Data directory “hot” cannot be done safely.
If backups of Core Data are desired, the nucleus-tools package contains the necessary tooling to create and restore copies of Core Data.
Services Data Dirs
These include the accounts and tags databases. They can be safely copied and backed up “hot” (while the respective services are running).
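A hot backup of these two directories can be as simple as a timestamped copy. The following is a minimal sketch using only the standard library. The directory names come from the data directory layout described above, while the function name, the backup location, and the timestamp format are illustrative assumptions.

```python
import shutil
import time
from pathlib import Path

# Sub-directory names from the Nucleus Data Directory layout.
SERVICE_DB_DIRS = ("local-accounts-db", "tags-db")


def backup_service_dbs(data_root, backup_root, stamp=None):
    """Copy the service database directories into backup_root/<stamp>/."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / stamp
    copied = []
    for name in SERVICE_DB_DIRS:
        src = Path(data_root) / name
        if src.is_dir():
            # copytree creates the destination, including missing parents.
            shutil.copytree(src, target / name)
            copied.append(target / name)
    return copied
```

This approach applies only to the services' databases; the core data directory must not be copied this way while the stack is running.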
Logs
Logs are text files and should be self-explanatory, with one major caveat: live logs (files currently being appended to by services) generally cannot be rotated externally. However, our stack includes rotation and archival sidecars, and archived logs can be deleted with no ill effects (aside from losing the log data, of course). Logs can be copied without stopping services.
Scratch, Temp, Cache Data
Data located in internal caches and scratch spaces of Nucleus does not require backups, and can be deleted without ill effects, but only when the stack is not running.
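The handling rules from this section can be summarized as two small predicates. This is only a sketch of the rules as stated above, with illustrative function names. It does not model log archives (which can be deleted at any time), and for core data the supported backup route remains nucleus-tools rather than a plain copy.

```python
HOT_COPY_OK = ("local-accounts-db", "tags-db", "log")
DISPOSABLE = ("scratch", "resolver-cache", "tmp")


def may_copy(subdir, stack_running):
    """True if the sub-directory may be copied in the given stack state."""
    if subdir in HOT_COPY_OK:
        return True  # safe to copy "hot"
    if subdir == "data":
        return not stack_running  # core data: the stack must be stopped
    raise ValueError(f"no copy guidance for {subdir!r}")


def may_delete(subdir, stack_running):
    """True if the sub-directory's contents may be deleted in the given state."""
    if subdir in DISPOSABLE:
        return not stack_running  # caches and scratch: only when stopped
    return False


for name in ("data", "tags-db", "log"):
    print(name, "copy while running:", may_copy(name, stack_running=True))
```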
Migration and Upgrades: Methodology
We do not support data migration from pre-2021.2.0 versions to 2021.2.0.
Nucleus 2021.2.0
When moving between servers, a method we find convenient is the so-called blue-green approach, where a new instance is brought up alongside the old one and validated prior to switching users over to it.
Here’s a helpful recipe for server migration with as little downtime as possible (short hostnames are used here for clarity; in practice, we recommend using FQDNs when deploying Nucleus):
Suppose Nucleus is deployed on nucleus-host-1, and we wish to migrate to nucleus-host-2. The DNS CNAME that users use to access this instance is my-nucleus, and it currently resolves to nucleus-host-1.
nucleus-host-2 is brought up and the entire data directory is rsync’d from nucleus-host-1. Note that while hot copies are not supported (see above), we can do this to transfer the bulk of the data before shutting down the source instance. Also, it is highly probable that this copy will not be corrupt enough to prevent Nucleus from starting.
Nucleus Stack is configured and launched on nucleus-host-2. Data upgrade is performed, if required. The instance is then validated to be operational.
Downtime for my-nucleus is scheduled, and users are notified.
Before the scheduled downtime window, nucleus-host-1’s data is rsync’d to nucleus-host-2 again. This will not be the final sync, but it will bring the two servers’ data very close in state.
During the downtime:
Nucleus on nucleus-host-1 is shut down
Nucleus on nucleus-host-2 is shut down
Data is rsync’d again. This will be very quick (seconds for terabytes of data)
Data upgrade is performed (because rsync will “revert” the upgrade that was made when doing the initial test deployment)
Nucleus on nucleus-host-2 is brought up and quickly validated
DNS CNAME my-nucleus is updated to resolve to nucleus-host-2
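The recipe above can be laid out as an ordered plan with three rsync passes. The sketch below only builds and prints the plan; it does not run anything. The data directory path is a placeholder, the rsync options shown (-a and --delete) are one reasonable choice rather than a prescribed set, and the start, stop, and upgrade steps are left as descriptions because their exact commands depend on the deployment.

```python
import shlex


def rsync_command(source_host, data_dir):
    """Build an rsync invocation, run on the destination, mirroring data_dir."""
    path = data_dir.rstrip("/") + "/"  # trailing slash: copy directory contents
    # -a: archive mode (recursive, preserves permissions and timestamps).
    # --delete: remove files that no longer exist on the source, so each
    # pass converges on the source's current state.
    return ["rsync", "-a", "--delete", f"{source_host}:{path}", path]


def migration_plan(source, dest, cname, data_dir):
    """Return the migration steps as (phase, action) pairs, in order."""
    sync = f"on {dest}: " + shlex.join(rsync_command(source, data_dir))
    return [
        ("before downtime", sync + "  (bulk copy, source still running)"),
        ("before downtime", f"configure and launch Nucleus on {dest}, upgrade data if required, validate"),
        ("before downtime", f"schedule downtime for {cname} and notify users"),
        ("before downtime", sync + "  (catch-up sync)"),
        ("downtime", f"shut down Nucleus on {source}"),
        ("downtime", f"shut down Nucleus on {dest}"),
        ("downtime", sync + "  (final sync)"),
        ("downtime", f"perform data upgrade on {dest}"),
        ("downtime", f"bring up Nucleus on {dest} and validate"),
        ("downtime", f"update DNS CNAME {cname} to resolve to {dest}"),
    ]


# "/path/to/nucleus-data" is a placeholder for the real data directory.
plan = migration_plan("nucleus-host-1", "nucleus-host-2", "my-nucleus", "/path/to/nucleus-data")
for phase, action in plan:
    print(f"[{phase}] {action}")
```

Only the final pass runs with both instances stopped, which is why it is the only copy that can be relied on as consistent.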
Authentication
Covered in the Authentication and User Registration document.