Enterprise

General Notes

Almost all aspects of the Enterprise Nucleus Server stack are documented in its settings (.env) file, which is included in the stack tarball we provide.

We try to keep the documentation of settings and options as close to the “code” as possible.

That file should be self-explanatory: each setting is accompanied by comments describing what it does.

This document should be considered an addendum to the information in the .env file.

Monitoring

Monitoring your instances is essential to understanding the general health of the system and whether more resources are needed. At a minimum, monitor:

  • Disk space

  • CPU and load average (LA)

  • Memory
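As a minimal sketch, the three figures above can be collected with Python's standard library on a Linux host. The function name, the path being checked, and the 90% disk threshold are illustrative assumptions, not part of the Nucleus stack.

```python
import os
import shutil


def host_health(path="/", disk_warn_fraction=0.9):
    """Return basic disk, load, and memory figures for this host."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    load1, _load5, load15 = os.getloadavg()  # Unix only
    cpus = os.cpu_count() or 1

    report = {
        "disk_used_fraction": usage.used / usage.total,
        # Load average divided by CPU count; values that stay well above 1.0
        # may indicate that the host is overloaded.
        "load_per_cpu_1m": load1 / cpus,
        "load_per_cpu_15m": load15 / cpus,
        "mem_available_fraction": None,
    }

    # /proc/meminfo is Linux-specific; the memory figure stays None elsewhere.
    try:
        with open("/proc/meminfo") as f:
            fields = dict(line.split(":", 1) for line in f)
        total_kb = int(fields["MemTotal"].split()[0])
        available_kb = int(fields["MemAvailable"].split()[0])
        report["mem_available_fraction"] = available_kb / total_kb
    except (OSError, KeyError, ValueError):
        pass

    report["disk_warning"] = report["disk_used_fraction"] >= disk_warn_fraction
    return report
```

In practice you would pass the location of the Nucleus Data Directory as path, so that the disk figure reflects the volume that actually holds the data.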

Additionally, the Nucleus stack itself exposes quite a few metrics about its load characteristics (such as the number of requests per user, per request type, etc.).

We recommend taking advantage of these metrics, which are exposed in a form consumable by Prometheus. As usual, the port for scraping metrics can be found in the stack’s .env file.
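For a quick look at such metrics without a full Prometheus setup, the sketch below parses the Prometheus text exposition format and sums values per metric name. The metric names in SAMPLE are hypothetical, and the URL passed to fetch_metrics is an assumption: take the host and port from your own .env file.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition format into (name, labels, value) tuples.

    Labels are kept as the raw text between the braces; this sketch does not
    decode individual label pairs.
    """
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "{" in line:
            name, rest = line.split("{", 1)
            labels, tail = rest.rsplit("}", 1)
        else:
            name, tail = line.split(None, 1)
            labels = ""
        # The value is the first token after the name and labels; an optional
        # timestamp may follow it.
        samples.append((name.strip(), labels, float(tail.split()[0])))
    return samples


def totals_by_name(samples):
    """Sum sample values per metric name, across all label combinations."""
    totals = {}
    for name, _labels, value in samples:
        totals[name] = totals.get(name, 0.0) + value
    return totals


def fetch_metrics(url, timeout=5):
    """Fetch a metrics page; the caller supplies the URL (host and port)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Hypothetical sample; real metric names come from the stack itself.
SAMPLE = """\
# HELP example_requests_total Hypothetical request counter
# TYPE example_requests_total counter
example_requests_total{user="alice",type="read"} 12
example_requests_total{user="bob",type="write"} 3
example_up 1
"""
```

Calling totals_by_name(parse_metrics(SAMPLE)) yields one total per metric name. For ongoing monitoring, a Prometheus server scraping the same endpoint remains the intended consumer.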

Data Management and Backups

The Nucleus Data Directory contains multiple subdirectories used by various Nucleus components:

  • The data subdir contains core data: the actual data uploaded by users and residing in Nucleus, that is, the elements of Nucleus’s file tree with their content and metadata (ACLs, timestamps, etc.)

  • local-accounts-db and tags-db are the databases of the Authentication and Tagging services, respectively

  • The log subdir contains log files

  • scratch, resolver-cache, and tmp contain internal caches and scratch spaces

Core Nucleus Data

The data directory, which holds the actual data hosted by Nucleus, is opaque and should not be changed or modified externally.

If you make a copy of this directory (for migration to another machine, for example), the Nucleus stack must be stopped. This bears repeating: copying the Core Data directory “hot” cannot be done safely.

If backups of Core Data are desired, the nucleus-tools package contains the necessary tooling to create and restore copies of Core Data.

Services Data Dirs

These include the accounts and tags databases. They can be safely copied and backed up “hot” (while the respective services are running).
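One way to take such a hot copy is to archive the two database directories into a timestamped tarball. This is a sketch only: the function name, archive naming scheme, and destination directory are assumptions, and only the subdirectory names come from the list above.

```python
import tarfile
import time
from pathlib import Path

# Subdirectories of the Nucleus Data Directory that may be copied "hot".
SERVICE_DB_DIRS = ("local-accounts-db", "tags-db")


def snapshot_service_dbs(data_root, dest_dir, stamp=None):
    """Archive the service database directories into one .tar.gz file."""
    data_root = Path(data_root)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"service-dbs-{stamp}.tar.gz"
    with tarfile.open(str(archive), "w:gz") as tar:
        for name in SERVICE_DB_DIRS:
            src = data_root / name
            if src.is_dir():
                # arcname keeps paths in the archive relative to the data root.
                tar.add(str(src), arcname=name)
    return archive
```

The core data subdir is deliberately left out of SERVICE_DB_DIRS, since it must not be copied while the stack is running.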

Logs

Logs are plain text files and should be self-explanatory, with one major note: live logs (files that services are still appending to) generally cannot be rotated externally. However, the stack includes its own rotation and archival sidecars, and archived logs can be deleted with no ill effects (aside from losing the log data, of course). Logs can be copied without stopping services.

Scratch, Temp, Cache Data

Data in Nucleus’s internal caches and scratch spaces does not require backups and can be deleted without ill effects, but only while the stack is not running.
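The per-directory rules in this section can be summarized as a small lookup table. The sketch below encodes them as described above; the category labels, function name, and wording of the returned actions are illustrative assumptions.

```python
from pathlib import Path

CORE = "core"
SERVICE_DB = "service-db"
LOGS = "logs"
CACHE = "cache"

# Subdirectory names as listed in this document.
KIND_BY_SUBDIR = {
    "data": CORE,
    "local-accounts-db": SERVICE_DB,
    "tags-db": SERVICE_DB,
    "log": LOGS,
    "scratch": CACHE,
    "resolver-cache": CACHE,
    "tmp": CACHE,
}


def backup_action(subdir, stack_running):
    """Return a human-readable recommendation for one subdirectory."""
    kind = KIND_BY_SUBDIR.get(subdir)
    if kind == CORE:
        if stack_running:
            return "do not copy directly; use nucleus-tools for backups"
        return "safe to copy (stack stopped)"
    if kind in (SERVICE_DB, LOGS):
        return "safe to copy"
    if kind == CACHE:
        if stack_running:
            return "skip; do not delete while the stack is running"
        return "skip; may be deleted"
    return "unknown subdir; leave untouched"


def backup_plan(data_root, stack_running):
    """Map each subdirectory found under data_root to a recommendation."""
    return {
        p.name: backup_action(p.name, stack_running)
        for p in sorted(Path(data_root).iterdir())
        if p.is_dir()
    }
```

Anything not listed in KIND_BY_SUBDIR is reported as unknown rather than guessed at, which is the conservative choice for a directory the document calls opaque.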

Migration and Upgrades: Methodology

We do not support data migration from pre-2021.2.0 versions to 2021.2.0.

Nucleus 2021.2.0

When moving between servers, we find the so-called blue-green approach convenient: a new instance is brought up alongside the old one and validated before users are switched over to it.

Here’s a helpful recipe for server migration with as little downtime as possible (short hostnames are used here for clarity; in practice, we recommend using FQDNs when deploying Nucleus):

  • Suppose Nucleus is deployed on nucleus-host-1, and we want to migrate to nucleus-host-2. Users access this instance through the DNS CNAME my-nucleus, which currently resolves to nucleus-host-1.

  • nucleus-host-2 is brought up and the entire data directory is rsync’d from nucleus-host-1. Note that while hot copies are not supported (see above), we can use one here to transfer the bulk of the data before shutting down the source instance. Also, it is highly probable that this copy will not be corrupt enough to prevent Nucleus from starting.

  • The Nucleus stack is configured and launched on nucleus-host-2. A data upgrade is performed, if required. The new instance is then validated to be operational.

  • Downtime for my-nucleus is scheduled, and users are notified.

  • Shortly before the scheduled downtime window, nucleus-host-1’s data is rsync’d to nucleus-host-2 again. This will not be the final sync, but it will bring the two servers’ data very close in state.

  • During the downtime:

    • Nucleus on nucleus-host-1 is shut down

    • Nucleus on nucleus-host-2 is shut down

    • Data is rsync’d again. This will be very quick (seconds for terabytes of data), because only changes made since the previous sync need to be transferred

    • The data upgrade is performed again (because the rsync will “revert” the upgrade that was made during the initial test deployment)

    • Nucleus on nucleus-host-2 is brought up and quickly validated

    • DNS CNAME my-nucleus is updated to resolve to nucleus-host-2
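The recipe above can be written down as an ordered, reviewable plan before anything is executed. The sketch below only builds strings and runs nothing. The rsync flags shown (-a and --delete) are one reasonable choice rather than a documented requirement, the data directory path is a placeholder, and the stop, start, upgrade, and DNS steps are left as plain descriptions because the actual commands depend on your deployment.

```python
import shlex


def rsync_command(src_host, data_dir):
    """Build an rsync pull command, meant to be run on the destination host."""
    path = data_dir.rstrip("/")
    # -a is archive mode (recursive, preserving permissions, timestamps, and
    # symlinks); --delete removes destination files that no longer exist on
    # the source. Trailing slashes copy the directory's contents rather than
    # the directory itself.
    return shlex.join(
        ["rsync", "-a", "--delete", f"{src_host}:{path}/", f"{path}/"]
    )


def migration_plan(old, new, cname, data_dir):
    """Return the migration steps as ordered (phase, action) pairs."""
    sync = rsync_command(old, data_dir)
    return [
        ("prepare", f"{new}: {sync}"),
        ("prepare", f"{new}: configure and launch the stack, "
                    "upgrade data if required, validate"),
        ("prepare", f"schedule downtime for {cname} and notify users"),
        ("pre-downtime", f"{new}: {sync}"),
        ("downtime", f"{old}: shut down Nucleus"),
        ("downtime", f"{new}: shut down Nucleus"),
        ("downtime", f"{new}: {sync}"),
        ("downtime", f"{new}: perform the data upgrade again"),
        ("downtime", f"{new}: bring Nucleus up and validate"),
        ("downtime", f"DNS: point {cname} at {new}"),
    ]
```

For example, migration_plan("nucleus-host-1", "nucleus-host-2", "my-nucleus", "/srv/nucleus-data") returns ten steps, with the final sync placed after both instances are shut down and before the data upgrade.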

Authentication

Covered in the Authentication and User Registration document.

Omniverse Navigator

Covered in the Omniverse Navigator document.