Scalability
This section provides guidance on scaling the Storage Service for larger deployments. All settings below refer to the Storage Service Helm chart values.
Sizing guideline
For a cluster of 100 GPUs, use 5 storage nodes. This ratio is a practical starting point; adjust it based on your workload (e.g. request rate, object size, and use of metadata or enumeration).
Set the number of replicas with replicaCount, and optionally with replicaMinCount and replicaScalingFactor. The effective count is max(replicaMinCount, ceil(replicaCount / replicaScalingFactor)). Monitor metrics (see Observability (Metrics, Traces, Logs)) and scale up or down as needed.
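The effective replica count formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the chart's actual template logic; the function name and the default values for replicaMinCount and replicaScalingFactor are assumptions made for the example.

```python
import math


def effective_replicas(replica_count, replica_min_count=1, replica_scaling_factor=1):
    """Compute max(replicaMinCount, ceil(replicaCount / replicaScalingFactor)).

    The defaults of 1 are assumptions for this sketch; check the chart's
    values file for the real defaults.
    """
    if replica_scaling_factor <= 0:
        raise ValueError("replicaScalingFactor must be positive")
    scaled = math.ceil(replica_count / replica_scaling_factor)
    return max(replica_min_count, scaled)


if __name__ == "__main__":
    # 5 replicas requested, no scaling: the effective count stays 5.
    print(effective_replicas(5))
    # A scaling factor of 2 reduces 5 to ceil(2.5) = 3.
    print(effective_replicas(5, replica_scaling_factor=2))
    # replicaMinCount acts as a floor: max(4, 3) = 4.
    print(effective_replicas(5, replica_min_count=4, replica_scaling_factor=2))
```

Running the formula by hand like this is a quick way to confirm how many pods a given combination of values is expected to produce before deploying.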
How the Storage Service scales
The Storage Service can be scaled by running multiple replicas. It maintains internal caches to improve performance of storage operations (e.g. metadata and listing). Whether scaling is appropriate depends on cache behavior and your client workloads:
- Without bucket notifications: Cache entries are invalidated after a time-to-live (TTL). You can scale by increasing replicaCount. Each pod has its own cache. TTL is configured per cache in the Helm values: config.smallObjectCache.timeToLive, config.statCache.timeToLive, and config.listCache.timeToLive (and config.listCache.enabled if you use the list cache). Stale reads are possible: clients may see slightly outdated data for non-version-specific (e.g. “latest”) objects until the TTL expires. If your client workloads can tolerate that (that is, they read “latest” and do not require immediate consistency), scaling with caches and TTL is fine. To reduce staleness, lower the TTL values or disable caches (config.smallObjectCache.enabled, config.statCache.enabled, config.listCache.enabled) at the cost of more backend calls.
- With bucket notifications enabled: Bucket notifications are enabled when config.storageEvents.sqs.enabled or config.storageEvents.azureServiceBus.enabled is true. The service then invalidates caches when it receives storage events (e.g. object created or deleted) from the configured queue; config.statCache.invalidateOnUpdate and config.listCache.invalidateOnUpdate control whether those caches are invalidated on writes and on notification events. The current Helm chart does not support multiple pods all consuming from the same queue in a recommended way. When bucket notifications are enabled, run a single Storage Service instance: set replicaCount to 1 (and ensure replicaScalingFactor does not increase the effective replica count above 1).
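The two cases above reduce to one rule: if either notification source is enabled, the effective replica count should be 1. A minimal sketch of a pre-deployment check follows. It assumes the Helm values have already been loaded into a nested Python dict whose keys mirror the dotted paths named in this section; the helper names and the fallback defaults for missing keys are hypothetical and are not part of the chart.

```python
import math


def notifications_enabled(values):
    """True if config.storageEvents.sqs.enabled or
    config.storageEvents.azureServiceBus.enabled is set to true."""
    events = (values.get("config") or {}).get("storageEvents") or {}
    return any(
        bool((events.get(source) or {}).get("enabled", False))
        for source in ("sqs", "azureServiceBus")
    )


def effective_replicas(values):
    """max(replicaMinCount, ceil(replicaCount / replicaScalingFactor)).

    Missing keys fall back to 1; that default is an assumption of this sketch.
    """
    count = values.get("replicaCount", 1)
    minimum = values.get("replicaMinCount", 1)
    factor = values.get("replicaScalingFactor", 1)
    return max(minimum, math.ceil(count / factor))


def check_scaling(values):
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems = []
    replicas = effective_replicas(values)
    if notifications_enabled(values) and replicas > 1:
        problems.append(
            f"bucket notifications are enabled but the effective replica "
            f"count is {replicas}; run a single instance"
        )
    return problems


if __name__ == "__main__":
    values = {
        "replicaCount": 5,
        "config": {"storageEvents": {"sqs": {"enabled": True}}},
    }
    for problem in check_scaling(values):
        print(problem)
```

A check like this could run in CI against the values file before a release, so that a multi-replica configuration with notifications enabled is caught before it reaches a cluster.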
Summary
- Use 5 storage nodes for 100 GPUs as a starting point; set replicaCount (and related replica values) and tune from metrics.
- You can scale with multiple replicas when config.storageEvents.sqs.enabled and config.storageEvents.azureServiceBus.enabled are both false, with or without internal caches. With caches and TTL (config.smallObjectCache.timeToLive, config.statCache.timeToLive, config.listCache.timeToLive), clients may see stale reads for non-version-specific objects until TTL expiry; if workloads tolerate that (e.g. reading “latest”), this is acceptable.
- With notifications enabled (either config.storageEvents.sqs.enabled or config.storageEvents.azureServiceBus.enabled set to true), run a single instance (replicaCount = 1).