
CI/CD Pipelines#

Overview#

This page describes how to implement a CI/CD pipeline, independent of any particular version-control platform, that can:

  • build and test software

  • publish container images

  • deploy to a managed serving platform

  • register deployments to the Portal Sample or other catalog (optional)

The CI pipeline follows a “1 build → N deployments” pattern, using artifacts and environment variables to chain stages, while emphasizing security, efficiency, and portability across CI/CD tools.

Prerequisites#

Before implementing the pipeline, make sure the following prerequisites are met:

  • Tooling: a container runtime, shell utilities, and a CI/CD system that supports stages, artifacts, and secret variables.

  • Access and credentials: registry access (e.g., NGC), a deployment API token for the serving platform, and (optionally) a portal/catalog API token.

  • Local development (optional): the NGC CLI and Docker installed to validate the flow locally, plus a way to manage local environment variables.

Note

The instructions below provide a platform-agnostic example of how to integrate your CI/CD pipeline with NGC. Adapt the API calls, authentication, and deployment descriptors to your organization’s serving platform and portal APIs.

Pipeline Setup#

The essential stages for Kit apps include:

  • Build - Compile the application and package it. Produce a container image, or distributable file, and capture the version metadata.

  • Publish - Push the built image to your container registry. Emit machine-readable outputs (image references) for later stages.

  • Deploy - Create or update a running service/function using the published image and a declarative configuration. Wait until healthy/ready.

  • Register (optional) - Publish deployment details (IDs, versions, URLs) to the Portal Sample or catalog for discovery.

Artifacts and Environment Variables#

Persist both JSON and dotenv artifacts to pass structured data between stages (for example: image references, version, function IDs).

Examples:

  • containers map: _build/containers/published_containers.json, _build/containers/containers.env

  • deployments list: _build/deployments/deployments.json, _build/deployments/deployments.env

  • portal sample registrations: _build/portal/registrations.json, _build/portal/portal.env
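
As a minimal sketch of how a downstream stage might consume these artifacts, the snippet below writes a sample dotenv file and loads it with `set -a`, so every assignment in the file is exported. The temporary directory, the `load_dotenv` helper name, and the sample value are illustrative assumptions; in a real pipeline the file would arrive through the CI/CD tool's artifact mechanism.

```shell
# Sketch: load a dotenv artifact produced by an upstream stage.
# ARTIFACT_DIR is a throwaway stand-in for the job's artifact directory.
ARTIFACT_DIR=$(mktemp -d)
mkdir -p "${ARTIFACT_DIR}/_build/containers"
cat > "${ARTIFACT_DIR}/_build/containers/containers.env" <<'EOF'
CONTAINER_IMAGE_MY_APP=nvcr.io/my-org/my-app:1.2.3_ab12cd3
EOF

# load_dotenv FILE: export every KEY=VALUE assignment found in FILE.
# (load_dotenv is a hypothetical helper name, not part of any tool.)
load_dotenv() {
  [ -f "$1" ] || { echo "missing artifact: $1" >&2; return 1; }
  set -a        # auto-export variables assigned while sourcing
  . "$1"
  set +a
}

load_dotenv "${ARTIFACT_DIR}/_build/containers/containers.env"
echo "Deploying image: ${CONTAINER_IMAGE_MY_APP}"
```

Because the file is sourced as shell, it should contain only plain assignments written by a trusted upstream stage.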

Configuration discovery#

Keep deployment descriptors under a dedicated directory and mark enabled ones with an enabled flag. CI jobs should discover enabled descriptors and run deployments accordingly.
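
The discovery step could look roughly like the following sketch. The directory, the file names, and the `find_enabled` helper are hypothetical, and the text match on the `enabled` flag is a simplification that keeps the example dependency-free.

```shell
# Sketch: list deployment descriptors whose "enabled" flag is true.
# DESCRIPTOR_DIR and the file names below are hypothetical.
DESCRIPTOR_DIR=$(mktemp -d)
printf '{ "enabled": true,  "deployment_name": "my_app_prod" }\n' > "${DESCRIPTOR_DIR}/prod.json"
printf '{ "enabled": false, "deployment_name": "my_app_dev" }\n'  > "${DESCRIPTOR_DIR}/dev.json"

# find_enabled DIR: print the path of each *.json descriptor with "enabled": true.
# A plain text match is used here; a JSON-aware tool such as jq is more
# robust for real descriptors (nested keys, unusual whitespace).
find_enabled() {
  for f in "$1"/*.json; do
    [ -f "$f" ] || continue
    grep -Eq '"enabled"[[:space:]]*:[[:space:]]*true' "$f" && echo "$f"
  done
  return 0
}

ENABLED=$(find_enabled "$DESCRIPTOR_DIR")
echo "$ENABLED"
```

A deploy job can then iterate over the printed paths and run one deployment per descriptor.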

Example Workflow#

Build and Package#

Inputs are typically provided by the CI/CD tool. Example environment variables used by the pipeline include:

export APP=my_app
export VERSION=1.2.3              # e.g., from a version file or build system
export CI_COMMIT_SHORT_SHA=abc123 # set by CI/CD tool

Build the binary/package via your build system:

your_build_command --app "$APP" --config release

Authenticate and Publish Container to NGC#

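Optional: configure the NGC CLI (interactive or non-interactive), then log Docker in to NGC. Set the credentials before either command uses them:

# Provided as secret variables by the CI/CD tool; shown here as placeholders
export NGC_API_KEY=xxx
export NGC_ORG_NAME=my-org

# configure ngc (optional)
ngc config set --apikey "$NGC_API_KEY" --format ascii --org "$NGC_ORG_NAME" --team "no-team"

# Docker login to NGC; --password-stdin keeps the key out of the process list
printf '%s' "$NGC_API_KEY" | docker login nvcr.io -u '$oauthtoken' --password-stdin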

Build, Tag, and Push the Image:

export IMAGE_BASE=my-app
export TAG="${VERSION}_${CI_COMMIT_SHORT_SHA}"
docker build -t "${IMAGE_BASE}:${TAG}" .
docker tag "${IMAGE_BASE}:${TAG}" "nvcr.io/${NGC_ORG_NAME}/${IMAGE_BASE}:${TAG}"
docker push "nvcr.io/${NGC_ORG_NAME}/${IMAGE_BASE}:${TAG}"

Maintain Rolling Latest-N Tags (optional; only on protected/default branches)#

Example Script to Rotate Latest Tags (keep last N = 7):

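export IMAGE="nvcr.io/${NGC_ORG_NAME}/${IMAGE_BASE}"
# Shift existing tags up by one, oldest first: latest-6 -> latest-7, ..., latest -> latest-1
for n in 7 6 5 4 3 2 1; do
  if [ "$n" -eq 1 ]; then old="latest"; else old="latest-$((n-1))"; fi
  new="latest-$n"
  # Tags that do not exist yet are skipped silently
  docker pull "${IMAGE}:${old}" 2>/dev/null && \
    docker tag  "${IMAGE}:${old}" "${IMAGE}:${new}" && \
    docker push "${IMAGE}:${new}"
done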

docker tag  "${IMAGE}:${TAG}" "${IMAGE}:latest"
docker push "${IMAGE}:latest"

Emit Artifacts for Downstream Jobs#

Example Artifact Files Written by the Publish Stage:

mkdir -p _build/containers
cat > _build/containers/published_containers.json <<'JSON'
{
  "CONTAINER_IMAGE_MY_APP": "nvcr.io/my-org/my-app:1.2.3_ab12cd3"
}
JSON

cat > _build/containers/containers.env <<'EOF'
CONTAINER_IMAGE_MY_APP=nvcr.io/my-org/my-app:1.2.3_ab12cd3
EOF
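
The example above hard-codes the image reference. A publish job would more likely derive both files from its own variables, roughly as in this sketch; the sample values and the temporary output directory are placeholders for what the CI/CD tool and earlier steps provide.

```shell
# Sketch: derive the artifact key and value from pipeline variables.
# Sample values stand in for what the CI/CD tool and earlier steps provide.
NGC_ORG_NAME=my-org
IMAGE_BASE=my-app
TAG=1.2.3_ab12cd3
OUT_DIR=$(mktemp -d)/_build/containers   # throwaway location for this sketch

IMAGE_REF="nvcr.io/${NGC_ORG_NAME}/${IMAGE_BASE}:${TAG}"
# my-app -> CONTAINER_IMAGE_MY_APP (uppercase; other characters become underscores)
KEY="CONTAINER_IMAGE_$(printf '%s' "$IMAGE_BASE" | tr '[:lower:]' '[:upper:]' | tr -c 'A-Z0-9' '_')"

mkdir -p "$OUT_DIR"
printf '{\n  "%s": "%s"\n}\n' "$KEY" "$IMAGE_REF" > "${OUT_DIR}/published_containers.json"
printf '%s=%s\n' "$KEY" "$IMAGE_REF" > "${OUT_DIR}/containers.env"

cat "${OUT_DIR}/containers.env"
```

Writing the same key to both files lets JSON-aware tools and plain shell stages read the same value without parsing logs.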

Declarative deployment (1 build → N deployments)#

Example deployment descriptor (JSON):

{
  "enabled": true,
  "deployment_name": "my_app_prod",
  "app_name": "my_app",
  "variant": "default",
  "artifact": { "container_image_env": "CONTAINER_IMAGE_MY_APP" },
  "serving": {
    "function_name": "my-app",
    "inference": { "url": "/api", "port": 8080 },
    "health_check": { "protocol": "HTTP", "uri": "/health", "port": 8080, "timeout": 5, "expected_status_code": 200 },
    "cluster": "prod-cluster",
    "instance_type": "standard",
    "gpu_type": "L40",
    "min_instances": 1,
    "max_instances": 3,
    "deployment_timeout": 900,
    "poll_interval": 30,
    "environment": [{ "name": "APP_VERSION", "value": "${VERSION}" }]
  },
  "portal": {
    "app_id": "my-app",
    "url": "https://portal.example.com",
    "title": "My App",
    "description": "Streaming App",
    "authentication_type": "NUCLEUS"
  }
}
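
The descriptor carries a literal `${VERSION}` placeholder that has to be resolved before the deploy call. One possible way to do that with standard tools is sketched below; `render_descriptor` is a hypothetical helper, and the sketch assumes the version string contains no characters that `sed` treats specially in a replacement (`|`, `&`, or backslashes).

```shell
# Sketch: substitute the literal ${VERSION} placeholder in a descriptor.
VERSION=1.2.3
WORK_DIR=$(mktemp -d)
cat > "${WORK_DIR}/descriptor.json" <<'EOF'
{ "environment": [{ "name": "APP_VERSION", "value": "${VERSION}" }] }
EOF

# render_descriptor FILE: print FILE with ${VERSION} replaced by the value of $VERSION.
# Assumes the value has no "|", "&", or backslash characters.
render_descriptor() {
  sed 's|\${VERSION}|'"$VERSION"'|g' "$1"
}

RENDERED=$(render_descriptor "${WORK_DIR}/descriptor.json")
echo "$RENDERED"
```

Rendering into a variable or a new file leaves the checked-in descriptor untouched, so the same descriptor can be reused for every build.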

Deploy via Serving Platform APIs (general NVCF-style example)#

Example environment variables and create/deploy steps using the NVCF-style APIs:

export NVCF_TOKEN=xxx
export NVCF_API_BASE="https://api.ngc.nvidia.com/v2/nvcf"
export CONTAINER_IMAGE="nvcr.io/my-org/my-app:1.2.3_ab12cd3"

  • Create function

curl -sS -X POST "${NVCF_API_BASE}/functions" \
  -H "Authorization: Bearer ${NVCF_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-app",
    "inferenceUrl": "/api",
    "inferencePort": 8080,
    "health": { "protocol": "HTTP", "uri": "/health", "port": 8080, "timeout": 5, "expectedStatusCode": 200 },
    "containerImage": "'"${CONTAINER_IMAGE}"'",
    "apiBodyFormat": "CUSTOM",
    "description": "my_app_prod",
    "functionType": "STREAMING",
    "containerEnvironment": [{"name":"APP_VERSION","value":"'"${VERSION}"'"}]
  }' | tee function.json

export FUNCTION_ID=$(jq -r '.function.id' function.json)
export FUNCTION_VERSION_ID=$(jq -r '.function.versionId' function.json)

  • Deploy

curl -sS -X POST "${NVCF_API_BASE}/deployments/functions/${FUNCTION_ID}/versions/${FUNCTION_VERSION_ID}" \
  -H "Authorization: Bearer ${NVCF_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "deploymentSpecifications": [{
      "instanceType": "standard",
      "gpu": "L40",
      "minInstances": 1,
      "maxInstances": 3,
      "maxRequestConcurrency": 1,
      "clusters": ["prod-cluster"],
      "attributes": []
    }]
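  }' | tee deploy.json

  • Poll for ACTIVE

# Poll every 30 s until the version is ACTIVE, fails, or 900 s have elapsed
# (matching poll_interval and deployment_timeout in the descriptor).
# ERROR is assumed to be the platform's terminal failure status; adjust as needed.
DEADLINE=$(( $(date +%s) + 900 ))
while :; do
  STATUS=$(curl -sS -H "Authorization: Bearer ${NVCF_TOKEN}" \
    "${NVCF_API_BASE}/functions/${FUNCTION_ID}/versions/${FUNCTION_VERSION_ID}" | jq -r '.function.status')
  [ "$STATUS" = "ACTIVE" ] && break
  [ "$STATUS" = "ERROR" ] && { echo "Deployment failed: $STATUS"; exit 1; }
  [ "$(date +%s)" -ge "$DEADLINE" ] && { echo "Timed out waiting for ACTIVE (last status: $STATUS)"; exit 1; }
  sleep 30
done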

Register to the Portal Sample (general)#

Example portal registration:

export PORTAL_API_KEY=xxx
export PORTAL_URL="https://portal.example.com"
export APP_ID="my-app"

curl -sS -X PUT "${PORTAL_URL}/api/apps/${APP_ID}" \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${PORTAL_API_KEY}" \
  -d '{
    "slug": "'"${APP_ID}"'",
    "function_id": "'"${FUNCTION_ID}"'",
    "function_version_id": "'"${FUNCTION_VERSION_ID}"'",
    "title": "My App",
    "description": "Streaming App",
    "version": "'"${VERSION}"'",
    "page": "Template Applications",
    "icon": "",
    "category": "Template Applications",
    "product_area": "Omniverse",
    "authentication_type": "NUCLEUS"
  }'

Pass Deployment Results to Downstream Steps#

Example artifacts produced by the deploy stage:

mkdir -p _build/deployments
cat > _build/deployments/deployments.json <<JSON
[
  {
    "deployment_name": "my_app_prod",
    "app_name": "my_app",
    "variant": "default",
    "function_id": "${FUNCTION_ID}",
    "function_version_id": "${FUNCTION_VERSION_ID}",
    "cluster": "prod-cluster",
    "status": "SUCCESS"
  }
]
JSON

cat > _build/deployments/deployments.env <<EOF
NVCF_FUNCTION_ID=${FUNCTION_ID}
NVCF_FUNCTION_VERSION_ID=${FUNCTION_VERSION_ID}
VERSION=${VERSION}
EOF

Best Practices#

Security#

  • Store tokens/keys in CI/CD secrets; never echo tokens; avoid passing secrets as CLI args where process lists can expose them.

  • Prefer authorization headers or SDKs that keep credentials in memory; use short-lived, scoped credentials and rotate them.

  • Mask secret variables in logs and scrub outputs before persisting artifacts.
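
One way to approach the scrubbing step is a guard that refuses to persist any artifact containing a secret value. The sketch below is illustrative only: `check_no_secrets` is a hypothetical helper, and the token and file contents are placeholders.

```shell
# Sketch: refuse to persist an artifact that contains a secret value.
# check_no_secrets FILE VAR...: return 1 if FILE contains the value of any named variable.
check_no_secrets() {
  cns_file=$1; shift
  for cns_var in "$@"; do
    eval "cns_val=\${$cns_var:-}"          # look up the variable by name
    [ -n "$cns_val" ] || continue          # unset or empty: nothing to search for
    if grep -Fq -- "$cns_val" "$cns_file"; then
      echo "secret ${cns_var} found in ${cns_file}" >&2   # report the name, never the value
      return 1
    fi
  done
  return 0
}

# Demo with placeholder values
NVCF_TOKEN=example-token-123
DEMO_DIR=$(mktemp -d)
echo 'NVCF_FUNCTION_ID=abc' > "${DEMO_DIR}/clean.env"
echo 'DEBUG=Bearer example-token-123' > "${DEMO_DIR}/leaky.env"

check_no_secrets "${DEMO_DIR}/clean.env" NVCF_TOKEN && CLEAN_OK=yes
check_no_secrets "${DEMO_DIR}/leaky.env" NVCF_TOKEN 2>/dev/null || LEAK_CAUGHT=yes
```

Running such a check as the last step before artifact upload complements, rather than replaces, the log masking offered by the CI/CD tool.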

Efficiency#

  • Use matrix builds to parallelize; cache dependency layers and base images; split build and publish stages.

  • Emit minimal, structured artifacts (JSON + dotenv) to avoid parsing logs downstream.

  • Push rolling tags only on protected/default branches; include version and build identifiers in tags for traceability.

Resilience#

  • Make deployments idempotent; implement timeouts and retries with backoff for API calls.

  • Poll for readiness with explicit terminal states and provide clear summaries.
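
Retries with backoff can be wrapped around any command. The following is a minimal sketch, not a prescribed implementation; `retry` and `flaky` are hypothetical names, and the demo uses a zero-second delay so that it finishes immediately.

```shell
# retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND...: rerun COMMAND until it
# succeeds or MAX_ATTEMPTS is reached, doubling the delay after each failure.
retry() {
  r_max=$1; r_delay=$2; shift 2
  r_attempt=1
  until "$@"; do
    if [ "$r_attempt" -ge "$r_max" ]; then
      echo "giving up after ${r_attempt} attempts: $*" >&2
      return 1
    fi
    sleep "$r_delay"
    r_delay=$((r_delay * 2))
    r_attempt=$((r_attempt + 1))
  done
}

# Demo: a stand-in command that fails twice, then succeeds.
COUNT_FILE=$(mktemp)
echo 0 > "$COUNT_FILE"
flaky() {
  n=$(( $(cat "$COUNT_FILE") + 1 ))
  echo "$n" > "$COUNT_FILE"
  [ "$n" -ge 3 ]
}

retry 5 0 flaky && echo "succeeded after $(cat "$COUNT_FILE") attempts"
```

In a pipeline, the wrapped command would typically be an API call such as `curl --fail ...`, since `--fail` makes HTTP error responses produce a non-zero exit status that `retry` can act on.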

Configuration Management#

  • Keep deployment descriptors declarative with an enabled flag; use variable substitution (for example, ${VERSION}) for environment differences.

  • Normalize identifiers (lowercase, hyphenated) for container and environment variable naming consistency.
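
As an illustration of the normalization rule, the sketch below derives a lowercase, hyphenated identifier and a matching environment-variable form from free text. Both helper names are hypothetical.

```shell
# normalize_id TEXT: lowercase, with each run of other characters collapsed to one hyphen.
normalize_id() {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

# env_key TEXT: uppercase form with underscores, suitable for environment variable names.
env_key() {
  normalize_id "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_'
}

echo "$(normalize_id "My_App v2")"   # my-app-v2
echo "$(env_key "My_App v2")"        # MY_APP_V2
```

Deriving both forms from one function keeps container names and variable names such as `CONTAINER_IMAGE_MY_APP` in step with each other.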

Observability and Governance#

  • Generate concise summaries (counts, IDs, endpoints); publish machine-readable reports as artifacts.

  • Gate production actions (publish latest, deploy prod) behind approvals or branch rules.
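
A branch rule can be expressed as a small guard in the job script when the CI/CD tool does not enforce it natively. In this sketch, `CI_COMMIT_BRANCH` and `CI_DEFAULT_BRANCH` are assumed variable names; substitute whatever your CI/CD tool exposes.

```shell
# is_release_branch: succeed only when the current branch is the default branch.
# CI_COMMIT_BRANCH and CI_DEFAULT_BRANCH are assumed names, not guaranteed by every tool.
is_release_branch() {
  [ -n "${CI_COMMIT_BRANCH:-}" ] && [ "${CI_COMMIT_BRANCH}" = "${CI_DEFAULT_BRANCH:-main}" ]
}

# Demo with placeholder values
CI_DEFAULT_BRANCH=main
CI_COMMIT_BRANCH=feature/new-ui
if is_release_branch; then
  ACTION="push rolling tags"
else
  ACTION="skip rolling tags"
fi
echo "${CI_COMMIT_BRANCH}: ${ACTION}"
```

A script-level guard like this is a fallback; approvals and protected-branch settings in the CI/CD tool itself are harder to bypass.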

Local Environment Management#

Simple .env for local testing (do not commit):

cat > .env <<'EOF'
NGC_API_KEY=xxx
NGC_ORG_NAME=my-org
NVCF_TOKEN=yyy
PORTAL_API_KEY=zzz
VERSION=1.2.3
EOF

set -a; . ./.env; set +a
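
After loading the file, it can help to fail early when a required variable is missing. A minimal sketch, with `require_vars` as a hypothetical helper and placeholder values:

```shell
# require_vars NAME...: report every named variable that is unset or empty.
require_vars() {
  rv_missing=""
  for rv_name in "$@"; do
    eval "rv_val=\${$rv_name:-}"           # look up the variable by name
    [ -n "$rv_val" ] || rv_missing="${rv_missing} ${rv_name}"
  done
  if [ -n "$rv_missing" ]; then
    echo "missing required variables:${rv_missing}" >&2   # names only, never values
    return 1
  fi
}

# Demo with placeholder values
NGC_API_KEY=xxx
NGC_ORG_NAME=my-org
unset PORTAL_API_KEY
require_vars NGC_API_KEY NGC_ORG_NAME && CORE_OK=yes
require_vars NGC_API_KEY PORTAL_API_KEY 2>/dev/null || PORTAL_MISSING=yes
```

Listing all missing names in one message saves a round trip for each variable when setting up a new machine.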

Notes#