
Helm Chart Container Images

This document lists all container images used by the Unstract Platform Helm chart for mirroring to a private registry or pre-pulling in air-gapped environments.

Image Reference

The default registry for all Unstract application images is:

us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem

This is configured via global.unstract.image.registry in values.yaml.

Core Services

| Service | Image Name | Default Tag | Always Deployed |
| --- | --- | --- | --- |
| Backend API | backend | UNSTRACT_APPS_VERSION | Yes |
| Frontend | frontend | UNSTRACT_APPS_VERSION | Yes |
| Platform Service | platform-service | UNSTRACT_APPS_VERSION | Yes |
| Prompt Service | prompt-service | UNSTRACT_APPS_VERSION | Yes |
| Runner | runner | UNSTRACT_APPS_VERSION | Yes |
| Unified Worker | worker-unified | UNSTRACT_APPS_VERSION | Yes |
| X2Text Service | x2text-service | UNSTRACT_APPS_VERSION | Yes |

All core services share the same tag, controlled by the UNSTRACT_APPS_VERSION anchor in values.yaml. The exact tag varies per release — refer to the On-Prem Release Notes for version details.
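
As a rough illustration of how the registry, image name, and shared tag combine, the sketch below prints the full reference for each core service. The `core_image_refs` helper is hypothetical and not part of the chart, and the tag in the usage line is a placeholder; take the real tag from the release notes.

```shell
# Sketch: build full image references for the core services.
# REGISTRY is the default registry from this page; core_image_refs is a
# hypothetical helper, and the tag passed to it is supplied by the caller.
REGISTRY="us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem"
CORE_IMAGES="backend frontend platform-service prompt-service runner worker-unified x2text-service"

# core_image_refs <tag>: print one "<registry>/<name>:<tag>" line per core image.
core_image_refs() {
  local tag="$1" name
  for name in $CORE_IMAGES; do
    printf '%s/%s:%s\n' "$REGISTRY" "$name" "$tag"
  done
}
```

Calling `core_image_refs 1.2.3` (with `1.2.3` standing in for the release tag) prints seven references, one per row of the table above.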

Infrastructure Services

| Service | Image Name | Default Tag | Registry |
| --- | --- | --- | --- |
| Redis | bitnami-redis | 7.2.4-debian-12-r13 | us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem |
| Redis Sentinel | bitnami/redis-sentinel | 7.2.4-debian-12-r13 | us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem |
| MinIO (standalone) | bitnami-minio | 2024.12.18-debian-12-r0 | us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem |
| MinIO HA (Tenant) | minio | RELEASE.2024-12-18T13-15-44Z | us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem |
| PgBouncer | pgbouncer | 1.18.0 | us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem |

Note: HA-mode images: Redis Sentinel and MinIO HA (Tenant) are only required when Redis or MinIO are configured in HA mode respectively. These images are not emitted by the list-onprem-images.sh script because they are gated by nested chart flags (redis.sentinel.enabled, minio.ha.enabled) that the script's --all discovery does not traverse. Mirror them manually if you enable HA for either service.
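
One way to keep the HA-only images from being forgotten is to append them to the image list by hand. The sketch below does that with a hypothetical `append_ha_images` helper; the registry, image names, and tags are the defaults from the table above and may differ in your release.

```shell
# Sketch: append the HA-only images to an image list when HA is enabled.
# append_ha_images is a hypothetical helper; names and tags are the table
# defaults on this page, so adjust them if your release pins other versions.
REGISTRY="us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem"

# append_ha_images <list-file> <redis_ha: true|false> <minio_ha: true|false>
append_ha_images() {
  local list="$1" redis_ha="$2" minio_ha="$3"
  touch "$list"
  if [ "$redis_ha" = "true" ]; then
    echo "${REGISTRY}/bitnami/redis-sentinel:7.2.4-debian-12-r13" >> "$list"
  fi
  if [ "$minio_ha" = "true" ]; then
    echo "${REGISTRY}/minio:RELEASE.2024-12-18T13-15-44Z" >> "$list"
  fi
  # De-duplicate in place so repeated runs do not grow the file.
  sort -u "$list" -o "$list"
}
```

For example, `append_ha_images images.txt true false` adds only the Sentinel image. The Sentinel image still needs the manual re-tag described in Step 3, because its nested path is flattened by the mirror loop.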

Init Containers & Utilities

| Service | Image Name | Default Tag | Always Deployed |
| --- | --- | --- | --- |
| DB Connection Check | db-connection-check | v1.0.0 | Yes |
| Python Utility | python | 3.12-slim | Yes |

The DB connection check init container verifies DB, Redis, and RabbitMQ connectivity before service startup. The Python utility image is used for deployment jobs (deployment-notifier, regression-trigger).

Document Processing

| Service | Image Name | Default Tag | Registry |
| --- | --- | --- | --- |
| LibreOffice X2PDF | libreoffice-x2pdf | v2.2.3 | us-central1-docker.pkg.dev/pandoras-tamer/unstract-llm-whisperer-on-prem |

Tool Images (Pulled at Runtime)

These images are not pulled during Helm deployment but are referenced in ConfigMaps and pulled by the runner at workflow execution time. No additional deployment configuration is needed — they use the same global registry:

| Service | Image Name |
| --- | --- |
| Structure Tool | tool-structure |
| Tool Sidecar | tool-sidecar |

Worker Pods

All worker pods (file processing, API deployment, callbacks, notifications, scheduling, log consumer, bulk download) use the worker-unified image with different Celery queue configurations. No additional images are required beyond those listed above.

Copy Images to On-Prem Registry

For air-gapped or restricted environments where images cannot be pulled from the default registry, the Unstract Platform Helm chart ships with a ready-to-use values overlay file: values-private-registry.yaml. This guide walks through generating the image list, copying every image to your registry, and pointing the Helm chart at it.

Prerequisites

  • A private container registry accessible from your Kubernetes cluster
  • helm 3.x and docker installed on the workstation that will mirror the images
  • The list-onprem-images.sh script (provided below) to generate the full image list
  • Credentials for both registries:
    • Source — the Google Artifact Registry service-account key (artifact-key.json) provided by Zipstack
    • Destination — credentials with push access to your private registry
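
The prerequisites above can be checked before starting with a small preflight function. This is a minimal sketch: `missing_prereqs` is a hypothetical helper, and the key file path is whatever location you saved artifact-key.json to.

```shell
# Sketch: report which required tools or files are missing.
# missing_prereqs <key-file> <tool>...: prints one line per missing item
# and prints nothing when everything is in place.
missing_prereqs() {
  local key_file="$1" tool
  shift
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "missing tool: $tool"
  done
  [ -r "$key_file" ] || echo "missing key file: $key_file"
}
```

Running `missing_prereqs artifact-key.json helm docker` on the mirroring workstation should print nothing if the tooling and the source-registry key are available. Push access to the destination registry is not covered by this check.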

Step 1: Generate the Image List

Use the helper script below to render the chart with helm template and write the complete list of images for your deployment to a file. The mirror snippet in Step 3 reads from this file, so writing to disk (rather than printing to stdout) is the recommended path. Save the script as list-onprem-images.sh and make it executable (chmod +x list-onprem-images.sh).

list-onprem-images.sh
#!/usr/bin/env bash
# ---------------------------------------------------------------------------
# list-onprem-images.sh
#
# Generates the complete list of container images required for an on-prem
# deployment by rendering Helm chart templates.
#
# Usage:
#   ./list-onprem-images.sh -r <release-name> -f <values-file> [OPTIONS]
#
# Options:
#   -c, --chart PATH     Chart reference (local path or remote). Default: . (current directory)
#   -r, --release NAME   Helm release name for rendering (required)
#   -v, --version VER    Chart version (for remote charts)
#   -f, --values FILE    Values file(s) to pass to helm template (required, repeatable)
#   -a, --all            Include optional services (disabled by default)
#   -o, --output FILE    Write image list to a file instead of stdout
#   -h, --help           Show this help message
# ---------------------------------------------------------------------------
set -euo pipefail

CHART_REF="."

RELEASE_NAME=""
INCLUDE_OPTIONAL=false
OUTPUT_FILE=""
CHART_VERSION=""
VALUES_FILES=()

usage() {
  sed -n '2,/^# ----/p' "$0" | grep '^#' | sed 's/^# \?//'
  exit 0
}

while [[ $# -gt 0 ]]; do
  case "$1" in
    -c|--chart) CHART_REF="$2"; shift 2 ;;
    -r|--release) RELEASE_NAME="$2"; shift 2 ;;
    -v|--version) CHART_VERSION="$2"; shift 2 ;;
    -f|--values) VALUES_FILES+=("-f" "$2"); shift 2 ;;
    -a|--all) INCLUDE_OPTIONAL=true; shift ;;
    -o|--output) OUTPUT_FILE="$2"; shift 2 ;;
    -h|--help) usage ;;
    *) echo "Unknown option: $1" >&2; exit 1 ;;
  esac
done

if ! command -v helm &>/dev/null; then
  echo "Error: helm is required but not found in PATH" >&2
  exit 1
fi

if [[ -z "$RELEASE_NAME" ]]; then
  echo "Error: release name is required (-r <name>)" >&2
  exit 1
fi

if [[ ${#VALUES_FILES[@]} -eq 0 ]]; then
  echo "Error: at least one values file is required (-f <file>)" >&2
  exit 1
fi

# Auto-discover placeholder values required for helm template to render.
# Helm template fails when secret/config values are missing — either with
# nil-pointer errors (.Values.x.y) or with required/fail messages
# ("Set global.x.y"). This iteratively fills them with dummy values.
discover_dummy_sets() {
  local -a sets=()
  local max_attempts=50
  local attempt=0
  local version_args=()
  [[ -n "${CHART_VERSION}" ]] && version_args=(--version "$CHART_VERSION")

  while (( attempt < max_attempts )); do
    local err
    err=$(helm template "$RELEASE_NAME" "$CHART_REF" ${version_args[@]+"${version_args[@]}"} "${VALUES_FILES[@]}" ${sets[@]+"${sets[@]}"} "$@" 2>&1 >/dev/null || true)

    local missing
    # Match nil-pointer style: .Values.global.foo.bar
    missing=$(echo "$err" | grep -oE '\.Values\.[A-Za-z0-9_.]+' | head -1 | sed 's/^\.Values\.//' || true)

    # Match required/fail style: "Set global.foo.bar"
    if [[ -z "$missing" ]]; then
      missing=$(echo "$err" | grep -ioE '[Ss]et [a-zA-Z][a-zA-Z0-9_.]+' | head -1 | sed 's/^[Ss]et //' || true)
    fi

    if [[ -z "$missing" ]]; then
      break
    fi

    sets+=(--set "${missing}=placeholder")
    (( ++attempt ))
  done

  echo ${sets[*]+"${sets[*]}"}
}

# Discover optional services (entries with enabled: false in values.yaml).
# Only an "enabled: false" line directly under a top-level key is matched, so
# the pattern assumes values.yaml uses two-space indentation.
discover_optional_services() {
  local values_file="$1"
  awk '
    /^[a-zA-Z_][a-zA-Z0-9_]*:/ {
      gsub(/:.*/, "", $0)
      top_key = $0
    }
    /^  enabled:[[:space:]]*false/ {
      if (top_key != "") print top_key ".enabled=true"
    }
  ' "$values_file"
}

# Renders helm template and extracts all image references.
extract_images() {
  local rendered helm_err
  local version_args=()
  [[ -n "${CHART_VERSION}" ]] && version_args=(--version "$CHART_VERSION")
  helm_err=$(mktemp)
  trap 'rm -f "$helm_err"' RETURN
  rendered=$(helm template "$RELEASE_NAME" "$CHART_REF" ${version_args[@]+"${version_args[@]}"} "${VALUES_FILES[@]}" "$@" 2>"$helm_err" || true)

  if [[ -z "$rendered" && -s "$helm_err" ]]; then
    echo "Warning: helm template failed: $(head -3 "$helm_err")" >&2
  fi

  [[ -z "$rendered" ]] && return

  # Extract image references from rendered YAML "image:" fields
  echo "$rendered" | grep -E '^\s+-?\s*image:\s' | grep -v '^\s*#' | sed 's/.*image:[[:space:]]*//' | tr -d '"' | grep ':' || true

  # Extract split image references from ConfigMaps (IMAGE_NAME + IMAGE_TAG pairs)
  local image_name image_tag
  while IFS= read -r name_line; do
    [[ -z "$name_line" ]] && continue
    local key
    key=$(echo "$name_line" | grep -oE '[A-Za-z0-9_]+IMAGE_NAME' || true)
    [[ -z "$key" ]] && continue
    local tag_key="${key/IMAGE_NAME/IMAGE_TAG}"
    image_name=$(echo "$name_line" | awk '{print $NF}' | tr -d '"')
    image_tag=$(echo "$rendered" | grep "${tag_key}:" | head -1 | awk '{print $NF}' | tr -d '"')
    if [[ -n "$image_name" && -n "$image_tag" ]]; then
      echo "${image_name}:${image_tag}"
    fi
  done <<< "$(echo "$rendered" | grep 'IMAGE_NAME:' || true)"
}

read -ra DUMMY_SETS <<< "$(discover_dummy_sets)"

ALL_IMAGES=""
ALL_IMAGES+=$'\n'"$(extract_images ${DUMMY_SETS[@]+"${DUMMY_SETS[@]}"})"

if [[ "$INCLUDE_OPTIONAL" == "true" ]]; then
  BASE_VALUES=""
  if [[ -d "$CHART_REF" && -f "${CHART_REF}/values.yaml" ]]; then
    BASE_VALUES="${CHART_REF}/values.yaml"
  fi

  if [[ -n "$BASE_VALUES" ]]; then
    while IFS= read -r flag; do
      [[ -z "$flag" ]] && continue
      read -ra OPT_DUMMY <<< "$(discover_dummy_sets --set "$flag")"
      ALL_IMAGES+=$'\n'"$(extract_images ${OPT_DUMMY[@]+"${OPT_DUMMY[@]}"} --set "$flag")"
    done < <(discover_optional_services "$BASE_VALUES")
  else
    echo "Warning: --all requires a local chart with values.yaml to discover optional services" >&2
  fi
fi

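# grep exits non-zero when no line matches; "|| true" stops "set -e" with
# pipefail from aborting silently, so the empty case can be reported below.
IMAGES=$(echo "$ALL_IMAGES" | grep -v '^$' | sort -u || true)

if [[ -z "$IMAGES" ]]; then
  echo "Error: no images found; check the helm template output for errors" >&2
  exit 1
fi

IMAGE_COUNT=$(echo "$IMAGES" | grep -c . || true)

if [[ -n "$OUTPUT_FILE" ]]; then
  echo "$IMAGES" > "$OUTPUT_FILE"
  echo "Image list written to ${OUTPUT_FILE} (${IMAGE_COUNT} images)"
else
  echo "$IMAGES"
fi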

Usage examples (run from inside your chart directory):

# Write the image list for your configuration to images.txt
./list-onprem-images.sh -r unstract -f on-prem.values.yaml -f on-prem.secret.yaml -o images.txt

# Use a remote chart from OCI registry
./list-onprem-images.sh -r unstract \
  -c oci://us-central1-docker.pkg.dev/pandoras-tamer/unstract-on-prem/unstract-platform \
  -f on-prem.values.yaml -o images.txt
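
Before mirroring, it can be worth sanity-checking the generated file. The script keeps any image value that contains a colon, so a reference with a registry port but no tag could in principle slip through. The sketch below, using a hypothetical `untagged_images` helper, flags such lines.

```shell
# Sketch: flag lines in an image list that lack an explicit tag or digest.
# untagged_images <list-file>: prints offending lines; prints nothing when
# every reference is pinned. Comment and blank lines are skipped.
untagged_images() {
  local list="$1" img last
  while IFS= read -r img; do
    case "$img" in ''|'#'*) continue ;; esac
    last="${img##*/}"              # final path component, e.g. "backend:<tag>"
    case "$last" in
      *:*|*@*) ;;                  # has a tag or a digest
      *) echo "$img" ;;
    esac
  done < "$list"
}
```

`untagged_images images.txt` printing nothing suggests every entry carries a tag or digest; any line it does print deserves a manual look before Step 3.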

Step 2: Authenticate with Both Registries

You need to be logged into the source registry (to pull) and your destination registry (to push) before running the mirror loop in Step 3.

Log in to the Unstract source registry using the service-account key file provided by Zipstack:

cat artifact-key.json | docker login -u _json_key --password-stdin https://us-central1-docker.pkg.dev/

Log in to your private registry using whichever method your registry supports — for example:

docker login my-private-registry.example.com
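
To see which source registries the image list actually touches, and therefore where a login or network access may be needed, the hosts can be extracted from images.txt. This sketch uses a hypothetical `registry_hosts` helper and follows Docker's naming convention for deciding whether the first path component is a registry host.

```shell
# Sketch: list the distinct registry hosts referenced by an image list.
# Docker treats the first path component as a registry host only if it
# contains "." or ":" or is "localhost"; otherwise the image is resolved
# against Docker Hub (docker.io).
registry_hosts() {
  local list="$1" img first
  while IFS= read -r img; do
    case "$img" in ''|'#'*) continue ;; esac
    first="${img%%/*}"
    case "$img" in
      */*)
        case "$first" in
          *.*|*:*|localhost) echo "$first" ;;
          *) echo "docker.io" ;;
        esac ;;
      *) echo "docker.io" ;;       # bare name such as python:3.12-slim
    esac
  done < "$list" | sort -u
}
```

Running `registry_hosts images.txt` prints one host per line. Public images on docker.io can normally be pulled without logging in.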

Step 3: Mirror Images

Pull every image listed in images.txt and push it to your private registry. The snippet reads images.txt (created in Step 1) line by line, skipping comments and blank lines, and re-tags each image under your registry:

PRIVATE_REGISTRY="my-private-registry.example.com/unstract"

while IFS= read -r img; do
  [[ "$img" =~ ^#.*$ || -z "$img" ]] && continue
  # Preserve the /unstract namespace so the push path matches the pull path
  # configured via global.unstract.image.registry in values-private-registry.yaml.
  target="${PRIVATE_REGISTRY}/${img##*/}"
  docker pull "$img"
  docker tag "$img" "$target"
  docker push "$target"
done < images.txt

Warning: Redis Sentinel keeps a nested bitnami/ directory. The Sentinel image lives at unstract-on-prem/bitnami/redis-sentinel in the source registry (unlike the flat unstract-on-prem/bitnami-redis, bitnami-minio, etc.), and the override file expects unstract/bitnami/redis-sentinel on the destination. The simple ${img##*/} flattening above will push it as redis-sentinel and break the pull path. If you have Redis HA enabled, re-tag this one image manually:

docker tag <source>/unstract-on-prem/bitnami/redis-sentinel:<tag> \
  ${PRIVATE_REGISTRY}/bitnami/redis-sentinel:<tag>
docker push ${PRIVATE_REGISTRY}/bitnami/redis-sentinel:<tag>
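
As an alternative to the manual re-tag, the destination path could be computed by a small function that special-cases the Sentinel image. This is a sketch only: `mirror_target` is a hypothetical helper, and it assumes the Sentinel path is the only nested one, as described above.

```shell
# Sketch: compute the destination reference for a source image, keeping the
# nested bitnami/ path for Redis Sentinel and flattening everything else.
# PRIVATE_REGISTRY is the same placeholder value used in the mirror loop.
PRIVATE_REGISTRY="my-private-registry.example.com/unstract"

mirror_target() {
  local img="$1"
  case "$img" in
    */bitnami/redis-sentinel:*) echo "${PRIVATE_REGISTRY}/bitnami/${img##*/}" ;;
    *)                          echo "${PRIVATE_REGISTRY}/${img##*/}" ;;
  esac
}
```

Replacing the `target=` assignment in the mirror loop with `target="$(mirror_target "$img")"` would apply the mapping to every line of images.txt. Running the function on its own over the list works as a dry run that shows where each image would land.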

Step 4: Configure the Override File

Copy values-private-registry.yaml from the Helm chart and replace my-private-registry.example.com with your actual registry throughout the file.
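
The find-and-replace can be scripted with sed. The sketch below writes a modified copy rather than editing in place; `rewrite_registry` is a hypothetical helper, and the output file name and host in the usage line are placeholders.

```shell
# Sketch: copy the overlay file and swap the placeholder host for your registry.
# rewrite_registry <source-file> <dest-file> <registry-host>
rewrite_registry() {
  local src="$1" dst="$2" host="$3"
  # Dots are escaped so they match literally; "|" is used as the sed delimiter
  # so that slashes in the host or path need no escaping.
  sed "s|my-private-registry\.example\.com|${host}|g" "$src" > "$dst"
}
```

For example, `rewrite_registry values-private-registry.yaml my-registry.values.yaml registry.internal.example` produces a copy that points at `registry.internal.example`. Review the result before deploying, since the `/unstract` path suffix is left unchanged.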

Override Structure

The file contains overrides organized by service category:

| Section | What it redirects |
| --- | --- |
| global.unstract.image.registry | All core services, init containers, and tool images (backend, frontend, platform-service, prompt-service, runner, worker-unified, x2text-service, db-connection-check, tool-structure, tool-sidecar) |
| redis / redis.sentinel | Bitnami Redis and Redis Sentinel images (Sentinel only when HA is enabled) |
| minio / minio.ha | Standalone MinIO and MinIO HA tenant image (HA only when minio.ha.enabled is true) |
| dbproxy | PgBouncer connection pooler |
| libreoffice | LibreOffice X2PDF document conversion service |

Key Configuration

global:
  unstract:
    image:
      # Redirects all core services, init containers, and tool images
      registry: my-private-registry.example.com/unstract
      # Uncomment to pin a specific version for all services:
      # tag: "rc.XXX"

  • global.unstract.image.registry: This is the primary override. Most services (backend, frontend, platform-service, prompt-service, runner, worker-unified, x2text-service) and init containers (db-connection-check) derive their image path from this value, as do the runtime-pulled tool images.
  • Infrastructure images (Redis, MinIO, PgBouncer, LibreOffice) use their own image.registry or image.repository fields and must be overridden individually.
  • HA-mode images (Redis Sentinel, MinIO HA tenant) are only consumed when their parent service is configured in HA mode. Mirror and override them only if you enable HA for that service.

Step 5: Create Image Pull Secret (if required)

If your private registry requires authentication:

kubectl create secret docker-registry your-registry-secret \
  --namespace unstract \
  --docker-server=my-private-registry.example.com \
  --docker-username=<username> \
  --docker-password=<password>

Then uncomment the imagePullSecrets section in values-private-registry.yaml:

global:
  imagePullSecrets:
    - name: your-registry-secret

Step 6: Deploy

Include the override file in your Helm install or upgrade command:

helm upgrade --install unstract-platform ./unstract-platform \
  -f on-prem.values.yaml \
  -f values-private-registry.yaml \
  -f on-prem.secret.yaml \
  -n unstract
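
Before running the upgrade, the rendered manifests can be checked for images that still point outside the private registry. The following is a minimal sketch with a hypothetical `images_outside_registry` helper; it only inspects `image:` fields, so the runtime-pulled tool images referenced in ConfigMaps are not covered.

```shell
# Sketch: read rendered manifests on stdin and print every image reference
# that does not start with the given registry prefix.
# images_outside_registry <registry-prefix>
images_outside_registry() {
  local prefix="$1"
  # Keep only the value of "image:" fields (with or without a leading "- ").
  sed -n 's/^[[:space:]]*-\{0,1\}[[:space:]]*image:[[:space:]]*//p' \
    | tr -d "\"'" \
    | sort -u \
    | while IFS= read -r img; do
        case "$img" in
          ''|"$prefix"/*) ;;       # empty value, or already on the private registry
          *) echo "$img" ;;
        esac
      done
}
```

A possible invocation, reusing the values files from the command above:

```shell
helm template unstract-platform ./unstract-platform \
  -f on-prem.values.yaml \
  -f values-private-registry.yaml \
  -f on-prem.secret.yaml \
  | images_outside_registry my-private-registry.example.com
```

Any line printed is an image the cluster would still pull from elsewhere, which may or may not be intended in your environment.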

values-private-registry.yaml Reference

global:
  unstract:
    image:
      # Redirects all core services, init containers, and tool images
      registry: my-private-registry.example.com/unstract
      # Uncomment to pin a specific version for all services:
      # tag: "rc.XXX"

  # Uncomment if your private registry requires authentication:
  # imagePullSecrets:
  #   - name: your-registry-secret

# ====================================================================
# Infrastructure Images
# ====================================================================
redis:
  image:
    registry: my-private-registry.example.com
    repository: unstract/bitnami-redis
    tag: 7.2.4-debian-12-r13
  sentinel:
    image:
      registry: my-private-registry.example.com
      # Note: slash-separated path matching source registry structure
      repository: unstract/bitnami/redis-sentinel
      tag: 7.2.4-debian-12-r13

minio:
  # NOTE: standalone (image.*) and HA (ha.image.*) share the same top-level
  # `minio:` key. Keep them in a single block; duplicating the `minio:` key
  # would silently overwrite the first occurrence.
  image:
    registry: my-private-registry.example.com
    repository: unstract/bitnami-minio
    tag: 2024.12.18-debian-12-r0
  ha:
    image:
      # MinIO HA tenant image (different from standalone bitnami-minio).
      # Only takes effect when minio.ha.enabled is true.
      # The minio-tenant template renders the image as
      # {{ global.unstract.image.registry }}/{{ ha.image.name }}:{{ ha.image.tag }}
      # so the pulled image will be <registry>/minio:<tag>; no separate
      # repository/registry field is honored here.
      name: minio
      tag: RELEASE.2024-12-18T13-15-44Z

dbproxy:
  image:
    repository: my-private-registry.example.com/unstract/pgbouncer
    tag: 1.18.0

libreoffice:
  image:
    repository: my-private-registry.example.com/unstract/libreoffice-x2pdf
    tag: v2.2.3
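
Because a duplicated top-level key such as `minio:` is silently overwritten, a quick check for repeated keys can catch the mistake after editing the file. This sketch uses a hypothetical `duplicate_top_level_keys` helper and looks only at unindented `key:` lines; it is not a YAML parser.

```shell
# Sketch: print any top-level YAML key that appears more than once in a file.
# Only unindented "key:" lines are counted; comments and nested keys are ignored.
duplicate_top_level_keys() {
  awk -F: '/^[A-Za-z_][A-Za-z0-9_-]*:/ { count[$1]++ }
           END { for (k in count) if (count[k] > 1) print k }' "$1" | sort
}
```

`duplicate_top_level_keys values-private-registry.yaml` should print nothing for a well-formed overlay; any key it prints has been defined twice and should be merged into a single block.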

Registry Domains to Whitelist

If your network uses egress filtering, ensure the following domains are accessible from your Kubernetes nodes:

| Domain | Purpose |
| --- | --- |
| us-central1-docker.pkg.dev | Unstract container images and Helm chart OCI registry |
| docker.io | Python utility image (python:3.12-slim) used by deployment jobs |
| downloads.unstructured.io | Unstructured API image (only if enabled) |