On-Prem Deployment Guide
This guide provides comprehensive instructions for deploying LLMWhisperer in an on-premises environment. LLMWhisperer is deployed inside a Kubernetes cluster, packaged as a Helm chart.
Overview
LLMWhisperer On-Prem is a self-hosted deployment that runs entirely within your infrastructure. It includes:
- LLMWhisperer Backend — the core text extraction API service
- LLMWhisperer Dashboard — a web UI for usage monitoring and management
- OCR Workers — document processing workers that scale based on load
- RabbitMQ — message broker for distributed task processing
- Redis — caching layer for performance optimization
1. Infrastructure Prerequisites
Kubernetes Cluster
- Recommended version: >= 1.29 (latest tested: 1.33)
- Node autoscaling should be enabled
- Create the cluster in a single Availability Zone (recommended), since some StatefulSet workloads do not yet support HA; multi-AZ clusters can lead to volume attach errors
- Ingress controller as a K8s cluster add-on for load balancer creation (recommended)
- The ingress must allow request timeouts of up to 900 seconds to work as expected
- In-house or cloud provider observability stack (recommended)
PostgreSQL Database
- Supported version: 15.0
- Minimum specs: 1 vCPU, 8 GiB RAM, 50 GiB SSD
- Autoscale enabled (recommended)
- A dedicated database for LLMWhisperer should be created within the PostgreSQL instance
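As a minimal sketch (assuming a reachable superuser account; the host, names, and password below are placeholders), the dedicated role and database could be created like this:
# Hypothetical host and credentials; replace with your own values.
psql -h postgres.example.com -U postgres <<'SQL'
CREATE ROLE llmwhisperer WITH LOGIN PASSWORD 'your-password';
CREATE DATABASE llmwhisperer OWNER llmwhisperer;
SQL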
DNS & SSL
- A domain pointing to LLMWhisperer (e.g., llmwhisperer.<customer-domain>.com)
- An active SSL certificate for the domain
Node Profile
Add 50 GiB SSD for application data to each machine.
| Machine Type | Label | Taint (NoSchedule) | Min | Max |
|---|---|---|---|---|
| 8 vCPU and 32 GiB | service: llmwhisperer | service: llmwhisperer | 1 | 60 |
GPU Nodes (Optional — for document insights mode)
| Cloud Provider | Instance Type | GPU Family | Label | Taint (NoSchedule) | Min | Max |
|---|---|---|---|---|---|---|
| AWS | g6.xlarge | NVIDIA L4 Tensor Core | service: llmwhisperer-gpu | service: llmwhisperer-gpu | 1 | 1 |
| GCP | g2-standard-4 | NVIDIA L4 Tensor Core | service: llmwhisperer-gpu | service: llmwhisperer-gpu | 1 | 1 |
Workloads are expected to be deployed on non-spot (on-demand) node pools.
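The labels and taints above are normally set when creating the node pools in your cloud provider. As an illustrative sketch (assuming self-managed nodes; node names are placeholders), the equivalent kubectl commands would be:
# Label and taint a CPU worker node so only LLMWhisperer workloads schedule onto it.
kubectl label node <node-name> service=llmwhisperer
kubectl taint node <node-name> service=llmwhisperer:NoSchedule

# GPU nodes (document insights mode) follow the same pattern.
kubectl label node <gpu-node-name> service=llmwhisperer-gpu
kubectl taint node <gpu-node-name> service=llmwhisperer-gpu:NoSchedule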
2. Configuration
Files Provided by Unstract Team
The following files will be provided by the Unstract team:
| File | Description |
|---|---|
| artifact-key.json | GCP service account key for Helm chart registry login and container image pull |
| sample.onprem.values.yaml | Sample Helm chart values (non-sensitive configuration) |
| onprem-profile.values.yaml | Profile values for resource allocation and scaling configuration |
Required Configuration Values
These values must be provided by the customer or the Unstract team to deploy LLMWhisperer:
| Variable | Description | Source |
|---|---|---|
| DB_LLMW_HOST | PostgreSQL host | Customer |
| DB_LLMW_USERNAME | PostgreSQL username | Customer |
| DB_LLMW_PASSWORD | PostgreSQL password | Customer |
| DB_LLMW_NAME | PostgreSQL database name | Customer |
| ENCRYPTION_KEY | Encryption key for sensitive data — must be backed up securely | Self-generated |
| LICENSE_PORTAL_API_KEY | License portal API key | Unstract Team |
| endpoint (azureOcrBilling) | Azure Cognitive Services OCR endpoint | Unstract Team |
| apiKey (azureOcrBilling) | Azure OCR API key | Unstract Team |
| INITIAL_PASSWORD | Initial admin password for the dashboard | Customer |
| X_CELERY_BROKER_USERNAME | RabbitMQ username | Customer |
| X_CELERY_BROKER_PASSWORD | RabbitMQ password | Customer |
The ENCRYPTION_KEY is used to encrypt data at rest and is required to decrypt that data when it is retrieved. Do not rotate, delete, or lose this key; doing so will render existing encrypted data inaccessible.
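One way to generate a suitably random key is sketched below. The format is an assumption; confirm the expected key format with the Unstract team before use.
# Generate a 32-byte random key, base64-encoded (assumed format; verify with Unstract).
openssl rand -base64 32
# Back the output up in a secret manager or offline store before deploying.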
Using Kubernetes Secrets (existingSecret)
Each configuration section in sample.onprem.values.yaml supports two approaches for providing sensitive values:
Option 1: Inline values — Provide values directly in the values file. Suitable for initial setup and testing.
global:
sharedConfigs:
database:
DB_LLMW_HOST: "postgres.example.com"
DB_LLMW_USERNAME: "postgres"
DB_LLMW_PASSWORD: "your-password"
DB_LLMW_NAME: "llmwhisperer"
Option 2: Kubernetes secrets (recommended for production) — Pre-create Kubernetes secrets with the matching variable names as keys, then reference the secret name via existingSecret. This avoids storing sensitive values in the Helm values file.
global:
sharedConfigs:
database:
existingSecret: "llmwhisperer-db-credentials"
The following configuration sections support existingSecret:
| Section | Example Secret Name | Keys |
|---|---|---|
| global.sharedConfigs.database | llmwhisperer-db-credentials | DB_LLMW_HOST, DB_LLMW_USERNAME, DB_LLMW_PASSWORD, DB_LLMW_NAME, DB_LLMW_PORT |
| global.sharedConfigs.redis | llmwhisperer-redis-credentials | REDIS_HOST, REDIS_PORT, REDIS_DB, REDIS_PASSWORD, REDIS_USER |
| global.sharedConfigs.workerRedis | llmwhisperer-worker-redis-credentials | WORKER_REDIS_HOST, WORKER_REDIS_PORT, WORKER_REDIS_DB, WORKER_REDIS_PASSWORD |
| global.sharedConfigs.celeryBroker | llmwhisperer-celery-broker-credentials | X_CELERY_BROKER_BASE_URL, X_CELERY_BROKER_USERNAME, X_CELERY_BROKER_PASSWORD, X_CELERY_BACKEND_URL |
| global.sharedConfigs.apiKeys | llmwhisperer-api-keys | ENCRYPTION_KEY |
| global.sharedConfigs.dashboardCredentials | llmwhisperer-dashboard-credentials | INITIAL_USER_NAME, INITIAL_PASSWORD |
| global.sharedConfigs.license | llmwhisperer-license-secret | LICENSE_PORTAL_URL, LICENSE_PORTAL_API_KEY |
| global.azureOcrBilling | azure-ocr-billing-credentials | endpoint, apiKey |
Example of creating a Kubernetes secret:
kubectl create secret generic llmwhisperer-db-credentials \
--namespace $NAMESPACE \
--from-literal=DB_LLMW_HOST="postgres.example.com" \
--from-literal=DB_LLMW_USERNAME="postgres" \
--from-literal=DB_LLMW_PASSWORD="your-password" \
--from-literal=DB_LLMW_NAME="llmwhisperer" \
--from-literal=DB_LLMW_PORT="5432"
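The other secrets in the table above follow the same pattern. For example, a sketch for the API-keys and dashboard-credentials secrets (all values shown are placeholders):
kubectl create secret generic llmwhisperer-api-keys \
  --namespace $NAMESPACE \
  --from-literal=ENCRYPTION_KEY="<your-encryption-key>"

kubectl create secret generic llmwhisperer-dashboard-credentials \
  --namespace $NAMESPACE \
  --from-literal=INITIAL_USER_NAME="admin" \
  --from-literal=INITIAL_PASSWORD="<strong-password>"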
3. Installation (One-Time)
Step 1: Check Cluster Connectivity
kubectl cluster-info
Step 2: Deploy RabbitMQ Operator (Once Per Cluster)
The RabbitMQ operator provisions the RabbitMQ cluster within the namespace using its CRD. Refer to the official documentation.
kubectl apply -f "https://github.com/rabbitmq/cluster-operator/releases/latest/download/cluster-operator.yml"
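Before proceeding, confirm the operator is running. Assuming the default manifest above, which installs into the rabbitmq-system namespace:
# The operator pod should be Running, and its CRD registered.
kubectl get pods -n rabbitmq-system
kubectl get crd rabbitmqclusters.rabbitmq.com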
Step 3: Create Namespace
export NAMESPACE=<namespace_name>
kubectl create namespace $NAMESPACE
Step 4: Authenticate Helm Registry
cat artifact-key.json | helm registry login -u _json_key --password-stdin https://us-central1-docker.pkg.dev
Step 5: Create Image Pull Secret
kubectl create secret docker-registry artifact-registry \
--namespace $NAMESPACE \
--docker-server=us-central1-docker.pkg.dev \
--docker-username=_json_key \
--docker-password="$(cat artifact-key.json)"
Validate the secret was created successfully:
kubectl get secret artifact-registry -n $NAMESPACE -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
Step 6: Configure Values File
- Create a copy of sample.onprem.values.yaml as onprem.values.yaml
- Fill in all values marked with # <REQUIRED> — refer to the Configuration section for details on each value
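Before installing, a quick check that no required values remain unfilled:
# Any output here means onprem.values.yaml still has unfilled values.
grep -n "<REQUIRED>" onprem.values.yaml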
Step 7: Install Helm Chart
- Requires 3 × (8 vCPU / 32 GiB) nodes by default
- Processes ~1,800 pages/hour at a maximum concurrency of 10 pages, with a response time of 12–14 seconds
- With HPA enabled: up to ~7,200 pages/hour at a concurrency of 30 pages, with a response time of 15–16 seconds (uses ~9 × 8 vCPU machines)
- Capacity can be further tuned based on the processing modes in use
helm install whisperer oci://us-central1-docker.pkg.dev/pandoras-tamer/charts/llmwhisperer \
--version <version> \
-f /path/to/onprem.values.yaml \
-f /path/to/onprem-profile.values.yaml \
-n $NAMESPACE
Replace <version> with the target release version (see Version History).
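After the install completes, it can be useful to watch the rollout and confirm the release:
# Watch pods come up (Ctrl+C to stop).
kubectl get pods -n $NAMESPACE -w

# Confirm the release status and revision.
helm status whisperer -n $NAMESPACE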
4. Deployment Validation
Health Checks
| Service | Port | Network Type | Endpoint |
|---|---|---|---|
| whisperer-backend | 3006 | HTTP | /health/ping |
| llmwhisperer-dashboard | 3007 | HTTP | /health/ping |
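The endpoints can be probed from a workstation via a port-forward. A sketch, assuming the Kubernetes Services are named after the entries in the table above:
# Forward the backend service locally and hit its health endpoint.
kubectl port-forward svc/whisperer-backend 3006:3006 -n $NAMESPACE &
curl -s http://localhost:3006/health/ping

# Same check for the dashboard.
kubectl port-forward svc/llmwhisperer-dashboard 3007:3007 -n $NAMESPACE &
curl -s http://localhost:3007/health/ping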
Validation Steps
- Check that all pods in the namespace are running without restarts:
  kubectl get pods -n $NAMESPACE
- Validate the ingress configured for both the LLMWhisperer dashboard and the backend
- Log in to the dashboard using the credentials configured in onprem.values.yaml
- Validate the backend API — refer to the API documentation
5. Upgrading
- Configure onprem.values.yaml as required for the target release version
- Run the upgrade command:
helm upgrade whisperer oci://us-central1-docker.pkg.dev/pandoras-tamer/charts/llmwhisperer \
--version <version> \
-f /path/to/onprem.values.yaml \
-f /path/to/onprem-profile.values.yaml \
-n $NAMESPACE
If you are using an AWS ingress and upgrading from a version older than v2.36.0, ensure the following annotation is present: alb.ingress.kubernetes.io/target-type: ip (see Appendix a).
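A sketch for checking and adding the annotation (the ingress name is a placeholder):
# Inspect the current annotations.
kubectl get ingress -n $NAMESPACE -o yaml | grep target-type

# Add the annotation if it is missing.
kubectl annotate ingress <ingress-name> -n $NAMESPACE \
  alb.ingress.kubernetes.io/target-type=ip --overwrite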
6. Admin Login / Onboarding
Once LLMWhisperer is successfully deployed:
- Log in to the LLMWhisperer Dashboard using the INITIAL_PASSWORD configured during installation
- Change the password after first login
Appendix
a. Ingress Configuration
All ingress types must support a 900-second timeout.
AWS ALB Ingress Controller
Required annotation:
# REF: https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/how-it-works/#ip-mode
alb.ingress.kubernetes.io/target-type: ip
Nginx Ingress Controller
Required annotations (Community Version syntax):
# Default is 60. Must be increased to 900.
nginx.ingress.kubernetes.io/proxy-read-timeout: "900"
# Default is 1m. Must be increased for large document uploads.
# REF: https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/
nginx.ingress.kubernetes.io/proxy-body-size: "200m"
Avoid using the nginx.ingress.kubernetes.io/rewrite-target annotation. In Community NGINX Controller versions >= v0.22.0, the old rewrite-target: / syntax causes authentication failures (401 Unauthorized responses). If you encounter login issues, remove any rewrite-target annotations from your ingress configuration.
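To check whether any ingress in the namespace still carries the annotation:
# Any output here means a rewrite-target annotation is still present.
kubectl get ingress -n $NAMESPACE -o yaml | grep rewrite-target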
b. Outgoing Data (OCR Billing)
In an on-prem deployment, the only outgoing data from the OCR containers is billing information sent to Azure for metering purposes. No document content leaves your infrastructure.
You can find details about how Azure container billing works here.
The following billing data is sent to Unstract for license metering:
{
"subscription_id": "<subscription_id:uuid4>",
"deployment_id": "<deployment_id:uuid4>",
"page_count_total": "<total_page_count:int>",
"native_text_page_count": "<non_ocr_page_count:int>",
"low_cost_page_count": "<low_cost_page_count:int>",
"high_quality_page_count": "<high_quality_page_count:int>",
"form_page_count": "<form_page_count:int>",
"from_date": "<timestamp>",
"to_date": "<timestamp>"
}
c. Useful Commands
Kubernetes:
kubectl get pod -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
Helm:
helm list -n <namespace>
helm show values oci://us-central1-docker.pkg.dev/pandoras-tamer/charts/llmwhisperer --version <version>
helm rollback whisperer <revision-number> -n <namespace>
helm uninstall whisperer -n <namespace>