Cloning Organizations

Clone an Unstract organization's configured resources into another organization. The source can be on the same deployment as the target, or on a completely separate one.

Cloned resources: user groups, adapters, connectors, custom tools, prompts, profiles, workflows, tool instances, workflow endpoints, tags, API deployments, pipelines, and Prompt Studio document files. Each resource's sharing state is replicated too — shared users (matched by email on the target), shared groups (matched by name), and the org-wide sharing flag.

The source organization is left untouched.

Users are not cloned

Two reasons:

The same user may not need access in every environment.
The same user may hold different roles across environments.

Groups are cloned (by name; an existing same-name group on the target is reused), but their members are not — for the same reasons. An admin can add the right users to each group per environment, or pass --clone-group-members to add members to cloned groups by email match. Members (and shared users) whose email doesn't exist on the target are skipped and listed as warnings in the report.
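The email-matching rule for members and shared users can be pictured with a small sketch. This is not the tool's code: the function name `match_members` and the data shapes are hypothetical, and the exact string comparison is an assumption (the source does not say whether emails are normalized before matching).

```python
def match_members(source_emails, target_users_by_email):
    """Split source member emails into (matched target user IDs, warnings)."""
    matched, warnings = [], []
    for email in source_emails:
        user_id = target_users_by_email.get(email)  # exact match (assumption)
        if user_id is None:
            # Missing on the target: skip and surface a warning instead of failing.
            warnings.append(f"skipped {email}: no such user on target")
        else:
            matched.append(user_id)
    return matched, warnings


matched, warnings = match_members(
    ["ana@example.com", "raj@example.com"],
    {"ana@example.com": "tgt-user-1"},
)
print(matched)
print(warnings)
```

The point of the design is that a missing user never blocks the clone; it only produces follow-up work for an admin.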

When to use this

Environment promotion. Build in DEV, validate in QA, then promote to PROD. Run the clone after each gate to push the latest validated state forward, and re-run as you iterate — re-runs only create what's missing on the target.
Spinning up a fresh org from a known-good baseline. Copy a curated template org into a new tenant's workspace as a starting point.
Backing up an org's configuration into a sibling org. Configuration only; this isn't a data backup tool.

Note: a clone is several dozen Platform API calls made in dependency order, with UUID remapping between phases. Rather than scripting that yourself, use the unstract.clone module shipped with the official Python client. It is a thin wrapper over the Platform API endpoints documented elsewhere on this site.

Prerequisites

Unstract v0.165.0 or later on both source and target deployments. Earlier versions are missing Platform API endpoints this feature relies on.
An org admin Platform API key for the source organization (read).
An org admin Platform API key for the target organization (read/write).
Python 3.10+ on the machine running the clone. uv is recommended but not required — pip works too (see Install the client).

See Platform API Keys for how to mint keys.

Quickstart

1. Install the client

The clone tool ships with the unstract-client package on PyPI, which exposes an unstract console script with clone as a subcommand.

Pick one of the following, depending on how you manage Python tooling:

# Recommended — install as a global tool with uv (isolated venv, on PATH)
uv tool install unstract-client

# …or with plain pip into an active virtualenv
pip install unstract-client

Verify the install. The following command should print the usage:

unstract clone --help

Using uv add in an existing project

If you'd rather pin the dependency inside an existing uv project (uv add unstract-client), the console script lives in that project's .venv rather than on your global PATH. In that case, prefix every invocation with uv run (e.g. uv run unstract clone --help and uv run unstract clone … in step 2).

2. Run the clone

The block below uses POSIX shell syntax (bash/zsh). On Windows PowerShell, set environment variables with $env:UNSTRACT_SRC_PLATFORM_KEY="..." (and the target equivalent) on separate lines before invoking unstract clone …, and replace the trailing \ line continuations with backticks (`).

UNSTRACT_SRC_PLATFORM_KEY=src_pk_...  \
UNSTRACT_TGT_PLATFORM_KEY=tgt_pk_...  \
unstract clone \
  --source-url https://source.example.com \
  --source-org my-source-org \
  --target-url https://target.example.com \
  --target-org my-target-org

--source-url and --target-url can point to the same deployment or to two different deployments.
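If you drive the clone from Python automation rather than a shell, one possible approach is to build the same command line and pass the keys through the environment. The sketch below uses only the flags and variables documented on this page; the helper names `build_clone_command` and `run_clone` are hypothetical, and nothing is assumed about the CLI's exit codes.

```python
import os
import subprocess


def build_clone_command(source_url, source_org, target_url, target_org, dry_run=False):
    """Return the argv list for `unstract clone`, using only documented flags."""
    argv = [
        "unstract", "clone",
        "--source-url", source_url,
        "--source-org", source_org,
        "--target-url", target_url,
        "--target-org", target_org,
    ]
    if dry_run:
        argv.append("--dry-run")
    return argv


def run_clone(argv, src_key, tgt_key):
    """Run the clone with keys passed via the environment, never via argv."""
    env = dict(
        os.environ,
        UNSTRACT_SRC_PLATFORM_KEY=src_key,
        UNSTRACT_TGT_PLATFORM_KEY=tgt_key,
    )
    return subprocess.run(argv, env=env, check=False)


# Only the command is built here; call run_clone(argv, ...) to execute it.
argv = build_clone_command(
    "https://source.example.com", "my-source-org",
    "https://target.example.com", "my-target-org",
    dry_run=True,
)
print(" ".join(argv))
```

Keeping the keys out of argv means they do not show up in process listings or shell history.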

Key handling

Both keys grant broad access to your data. Run from a trusted machine and rotate both keys after the clone completes.

How it works

The command walks resources in dependency order. Each phase clones one type and records a source_uuid → target_uuid map so later phases can rewrite references before posting.

1. group
2. adapter
3. connector
4. tag
5. custom_tool
6. files
7. workflow
8. tool_instance
9. workflow_endpoint
10. pipeline
11. api_deployment

The group phase runs first because every later phase that replicates a resource's sharing state consumes its name→ID remap when rewriting shared_groups. After each resource is created, the tool re-applies its source sharing state on the target: shared users matched by email (missing users skipped with a warning), shared groups matched by name, and the org-wide sharing flag.

Prompts, profile managers, and the tool-registry republish step ride along inside the custom_tool phase rather than getting their own rows. Their UUID remaps still show up in the run summary (e.g. prompt_studio_registry=N) so dependent phases can rewrite references correctly.
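The remap idea described above can be illustrated with a toy in-memory version. This is a sketch under stated assumptions, not the tool's implementation: the entity shapes, the `adapter_id` field, and the `clone_phase` helper are all hypothetical, and "creating" a resource is simulated by assigning a fresh UUID locally.

```python
import uuid

# Hypothetical source data: a workflow that references an adapter by UUID.
source_adapters = [{"id": "src-adapter-1", "name": "openai-llm"}]
source_workflows = [
    {"id": "src-wf-1", "name": "invoices", "adapter_id": "src-adapter-1"}
]

remap = {}  # source_uuid -> target_uuid, shared across phases


def clone_phase(entities, reference_fields=()):
    """Create each entity on the simulated target, rewriting references first."""
    created = []
    for entity in entities:
        payload = {k: v for k, v in entity.items() if k != "id"}
        for field in reference_fields:
            # Rewrite the source UUID to the target UUID recorded by an earlier phase.
            payload[field] = remap[payload[field]]
        payload["id"] = str(uuid.uuid4())  # the target assigns a fresh UUID
        remap[entity["id"]] = payload["id"]
        created.append(payload)
    return created


# Phases run in dependency order: adapters before the workflows that use them.
target_adapters = clone_phase(source_adapters)
target_workflows = clone_phase(source_workflows, reference_fields=["adapter_id"])
print(target_workflows[0]["adapter_id"] == target_adapters[0]["id"])
```

Running the phases in the opposite order would raise a `KeyError` on the missing remap entry, which is why the ordering matters.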

Each phase corresponds to one or more endpoints already listed on this site — for example the Adapters, Connectors, Prompt Studio, Workflows, API Deployments, and ETL Pipelines sections.

Re-runs are safe

With the default --on-name-conflict adopt, the clone is idempotent. If a phase fails partway, fix the underlying issue and re-run the same command: resources already on the target are detected by name and reused, and only the missing ones are created. A clean re-run after a successful clone does no writes. With --on-name-conflict abort, this property does not hold: a name collision on re-run stops the run instead of adopting.

This is what makes the tool a good fit for iterative environment promotion: run it once after the DEV → QA gate, re-run it after each subsequent DEV change, and only the deltas land on QA.

There is no resume flag and no state file. The target is the state. If you delete a resource on the target between runs, the next run recreates it.
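A minimal sketch of why name-based adoption makes re-runs idempotent, and why `abort` does not. The `ensure` helper, the `NameConflict` exception, and the dictionary standing in for the target org are hypothetical; the real tool talks to the Platform API instead.

```python
class NameConflict(Exception):
    """Raised when a same-name resource exists and the policy is 'abort'."""


def ensure(name, target_by_name, on_name_conflict="adopt"):
    """Return (target_id, action) for one resource, keyed by name."""
    if name in target_by_name:
        if on_name_conflict == "abort":
            raise NameConflict(name)
        return target_by_name[name], "adopted"
    target_id = f"tgt-{len(target_by_name) + 1}"  # stand-in for a create call
    target_by_name[name] = target_id
    return target_id, "created"


target = {}  # the target is the only state; there is no state file
first = [ensure(n, target)[1] for n in ["llm", "vector-db"]]
second = [ensure(n, target)[1] for n in ["llm", "vector-db"]]
print(first)
print(second)
```

The first pass creates both resources and the second pass adopts both, so a repeated run writes nothing. Deleting an entry from `target` between passes would cause it to be created again, matching the behavior described above.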

Files

The Prompt Studio document corpus is the only resource type with bytes on disk. By default, files are downloaded from the source and uploaded to the target in parallel (see --concurrency), with a per-file cap of 25 MB (see --max-file-size).

| `--file-strategy` | Behavior |
| --- | --- |
| `platform_api` (default) | Transfer each file via the Platform API. Files over `--max-file-size` are reported for manual re-upload. |
| `skip` | Skip file bytes entirely. Document records are still created on the target. Equivalent to `--skip-files`. |
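The size cap and the parallel transfer can be sketched as a partition step followed by a thread pool. Everything here is illustrative: `plan_transfers` and `copy_file` are hypothetical names, the file list is made up, and interpreting 25MB as 25 MiB with an inclusive limit is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: 25MB is read as 25 MiB; the tool may use 25 * 10**6 instead.
MAX_FILE_SIZE = 25 * 1024 * 1024


def plan_transfers(files, max_size=MAX_FILE_SIZE):
    """Split (name, size_bytes) pairs into transferable and oversize lists."""
    transfer = [f for f in files if f[1] <= max_size]
    oversize = [f for f in files if f[1] > max_size]
    return transfer, oversize


def copy_file(item):
    """Stand-in for download-from-source followed by upload-to-target."""
    name, _size = item
    return name


files = [("invoice.pdf", 2_000_000), ("scan.tiff", 40_000_000), ("memo.txt", 1_200)]
transfer, oversize = plan_transfers(files)

with ThreadPoolExecutor(max_workers=4) as pool:  # mirrors --concurrency 4
    uploaded = list(pool.map(copy_file, transfer))

print(uploaded)
print([name for name, _ in oversize])  # reported for manual re-upload
```

Oversize files are set aside rather than treated as errors, which is consistent with them being counted as skipped in the report.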

Run during low activity

If users are actively uploading to the source org while the clone is running, you can end up with duplicate file records on the target. Schedule clones for low-activity windows.

Missing files are non-fatal

If a file isn't transferred (skipped, oversize, or mid-run failure), the platform stays usable. Only operations on that specific file (preview, index, prompt run) will error. Re-upload missing files through the UI.

Behavioral notes

Unstract Cloud free-trial adapters are not cloned. On Unstract Cloud, the LLM/embedding/vector-DB/x2text adapters provisioned as part of the free trial are owned by the platform, not the org, and are filtered out of the source listing. Any Prompt Studio project whose default profile references one of these trial adapters is skipped, and that cascades — workflows built on those projects (and their API deployments / pipelines) are skipped too. They show up under "Skipped projects" in the final report. After provisioning your own adapters on the target org, re-run the clone to bring the rest across.
OAuth-backed connectors require re-authorisation on target. Connectors that authenticate via OAuth (e.g. Google Drive) are created on the target without their refresh tokens — the Platform API never exposes OAuth credentials. After the clone, open each OAuth connector on the target and re-connect it before use. Connectors with redacted metadata (e.g. auto-provisioned Unstract Cloud Storage) are skipped entirely and noted in the report.
API deployment keys are regenerated. API deployments and pipelines get a new API key minted on the target. Update any downstream consumers with the new key before cutting over.
Pipelines start paused. Pipelines are created paused on the target so scheduled runs don't fire mid-clone. Unpause them once cut-over is complete. Override with --no-pipelines-paused if you don't want this.
UUIDs are not preserved. Every target resource gets a fresh UUID. Embedded references between resources are rewritten automatically.
Direct storage copies aren't supported. Files always travel through the Platform API. Object-store copy commands (e.g. gsutil cp, aws s3 sync) are out of scope.
The source org is read-only during a clone. No writes go to the source; you can run a clone against a live source org without changing it.
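The trial-adapter cascade in the first note can be sketched as three dependent filters. The record shapes and the `cascade_skips` helper are hypothetical simplifications; in particular, a single `default_adapter` field stands in for the adapters referenced by a project's default profile.

```python
def cascade_skips(projects, workflows, deployments, trial_adapters):
    """Return the names skipped at each level when trial adapters are involved."""
    skipped_projects = {
        p["name"] for p in projects if p["default_adapter"] in trial_adapters
    }
    # A workflow built on a skipped project cannot be cloned either.
    skipped_workflows = {
        w["name"] for w in workflows if w["project"] in skipped_projects
    }
    # The same applies to API deployments and pipelines on a skipped workflow.
    skipped_deployments = {
        d["name"] for d in deployments if d["workflow"] in skipped_workflows
    }
    return skipped_projects, skipped_workflows, skipped_deployments


projects = [
    {"name": "invoices", "default_adapter": "trial-llm"},
    {"name": "contracts", "default_adapter": "own-llm"},
]
workflows = [
    {"name": "invoice-etl", "project": "invoices"},
    {"name": "contract-api", "project": "contracts"},
]
deployments = [
    {"name": "invoice-pipeline", "workflow": "invoice-etl"},
    {"name": "contract-endpoint", "workflow": "contract-api"},
]

skipped = cascade_skips(projects, workflows, deployments, {"trial-llm"})
print(skipped)
```

Once the project is re-pointed at an adapter you own, all three sets become empty, which is why a re-run brings the rest across.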

CLI reference

Environment

| Variable | Required | Purpose |
| --- | --- | --- |
| `UNSTRACT_SRC_PLATFORM_KEY` | yes | Source org admin Platform API key |
| `UNSTRACT_TGT_PLATFORM_KEY` | yes | Target org admin Platform API key |

Flags

| Flag | Default | Purpose |
| --- | --- | --- |
| `--source-url` / `--target-url` | no default | Base URLs of both deployments |
| `--source-org` / `--target-org` | no default | Org slugs |
| `--api-prefix` | `api/v1` | URL prefix for the Platform API |
| `--include` / `--exclude` | all / none | Comma-separated phase names. Use the snake_case identifiers from How it works, e.g. `adapter,workflow,pipeline` |
| `--clone-group-members` | off | Also add members to cloned groups, matched by email on the target. Members missing on the target are skipped and listed as warnings |
| `--dry-run` | off | List actions without writing |
| `--on-name-conflict` | `adopt` | `adopt` reuses existing target resources; `abort` stops on conflict |
| `--file-strategy` | `platform_api` | `platform_api` or `skip` |
| `--max-file-size` | `25MB` | Per-file cap |
| `--skip-files` | off | Alias for `--file-strategy=skip` |
| `--concurrency` | `4` | Per-phase worker count (1–32). `1` is strictly sequential. Each phase fans entity-level work out across this many threads; the files phase uses it for parallel download/upload of documents. Lower it if your deployment is rate-limited or under load. |
| `--pipelines-paused` / `--no-pipelines-paused` | on | Create pipelines paused on target; pass `--no-pipelines-paused` to leave them active |
| `--verbose` | off | Per-entity log lines |
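One plausible reading of `--include` and `--exclude` is a filter over the fixed phase list that keeps dependency order regardless of the order you type the names in. The sketch below illustrates that reading only; `select_phases` is a hypothetical helper, and how the real CLI combines the two flags or reports unknown names is not stated on this page.

```python
# Phase identifiers in dependency order, as listed in How it works.
PHASES = [
    "group", "adapter", "connector", "tag", "custom_tool", "files",
    "workflow", "tool_instance", "workflow_endpoint", "pipeline",
    "api_deployment",
]


def select_phases(include=None, exclude=None):
    """Return the phases to run, always in dependency order."""
    wanted = set(include.split(",")) if include else set(PHASES)
    dropped = set(exclude.split(",")) if exclude else set()
    unknown = (wanted | dropped) - set(PHASES)
    if unknown:
        raise ValueError(f"unknown phase name(s): {sorted(unknown)}")
    return [p for p in PHASES if p in wanted and p not in dropped]


print(select_phases(include="pipeline,adapter,workflow"))
print(select_phases(exclude="files"))
```

Note that including a phase without the phases it depends on only works if the dependencies already exist on the target from an earlier run.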

What you'll see at the end

A CloneReport is printed at the end of every run. It opens with the source and target endpoints, then a per-phase table, then any files that need manual follow-up, a remap summary, and a status footer.

Column meanings: Created = new on target, Adopted = already existed on target and reused, Skipped = intentionally not cloned (including oversize files), Failed = errored, Time = wall time the phase took.

Source: org_L4hharroun0ZE01l @ https://globe.unstract.com
Target: org_HJRi1wVgyB4IpVZT @ https://onpremha.globe.unstract.com
                              Clone Report
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━┓
┃ Phase                ┃ Created ┃ Adopted ┃ Skipped ┃ Failed ┃ Time ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━┩
│ group                │       0 │       2 │       0 │      0 │ 0.2s │
│ adapter              │       0 │       4 │       0 │      0 │ 0.9s │
│ connector            │       0 │       0 │       1 │      0 │ 0.2s │
│ tag                  │       0 │       0 │       0 │      0 │ 0.1s │
│ custom_tool          │       0 │       1 │       4 │      0 │ 0.7s │
│ files                │       0 │       0 │       1 │      0 │ 0.3s │
│ workflow             │       0 │       1 │       0 │      0 │ 0.3s │
│ tool_instance        │       0 │       1 │       0 │      0 │ 0.2s │
│ workflow_endpoint    │       2 │       0 │       0 │      0 │ 0.3s │
│ pipeline             │       0 │       0 │       0 │      0 │ 0.1s │
│ api_deployment       │       0 │       1 │       0 │      0 │ 0.2s │
├──────────────────────┼─────────┼─────────┼─────────┼────────┼──────┤
│ TOTAL                │       2 │      10 │       6 │      0 │ 3.6s │
└──────────────────────┴─────────┴─────────┴─────────┴────────┴──────┘
Remap entries: group=2, adapter=4, custom_tool=1, prompt_studio_registry=1, workflow=1, tool_instance=1, workflow_endpoint=2, api_deployment=1
Completed successfully

Each phase that runs gets a row regardless of count — a row of zeros means the phase ran but found nothing to do. Remap entries only list entities that actually picked up a UUID mapping, so tag and files are absent here.

When a run isn't this clean, extra sections appear above the status footer:

Warnings. A Warnings (non-fatal; operator follow-up may be needed) block lists per-phase warnings — e.g. group members or shared users whose email doesn't exist on the target and were skipped.
Files needing manual follow-up. Files uploaded: N, then one or more of Oversize files (manual upload required), Unsupported mime files (manual upload required), Skipped files (operator action required), and Failed files. Each row identifies the tool and file so you can act without scrolling back through the run log. Oversize and unsupported-mime files count under Skipped in the per-phase row.
Failures summary. A Failures (see WARNING/ERROR log lines above for full detail) block lists each errored entity by phase, capped at 30 rows and 200 characters per line — enough to triage without drowning the report.
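The caps on the failures summary (30 rows, 200 characters per line) amount to a simple truncation, sketched below. The `summarize_failures` helper and the `phase: message` row format are hypothetical; the page does not say how truncated rows or hidden entries are marked in the real report.

```python
MAX_ROWS = 30    # rows shown in the failures summary
MAX_CHARS = 200  # characters kept per row


def summarize_failures(failures):
    """Return (rows, hidden_count) for a list of (phase, message) pairs."""
    rows = [
        f"{phase}: {message}"[:MAX_CHARS]
        for phase, message in failures[:MAX_ROWS]
    ]
    hidden = max(0, len(failures) - MAX_ROWS)
    return rows, hidden


failures = [("adapter", "x" * 500)] * 35
rows, hidden = summarize_failures(failures)
print(len(rows), len(rows[0]), hidden)
```

For the full text of a truncated entry, fall back to the WARNING/ERROR log lines printed earlier in the run.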

The status footer reads Completed successfully on a clean run, Completed with N failure(s) — see WARNING/ERROR log lines above for details when phases failed, or ABORTED: <reason> when the run was halted (e.g. --on-name-conflict abort).

The same data is also available programmatically via report.as_dict() if you wrap the command in your own automation.

Recovering from a failure

Read the printed report — completed phases and the entity that failed are both listed.
Fix the underlying issue (network, permissions, missing credentials, oversized payload, etc.).
Re-run the same command.

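There is no --resume-from flag. Because each phase first checks the target for an existing resource of the same name, a re-run with the default --on-name-conflict adopt is safe and cheap.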

Related

Cascading Workflows: promote a single Prompt Studio project or deployment across environments. Use this when you only need to move one project, not the whole org.
