Cloning Organizations
Clone an Unstract organization's configured resources into another organization. The source can be on the same deployment as the target, or on a completely separate one.
Cloned resources: adapters, connectors, custom tools, prompts, profiles, workflows, tool instances, workflow endpoints, tags, API deployments, pipelines, and Prompt Studio document files.
The source organization is left untouched.
Users are not cloned, for two reasons:
- The same user may not need access in every environment.
- The same user may hold different roles across environments.
Groups will be cloned (upcoming — not yet implemented). Once available, an admin can add the right users to each group per environment.
When to use this
- Environment promotion. Build in DEV, validate in QA, then promote to PROD. Run the clone after each gate to push the latest validated state forward, and re-run as you iterate — re-runs only create what's missing on the target.
- Spinning up a fresh org from a known-good baseline. Copy a curated template org into a new tenant's workspace as a starting point.
- Backing up an org's configuration into a sibling org. Configuration only; this isn't a data backup tool.
A clone is several dozen Platform API calls in dependency order, with UUID remapping between phases. Rather than scripting that yourself, use the unstract.clone module shipped with the official Python client — it is a thin wrapper over the Platform API endpoints documented elsewhere on this page.
Prerequisites
- Unstract v0.165.0 or later on both source and target deployments. Earlier versions are missing Platform API endpoints this feature relies on.
- An org admin Platform API key for the source organization (read).
- An org admin Platform API key for the target organization (read/write).
- Python 3.10+ on the machine running the clone.
uv is recommended but not required; pip works too (see Install the client).
See Platform API Keys for how to mint keys.
Quickstart
1. Install the client
The clone tool ships as the clone extra of the unstract-client package on PyPI. Installing the extra exposes an unstract-clone console script.
Pick one of the following, depending on how you manage Python tooling:
# Recommended — install as a global tool with uv (isolated venv, on PATH)
uv tool install 'unstract-client[clone]'
# …or with plain pip into an active virtualenv
pip install 'unstract-client[clone]'
Verify the install; the command below should print the usage text:
unstract-clone --help
uv add in an existing project
If you'd rather pin the dependency inside an existing uv project (uv add 'unstract-client[clone]'), the console script lives in that project's .venv rather than on your global PATH. In that case, prefix every invocation with uv run (e.g. uv run unstract-clone --help and uv run unstract-clone clone … in step 2).
2. Run the clone
The block below uses POSIX shell syntax (bash/zsh). On Windows PowerShell, set environment variables with $env:UNSTRACT_SRC_PLATFORM_KEY="..." (and the target equivalent) on separate lines before invoking unstract-clone …, and replace the trailing \ line continuations with backticks (`).
UNSTRACT_SRC_PLATFORM_KEY=src_pk_... \
UNSTRACT_TGT_PLATFORM_KEY=tgt_pk_... \
unstract-clone clone \
--source-url https://source.example.com \
--source-org my-source-org \
--target-url https://target.example.com \
--target-org my-target-org
--source-url and --target-url can point to the same deployment or to two different deployments.
Both keys grant broad access to your data. Run from a trusted machine and rotate both keys after the clone completes.
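If you drive the clone from automation rather than an interactive shell, a thin wrapper around the console script keeps the keys out of the command line. The sketch below is only an illustration: it uses the flags and environment variables documented on this page, but the helper names `build_clone_command` and `run_clone` are hypothetical, and the meaning of the CLI's exit code is an assumption since it is not documented here.

```python
import os
import subprocess


def build_clone_command(source_url, source_org, target_url, target_org, dry_run=False):
    """Assemble the argv for `unstract-clone clone` from the documented flags."""
    cmd = [
        "unstract-clone", "clone",
        "--source-url", source_url,
        "--source-org", source_org,
        "--target-url", target_url,
        "--target-org", target_org,
    ]
    if dry_run:
        cmd.append("--dry-run")  # list actions without writing
    return cmd


def run_clone(cmd, src_key, tgt_key):
    """Run the CLI, passing both keys through the environment only.

    Returns the process exit code. Treating non-zero as failure is an
    assumption; check the printed report for the authoritative status.
    """
    env = dict(
        os.environ,
        UNSTRACT_SRC_PLATFORM_KEY=src_key,
        UNSTRACT_TGT_PLATFORM_KEY=tgt_key,
    )
    return subprocess.run(cmd, env=env, check=False).returncode


# Building the command has no side effects; nothing is executed here.
example = build_clone_command(
    "https://source.example.com", "my-source-org",
    "https://target.example.com", "my-target-org",
    dry_run=True,
)
print(" ".join(example))
```

Passing the keys via `env` rather than as arguments keeps them out of shell history and process listings on most systems.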
How it works
The command walks resources in dependency order. Each phase clones one type and records a source_uuid → target_uuid map so later phases can rewrite references before posting.
1. adapter
2. connector
3. tag
4. custom_tool
5. files
6. workflow
7. tool_instance
8. workflow_endpoint
9. pipeline
10. api_deployment
Prompts, profile managers, and the tool-registry republish step ride along inside the custom_tool phase rather than getting their own rows. Their UUID remaps still show up in the run summary (e.g. prompt_studio_registry=N) so dependent phases can rewrite references correctly.
Each phase corresponds to one or more endpoints already listed on this site — for example the Adapters, Connectors, Prompt Studio, Workflows, API Deployments, and ETL Pipelines sections.
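To make the remapping idea concrete, here is a toy sketch of how a source_uuid → target_uuid map recorded by an earlier phase can be used to rewrite references in a later payload. It is not the tool's actual code: the function name `rewrite_refs`, the payload shape, and the placeholder IDs are all invented for illustration.

```python
def rewrite_refs(payload, remap):
    """Return a copy of payload with every known source UUID replaced.

    Walks dicts and lists recursively; any string value found in `remap`
    is swapped for its target UUID, everything else is left unchanged.
    """
    if isinstance(payload, dict):
        return {key: rewrite_refs(value, remap) for key, value in payload.items()}
    if isinstance(payload, list):
        return [rewrite_refs(item, remap) for item in payload]
    if isinstance(payload, str):
        return remap.get(payload, payload)
    return payload


# An earlier phase (say, adapter) records one mapping per cloned entity.
remap = {"src-adapter-1": "tgt-adapter-9"}

# A later phase rewrites its payload before posting it to the target.
# Field names here are hypothetical.
workflow = {"name": "invoices", "llm_adapter": "src-adapter-1", "tags": ["src-tag-x"]}
rewritten = rewrite_refs(workflow, remap)
print(rewritten)
```

References with no recorded mapping (such as `src-tag-x` above) pass through untouched in this sketch, which is why phases have to run in dependency order.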
Re-runs are safe
At the default --on-name-conflict adopt, the clone is idempotent. If a phase fails partway, fix the underlying issue and re-run the same command — resources already on the target are detected by name and reused; only the missing ones are created. A clean re-run after a successful clone does no writes. With --on-name-conflict abort, this property does not hold — a name collision on re-run stops the run instead of adopting.
This is what makes the tool a good fit for iterative environment promotion: run it once after the DEV → QA gate, re-run it after each subsequent DEV change, and only the deltas land on QA.
There is no resume flag and no state file. The target is the state. If you delete a resource on the target between runs, the next run recreates it.
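The adopt-by-name behavior described above can be modeled in a few lines. This is a simplified sketch under the assumption that names are compared exactly; `plan_phase` and `NameConflict` are hypothetical names, not part of the client.

```python
class NameConflict(Exception):
    """Raised when a name already exists on the target and the policy is abort."""


def plan_phase(source_names, target_names, on_name_conflict="adopt"):
    """Split source entities into those to create and those to adopt.

    With "adopt", names already on the target are reused; with "abort",
    the first collision stops the run.
    """
    existing = set(target_names)
    created, adopted = [], []
    for name in source_names:
        if name in existing:
            if on_name_conflict == "abort":
                raise NameConflict(name)
            adopted.append(name)
        else:
            created.append(name)
    return created, adopted


# First run: only "c" exists on the target.
first = plan_phase(["a", "b", "c"], ["c"])
# Re-run after success: everything is adopted, nothing is created.
second = plan_phase(["a", "b", "c"], ["a", "b", "c"])
print(first, second)
```

Because the target listing is the only input, there is nothing to persist between runs, which matches the "target is the state" model.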
Files
The Prompt Studio document corpus is the only resource type with bytes on disk. By default files are downloaded from source and uploaded to target in parallel (see --concurrency), capped at 25 MB per file.
| --file-strategy | Behavior |
|---|---|
| platform_api (default) | Transfer each file via the Platform API. Files over --max-file-size are reported for manual re-upload. |
| skip | Skip file bytes entirely. Document records are still created on the target. Equivalent to --skip-files. |
If users are actively uploading to the source org while the clone is running, you can end up with duplicate file records on the target. Schedule clones for low-activity windows.
If a file isn't transferred (skipped, oversize, or mid-run failure), the platform stays usable. Only operations on that specific file (preview, index, prompt run) will error. Re-upload missing files through the UI.
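As a rough illustration of the two strategies, the sketch below sorts a list of files into those that would be transferred, those reported as oversize, and those skipped. The function name `partition_files` is hypothetical, and reading 25 MB as 25 × 1024 × 1024 bytes is an assumption; the tool may count the cap differently.

```python
MIB = 1024 * 1024  # assumption: the cap is measured in binary megabytes


def partition_files(files, max_bytes=25 * MIB, strategy="platform_api"):
    """Classify (name, size_in_bytes) pairs according to the file strategy."""
    if strategy not in ("platform_api", "skip"):
        raise ValueError(f"unknown file strategy: {strategy}")
    plan = {"transfer": [], "oversize": [], "skipped": []}
    for name, size in files:
        if strategy == "skip":
            plan["skipped"].append(name)   # record only, no bytes moved
        elif size > max_bytes:
            plan["oversize"].append(name)  # reported for manual re-upload
        else:
            plan["transfer"].append(name)
    return plan


files = [("a.pdf", 1 * MIB), ("b.pdf", 25 * MIB), ("c.pdf", 25 * MIB + 1)]
print(partition_files(files))
```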
Behavioral notes
- Unstract Cloud free-trial adapters are not cloned. On Unstract Cloud, the LLM/embedding/vector-DB/x2text adapters provisioned as part of the free trial are owned by the platform, not the org, and are filtered out of the source listing. Any Prompt Studio project whose default profile references one of these trial adapters is skipped, and that cascades — workflows built on those projects (and their API deployments / pipelines) are skipped too. They show up under "Skipped projects" in the final report. After provisioning your own adapters on the target org, re-run the clone to bring the rest across.
- OAuth-backed connectors require re-authorisation on target. Connectors that authenticate via OAuth (e.g. Google Drive) are created on the target without their refresh tokens — the Platform API never exposes OAuth credentials. After the clone, open each OAuth connector on the target and re-connect it before use. Connectors with redacted metadata (e.g. auto-provisioned Unstract Cloud Storage) are skipped entirely and noted in the report.
- API deployment keys are regenerated. API deployments and pipelines get a new API key minted on the target. Update any downstream consumers with the new key before cutting over.
- Pipelines start paused. Pipelines are created paused on the target so scheduled runs don't fire mid-clone. Unpause them once cut-over is complete. Override with --no-pipelines-paused if you don't want this.
- UUIDs are not preserved. Every target resource gets a fresh UUID. Embedded references between resources are rewritten automatically.
- Direct storage copies aren't supported. Files always travel through the Platform API. Object-store copy commands (e.g. gsutil cp, aws s3 sync) are out of scope.
- The source org is read-only during a clone. No writes go to the source; you can run a clone against a live source org without changing it.
CLI reference
Environment
| Variable | Required | Purpose |
|---|---|---|
| UNSTRACT_SRC_PLATFORM_KEY | yes | Source org admin Platform API key |
| UNSTRACT_TGT_PLATFORM_KEY | yes | Target org admin Platform API key |
Flags
| Flag | Default | Purpose |
|---|---|---|
| --source-url / --target-url | — | Base URLs of both deployments |
| --source-org / --target-org | — | Org slugs |
| --api-prefix | api/v1 | URL prefix for the Platform API |
| --include / --exclude | all / none | Comma-separated phase names. Use the snake_case identifiers from How it works, e.g. adapter,workflow,pipeline |
| --dry-run | off | List actions without writing |
| --on-name-conflict | adopt | adopt reuses existing target resources; abort stops on conflict |
| --file-strategy | platform_api | platform_api or skip |
| --max-file-size | 25MB | Per-file cap |
| --skip-files | off | Alias for --file-strategy=skip |
| --concurrency | 4 | Per-phase worker count (1–32). 1 is strictly sequential. Each phase fans entity-level work out across this many threads; the files phase uses it for parallel download/upload of documents. Lower it if your deployment is rate-limited or under load. |
| --pipelines-paused / --no-pipelines-paused | on | Create pipelines paused on target; pass --no-pipelines-paused to leave them active |
| --verbose | off | Per-entity log lines |
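When scripting partial clones, it can help to validate phase names before invoking the CLI. The helper below is a hypothetical sketch: `include_arg` is not part of the client, the phase list is copied from How it works, and sorting the names into dependency order is cosmetic (whether the CLI cares about the order is not documented here).

```python
# Phase identifiers as listed under "How it works", in dependency order.
PHASES = [
    "adapter", "connector", "tag", "custom_tool", "files",
    "workflow", "tool_instance", "workflow_endpoint",
    "pipeline", "api_deployment",
]


def include_arg(phases):
    """Build the --include argument pair, rejecting unknown phase names."""
    unknown = [p for p in phases if p not in PHASES]
    if unknown:
        raise ValueError(f"unknown phase(s): {', '.join(unknown)}")
    ordered = sorted(set(phases), key=PHASES.index)  # dedupe, then order
    return ["--include", ",".join(ordered)]


print(include_arg(["pipeline", "adapter", "workflow"]))
```

The returned pair can be appended to the argument list of an `unstract-clone clone` invocation.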
What you'll see at the end
A CloneReport is printed at the end of every run. It opens with the source and target endpoints, then a per-phase table, then any files that need manual follow-up, a remap summary, and a status footer.
Column meanings: Created = new on target, Adopted = already existed on target and reused, Skipped = intentionally not cloned (including oversize files), Failed = errored, Time = wall time the phase took.
Source: org_L4hharroun0ZE01l @ https://globe.unstract.com
Target: org_HJRi1wVgyB4IpVZT @ https://onpremha.globe.unstract.com
Clone Report
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━┓
┃ Phase ┃ Created ┃ Adopted ┃ Skipped ┃ Failed ┃ Time ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━┩
│ adapter │ 0 │ 4 │ 0 │ 0 │ 0.9s │
│ connector │ 0 │ 0 │ 1 │ 0 │ 0.2s │
│ tag │ 0 │ 0 │ 0 │ 0 │ 0.1s │
│ custom_tool │ 0 │ 1 │ 4 │ 0 │ 0.7s │
│ files │ 0 │ 0 │ 1 │ 0 │ 0.3s │
│ workflow │ 0 │ 1 │ 0 │ 0 │ 0.3s │
│ tool_instance │ 0 │ 1 │ 0 │ 0 │ 0.2s │
│ workflow_endpoint │ 2 │ 0 │ 0 │ 0 │ 0.3s │
│ pipeline │ 0 │ 0 │ 0 │ 0 │ 0.1s │
│ api_deployment │ 0 │ 1 │ 0 │ 0 │ 0.2s │
├──────────────────────┼─────────┼─────────┼─────────┼────────┼──────┤
│ TOTAL │ 2 │ 8 │ 6 │ 0 │ 3.4s │
└──────────────────────┴─────────┴─────────┴─────────┴────────┴──────┘
Remap entries: adapter=4, custom_tool=1, prompt_studio_registry=1, workflow=1, tool_instance=1, workflow_endpoint=2, api_deployment=1
Completed successfully
Each phase that runs gets a row regardless of count — a row of zeros means the phase ran but found nothing to do. Remap entries only list entities that actually picked up a UUID mapping, so tag and files are absent here.
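If you only have the captured console output, the Remap entries line is simple to parse. The sketch below assumes the exact `name=count` comma-separated format shown in the sample report; `parse_remap_entries` is a hypothetical helper, and the format itself could change between releases.

```python
def parse_remap_entries(line):
    """Parse 'Remap entries: a=1, b=2' into {'a': 1, 'b': 2}."""
    prefix = "Remap entries:"
    if not line.startswith(prefix):
        raise ValueError("not a remap summary line")
    entries = {}
    for item in line[len(prefix):].split(","):
        item = item.strip()
        if not item:
            continue  # tolerate an empty summary or trailing comma
        name, _, count = item.partition("=")
        entries[name] = int(count)
    return entries


sample = (
    "Remap entries: adapter=4, custom_tool=1, prompt_studio_registry=1, "
    "workflow=1, tool_instance=1, workflow_endpoint=2, api_deployment=1"
)
remap_counts = parse_remap_entries(sample)
print(remap_counts)
```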
When a run isn't this clean, two extra sections appear above the status footer:
- Files needing manual follow-up. Files uploaded: N, followed by one or more of Oversize files (manual upload required), Unsupported mime files (manual upload required), Skipped files (operator action required), and Failed files. Each row identifies the tool and file so you can act without scrolling back through the run log. Oversize and unsupported-mime files count under Skipped in the per-phase row.
- Failures summary. A Failures (see WARNING/ERROR log lines above for full detail) block lists each errored entity by phase, capped at 30 rows and 200 characters per line, which is enough to triage without drowning the report.
The status footer reads Completed successfully on a clean run, Completed with N failure(s) — see WARNING/ERROR log lines above for details when phases failed, or ABORTED: <reason> when the run was halted (e.g. --on-name-conflict abort).
The same data is also available programmatically via report.as_dict() if you wrap the command in your own automation.
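Where the report object is not available (for example, when a CI job only keeps the log), the status footer can be classified from its text. This is a sketch keyed to the three footer wordings quoted above; `classify_footer` is a hypothetical helper, and the structure returned by report.as_dict() is deliberately not assumed here.

```python
import re


def classify_footer(line):
    """Map a status footer line to (status, detail).

    status is "ok", "failed", "aborted", or "unknown"; detail is the
    failure count, the abort reason, or None.
    """
    line = line.strip()
    if line == "Completed successfully":
        return ("ok", None)
    match = re.match(r"Completed with (\d+) failure\(s\)", line)
    if match:
        return ("failed", int(match.group(1)))
    if line.startswith("ABORTED:"):
        return ("aborted", line[len("ABORTED:"):].strip())
    return ("unknown", None)


print(classify_footer("Completed successfully"))
```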
Recovering from a failure
- Read the printed report — completed phases and the entity that failed are both listed.
- Fix the underlying issue (network, permissions, missing credentials, oversized payload, etc.).
- Re-run the same command.
There is no --resume-from flag. The pre-checks make a re-run safe and cheap.
Related
- Cascading Workflows — promote a single Prompt Studio project or deployment across environments. Use this when you only need to move one project, not the whole org.