Setting up Human in the Loop (HITL) ETL

Human-in-the-Loop (HITL) ETL extends standard ETL pipelines by adding manual review and approval capabilities for extracted data before it reaches the destination database.

Prerequisites

  1. Export a Prompt Studio project as a tool

  2. Configure your source filesystem connector

  3. Configure your destination database connector

Creating the HITL Workflow

Step 1: Create Workflow with HITL Configuration

Follow the Workflow Setup Guide to create your workflow:

  1. Navigate to Workflows in the side navigation
  2. Create a new workflow
  3. Configure source, destination, and select your exported tool
  4. Enable HITL: In the destination connector configuration, access the Human in the Loop tab
  5. Configure HITL rules and settings:
    • Percentage of files sent to Manual Review: Sets the share of processed files routed to HITL (Human-in-the-Loop). For example, if the percentage is set to 50 and 2 files are processed, 1 file is sent to HITL and the other is sent directly to the Destination DB.
    • Rule Logic: Choose AND or OR to control how multiple rules are combined.
    • Add Rule: Defines a condition on an extracted value. All prompt keys used in Prompt Studio are listed, so a condition can be set on any of them. For example, if the key is 'Name', you can set a rule such as 'Name starts with Joseph'. Only files that satisfy the condition are sent to HITL.
    • After approval, send result to: Selects where approved results are sent (Destination DB or Queue).
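
The settings above can be sketched in code. This is a minimal illustration, not Unstract's implementation: the Rule type, the operator names (only "starts with" appears in the example above; "equals" and the 'City' key are hypothetical), the treatment of an empty rule list, and the rounding of the percentage are all assumptions made for the example. It also does not model how the percentage and the rules interact, since that is not described here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical operator table. Only "starts_with" mirrors the example in the
# text; "equals" is added purely so the AND/OR difference can be shown.
OPERATORS: Dict[str, Callable[[str, str], bool]] = {
    "starts_with": lambda value, operand: value.startswith(operand),
    "equals": lambda value, operand: value == operand,
}


@dataclass
class Rule:
    key: str       # prompt key from Prompt Studio, e.g. "Name"
    operator: str  # name of an entry in OPERATORS
    operand: str   # value the extracted result is compared against


def rule_matches(rule: Rule, extracted: Dict[str, str]) -> bool:
    """Return True if the extracted value for rule.key satisfies the rule."""
    value = extracted.get(rule.key)
    if value is None:
        return False
    return OPERATORS[rule.operator](value, rule.operand)


def matches_rules(rules: List[Rule], logic: str, extracted: Dict[str, str]) -> bool:
    """Combine individual rule results with AND or OR."""
    if not rules:
        # Assumption: with no rules configured, nothing matches.
        return False
    results = [rule_matches(rule, extracted) for rule in rules]
    if logic == "AND":
        return all(results)
    if logic == "OR":
        return any(results)
    raise ValueError(f"unknown rule logic: {logic!r}")


def expected_review_count(total_files: int, percentage: float) -> int:
    """Number of files expected in manual review (rounding is an assumption)."""
    return round(total_files * percentage / 100)


rules = [Rule("Name", "starts_with", "Joseph"), Rule("City", "equals", "Paris")]
document = {"Name": "Joseph Smith", "City": "London"}
print(matches_rules(rules, "AND", document))  # False: the City rule fails
print(matches_rules(rules, "OR", document))   # True: the Name rule passes
print(expected_review_count(2, 50))           # 1, as in the example above
```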

For detailed HITL configuration options, see the Workflow Overview guide.

Step 2: Deploy as ETL

Click Deploy as ETL to create your HITL-enabled ETL pipeline.

Once deployed, files matching your HITL rules will be routed to the review queue.

User Roles and Permissions

Invite users with appropriate access rights:

  • Unstract Admin - Full access to Human Quality Review
  • Unstract User - No access to Human Quality Review
  • Unstract Reviewer - Can review documents in the queue
  • Unstract Supervisor - Can review and approve documents

Reviewing Documents

Accessing the Review Queue

  1. Navigate to Human Quality Review in the side navigation
  2. Select the project from the dropdown
  3. Click Fetch Next to retrieve the next document for review
  4. Click Queue Details to view queue status

Interacting with Results

  • Single-click on a result to highlight the corresponding text in the document
  • Double-click on a result to edit it
  • Click Finish Review to send the document to the approval queue

Approving Documents

Reviewed files move to the Approval Queue where supervisors can:

  1. Review the extracted data
  2. Highlight and verify results
  3. Make final edits if needed
  4. Approve the document for final processing
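
The path a document takes through review and approval can be pictured as a small state machine. The sketch below is illustrative only: the stage names and action strings are hypothetical labels for the Finish Review and approve steps described in this guide, not identifiers from Unstract.

```python
from enum import Enum


class Stage(Enum):
    REVIEW_QUEUE = "review queue"
    APPROVAL_QUEUE = "approval queue"
    APPROVED = "approved"


# Allowed transitions, keyed by (current stage, action).
# "finish_review" and "approve" are hypothetical action names.
TRANSITIONS = {
    (Stage.REVIEW_QUEUE, "finish_review"): Stage.APPROVAL_QUEUE,
    (Stage.APPROVAL_QUEUE, "approve"): Stage.APPROVED,
}


def advance(stage: Stage, action: str) -> Stage:
    """Return the next stage, or raise if the action is not valid here."""
    try:
        return TRANSITIONS[(stage, action)]
    except KeyError:
        raise ValueError(
            f"action {action!r} is not valid in stage {stage.value!r}"
        ) from None


stage = Stage.REVIEW_QUEUE
stage = advance(stage, "finish_review")  # reviewer clicks Finish Review
stage = advance(stage, "approve")        # supervisor approves
print(stage.value)                       # approved
```

Once a document is approved, its result goes to the destination chosen under "After approval, send result to".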

Retrieving Approved Results

Option 1: Destination Database

If you configured "After approval, send result to" as "Destination DB", approved results are automatically inserted into your configured database table.

Option 2: Queue (API Retrieval)

If you configured the approval destination as "Queue", retrieve results using the API endpoint:

curl --location 'https://us-central.unstract.com/mr/api/<organization_id>/approved/result/<class_id>/' \
--header 'Authorization: Bearer <api_key>'
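
For scripted retrieval, the same request can be made from Python's standard library. This is a rough equivalent of the curl command above, with the same three placeholders passed as arguments. The function names are hypothetical, and parsing the body as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://us-central.unstract.com/mr/api"


def build_request(organization_id: str, class_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request for approved results without sending it."""
    url = (
        f"{BASE_URL}/{quote(organization_id, safe='')}"
        f"/approved/result/{quote(class_id, safe='')}/"
    )
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def fetch_approved_result(organization_id: str, class_id: str, api_key: str,
                          timeout: float = 30.0):
    """Send the request and decode the response body."""
    request = build_request(organization_id, class_id, api_key)
    # urlopen follows redirects by default, similar to curl's --location.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    # Assumption: the endpoint returns JSON.
    return json.loads(body)
```

Call fetch_approved_result with your organization ID, class ID, and API key, obtained as described below.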

Getting Required Parameters:

Organization ID:

The organization ID is available in your ETL endpoint URL.

API Key:

Create an API key for Human Review:

  1. Navigate to Human Quality Review
  2. Click Create API Key

Class ID:

Get the class ID from Download and Sync Manager:

  1. Click the Profile Icon, then select Download and Sync Manager
  2. Find your class ID in the list
