Version: 2.0.0

LLMWhisperer Python Client

Note: This documentation is for V2 of the LLMWhisperer API. The corresponding Python client version is 2.x.y. V1 and V2 are not backward compatible.

This Python client provides a simple and efficient way to interact with the LLMWhisperer API. LLMWhisperer presents data from complex documents (of varying designs and formats) to LLMs in a form they can best understand.

Features

  • Easy to use Pythonic interface.
  • Handles all the HTTP requests and responses for you.
  • Raises Python exceptions for API errors.

Installation

You can install the LLMWhisperer Python Client using pip:

pip install llmwhisperer-client

Environment Variables

Variable | Description
--- | ---
LLMWHISPERER_BASE_URL_V2 | The base URL of the API. When left undefined, the default https://llmwhisperer-api.us-central.unstract.com/api/v2 is used.
LLMWHISPERER_API_KEY | The API key to use for authenticating requests to the API.
LLMWHISPERER_LOGGING_LEVEL | The logging level to use. Possible values are ERROR, WARN, INFO, DEBUG.

All environment variables are optional. If LLMWHISPERER_API_KEY is not set, you must provide the API key when creating a new client. The environment variables can be overridden by providing the values in the client constructor.
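
The precedence described above (constructor argument first, then environment variable, then built-in default) can be illustrated with a small standalone sketch. The helper `resolve_setting` is hypothetical and not part of the client; it only mirrors the documented behaviour, and it assumes that an empty value counts as "not provided".

```python
import os

DEFAULT_BASE_URL = "https://llmwhisperer-api.us-central.unstract.com/api/v2"


def resolve_setting(explicit, env_name, default=None):
    """Hypothetical helper: explicit value > environment variable > default."""
    if explicit:  # assumption: None or "" means the caller did not provide a value
        return explicit
    return os.environ.get(env_name, default)


# Environment variable set, no explicit argument: the environment value is used.
os.environ["LLMWHISPERER_API_KEY"] = "key_from_env"
env_key = resolve_setting(None, "LLMWHISPERER_API_KEY")

# An explicit argument overrides the environment variable.
explicit_key = resolve_setting("key_from_code", "LLMWHISPERER_API_KEY")

# Variable unset: fall back to the documented default base URL.
os.environ.pop("LLMWHISPERER_BASE_URL_V2", None)
base_url = resolve_setting(None, "LLMWHISPERER_BASE_URL_V2", DEFAULT_BASE_URL)

print(env_key, explicit_key, base_url)
```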

Usage

First, import the LLMWhispererClientV2 from the client module:

from unstract.llmwhisperer import LLMWhispererClientV2

Then, create an instance of the LLMWhispererClientV2:

# All parameters are optional when environment variables are set
client = LLMWhispererClientV2()

or

# Provide the base URL and API key explicitly
client = LLMWhispererClientV2(base_url="https://llmwhisperer-api.us-central.unstract.com/api/v2", api_key="your_api_key")

Now, you can use the client to interact with the LLMWhisperer API:

# Get usage info
usage_info = client.get_usage_info()

# Process a document in async mode
# The client returns a whisper hash, which can be used to check the status and retrieve the result
whisper = client.whisper(file_path="path_to_your_file")

# The hash is available in the 'whisper_hash' field of the result of the whisper operation
whisper_hash = whisper["whisper_hash"]

# Get the status of a whisper operation
status = client.whisper_status(whisper_hash)

# Retrieve the result of a whisper operation
whisper = client.whisper_retrieve(whisper_hash)

# Or, call the whisper method in sync mode
# The client will wait for the extraction to complete and return the result
whisper = client.whisper(
    file_path="path_to_your_file",
    wait_for_completion=True,
    wait_timeout=200,
)

Error Handling

The client raises LLMWhispererClientException for API errors:

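# Import path assumed from the client package layout; adjust if your version differs
from unstract.llmwhisperer.client_v2 import LLMWhispererClientException

try:
    result = client.whisper_retrieve("invalid_hash")
except LLMWhispererClientException as e:
    print(f"Error: {e.message}, Status Code: {e.status_code}")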

Typical usage

In the default async mode, the client returns a whisper hash, which can be used to check the status and retrieve the result.

import time  # needed for time.sleep() below

client = LLMWhispererClientV2()
try:
    result = client.whisper(
        file_path="sample_files/credit_card.pdf",
    )
    if result["status_code"] == 202:
        print("Whisper request accepted.")
        print(f"Whisper hash: {result['whisper_hash']}")
        while True:
            print("Polling for whisper status...")
            status = client.whisper_status(whisper_hash=result["whisper_hash"])
            if status["status"] == "processing":
                print("STATUS: processing...")
            elif status["status"] == "delivered":
                print("STATUS: Already delivered!")
                break
            elif status["status"] == "unknown":
                print("STATUS: unknown...")
                break
            elif status["status"] == "processed":
                print("STATUS: processed!")
                print("Let's retrieve the result of the extraction...")
                resultx = client.whisper_retrieve(
                    whisper_hash=result["whisper_hash"]
                )
                # Refer to documentation for result format
                print(resultx)
                break
            # Poll every 5 seconds
            time.sleep(5)
except LLMWhispererClientException as e:
    print(e)

Alternatively, call the whisper method in sync mode, which is a helper implementation of the above code:

client = LLMWhispererClientV2()
try:
    result = client.whisper(
        file_path="sample_files/credit_card.pdf",
        wait_for_completion=True,
        wait_timeout=200,
    )
    print(result)
except LLMWhispererClientException as e:
    print(e)
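
If you prefer to keep the async flow but want an overall deadline, the polling loop from the async example can be wrapped in a reusable function. The following is a minimal sketch, not part of the client: `wait_for_whisper` and `FakeClient` are hypothetical names, and the fake client only stands in for `LLMWhispererClientV2` so the sketch runs without network access. The status values are the ones used in the async example.

```python
import time


def wait_for_whisper(client, whisper_hash, poll_interval=5.0, timeout=200.0):
    """Poll whisper_status until the job is processed, then retrieve the result.

    Returns the retrieved result, or None when the status is 'delivered' or
    'unknown'. Raises TimeoutError if the job is still processing at the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = client.whisper_status(whisper_hash=whisper_hash)["status"]
        if status == "processed":
            return client.whisper_retrieve(whisper_hash=whisper_hash)
        if status in ("delivered", "unknown"):
            # The async example stops polling on these statuses as well.
            return None
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Whisper {whisper_hash} still '{status}' after {timeout}s")
        time.sleep(poll_interval)


class FakeClient:
    """Hypothetical stand-in that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def whisper_status(self, whisper_hash):
        return {"status": next(self._statuses)}

    def whisper_retrieve(self, whisper_hash):
        # Placeholder payload; see the whisper_retrieve API for the real format.
        return {"result_text": "<Extracted Text>"}


demo = wait_for_whisper(
    FakeClient(["processing", "processed"]), "demo_hash", poll_interval=0.01
)
print(demo)
```

With a real client, the call would look like `wait_for_whisper(client, result["whisper_hash"])`. Using `time.monotonic()` for the deadline keeps the timeout unaffected by system clock adjustments.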

Highlighting data helper function

Refer to the API documentation for detailed information. The following helper function conveniently returns the box coordinates of the highlighted data.

client = LLMWhispererClientV2()
whisper = client.whisper(
    file_path="path_to_your_file",
    wait_for_completion=True,
    wait_timeout=200,
)
ht_line = 10  # Line number to highlight
target_width = 2480  # Target width of the image in UI
target_height = 3508  # Target height of the image in UI
page, x1, y1, x2, y2 = client.get_highlight_rect(
    line_metadata=whisper["extraction"]["line_metadata"][ht_line],
    line_no=ht_line,
    target_width=target_width,
    target_height=target_height,
)

# Use the page, x1, y1, x2, y2 to highlight the line in the UI
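
As one possible way to use these values, the rectangle can be converted to percentages so that an overlay scales with the displayed image. This is a sketch under the assumption that the returned coordinates are pixels in the `target_width` by `target_height` space; `rect_to_overlay_style` is a hypothetical helper, and the coordinates below are illustrative, not real output from `get_highlight_rect`.

```python
def rect_to_overlay_style(x1, y1, x2, y2, target_width, target_height):
    """Hypothetical helper: express a rectangle as percentages of the target size."""
    return {
        "left": 100.0 * x1 / target_width,
        "top": 100.0 * y1 / target_height,
        "width": 100.0 * (x2 - x1) / target_width,
        "height": 100.0 * (y2 - y1) / target_height,
    }


# Illustrative coordinates only (assumed pixel values in the target space)
style = rect_to_overlay_style(
    248, 877, 1240, 947, target_width=2480, target_height=3508
)
print(style)
```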

Result format

whisper

The whisper method returns a dictionary.

For async operation (default)

{
    "message": "Whisper Job Accepted",
    "status": "processing",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 202,
    "extraction": {}
}

The whisper_hash can be used to check the status of the extraction and retrieve the result. extraction will be empty for async operations.

For Sync operation

{
    "message": "Whisper Job Accepted",
    "status": "processed",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 200,
    "extraction": {
        "confidence_metadata": [],
        "line_metadata": [],
        "metadata": {},
        "result_text": "<Extracted Text>",
        "webhook_metadata": ""
    }
}

Refer to the whisper_retrieve API for details on the result format.
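
Because the same dictionary shape is returned in both modes, calling code may want to check whether the extraction is present before reading the text. The following is a minimal sketch using the two sample results above; `extracted_text` is a hypothetical helper, not part of the client.

```python
def extracted_text(result):
    """Return result_text when the extraction is complete, otherwise None."""
    extraction = result.get("extraction") or {}
    if result.get("status_code") == 200 and "result_text" in extraction:
        return extraction["result_text"]
    return None


# Sample dictionaries copied from the result formats documented above
async_result = {
    "message": "Whisper Job Accepted",
    "status": "processing",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 202,
    "extraction": {},
}
sync_result = {
    "message": "Whisper Job Accepted",
    "status": "processed",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 200,
    "extraction": {
        "confidence_metadata": [],
        "line_metadata": [],
        "metadata": {},
        "result_text": "<Extracted Text>",
        "webhook_metadata": "",
    },
}

print(extracted_text(async_result))
print(extracted_text(sync_result))
```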