LLMWhisperer Python Client
This documentation is for the V2 version of the LLMWhisperer API. The corresponding Python client version is 2.x.y. V1 and V2 are not backward compatible.
This Python client provides a simple and efficient way to interact with the LLMWhisperer API. LLMWhisperer is a technology that presents data from complex documents (different designs and formats) to LLMs in a way that they can best understand.
Features
- Easy to use Pythonic interface.
- Handles all the HTTP requests and responses for you.
- Raises Python exceptions for API errors.
Installation
You can install the LLMWhisperer Python Client using pip:
pip install llmwhisperer-client
Environment Variables
| Variable | Description |
|---|---|
| LLMWHISPERER_BASE_URL_V2 | The base URL of the API. Defaults to https://llmwhisperer-api.us-central.unstract.com/api/v2 when not set. For the EU region, use https://llmwhisperer-api.eu-west.unstract.com/api/v2 |
| LLMWHISPERER_API_KEY | The API key to use for authenticating requests to the API. |
| LLMWHISPERER_LOGGING_LEVEL | The logging level to use. Possible values are ERROR, WARN, INFO, DEBUG |
All environment variables are optional. If LLMWHISPERER_API_KEY is not set, you must provide the API key when creating a new client. The environment variables can be overridden by providing the values in the client constructor.
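The precedence described above (constructor arguments first, then environment variables, then the default base URL) can be pictured with a small sketch. The helper `resolve_settings` and its `env` parameter are hypothetical and exist only for illustration; this is not the client's actual implementation.

```python
import os

# Default base URL as listed in the table above.
DEFAULT_BASE_URL = "https://llmwhisperer-api.us-central.unstract.com/api/v2"


def resolve_settings(base_url=None, api_key=None, env=None):
    """Hypothetical helper mirroring the documented precedence.

    Explicit arguments win over environment variables, and the base URL
    falls back to the documented default. Not the client's own code.
    """
    env = os.environ if env is None else env
    resolved_url = base_url or env.get("LLMWHISPERER_BASE_URL_V2") or DEFAULT_BASE_URL
    resolved_key = api_key or env.get("LLMWHISPERER_API_KEY")
    if not resolved_key:
        raise ValueError("No API key: set LLMWHISPERER_API_KEY or pass api_key")
    return {"base_url": resolved_url, "api_key": resolved_key}


# Example using an explicit mapping instead of the real environment.
settings = resolve_settings(env={"LLMWHISPERER_API_KEY": "your_api_key"})
print(settings["base_url"])
```

Passing `env` explicitly keeps the example independent of whatever is set in the current shell.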
Usage
First, import LLMWhispererClientV2 and the LLMWhispererClientException class:
from unstract.llmwhisperer import LLMWhispererClientV2
from unstract.llmwhisperer.client_v2 import LLMWhispererClientException
Then, create an instance of the LLMWhispererClientV2:
# All parameters are optional when environment variables are set
client = LLMWhispererClientV2()
or
# Provide the base URL and API key explicitly
client = LLMWhispererClientV2(base_url="https://llmwhisperer-api.us-central.unstract.com/api/v2", api_key="your_api_key")
Now, you can use the client to interact with the LLMWhisperer API:
# Get usage info
usage_info = client.get_usage_info()
# Process a document in async mode
# The client returns a whisper hash, which can be used to check the status and retrieve the result
whisper = client.whisper(file_path="path_to_your_file")
# The hash is available in the 'whisper_hash' field of the result of the whisper operation
whisper_hash = whisper["whisper_hash"]
# Get the status of a whisper operation
status = client.whisper_status(whisper_hash)
# Retrieve the result of a whisper operation
whisper = client.whisper_retrieve(whisper_hash)
# Or, call the whisper method in sync mode
# The client waits for the extraction to complete and returns the result
whisper = client.whisper(
    file_path="path_to_your_file",
    wait_for_completion=True,
    wait_timeout=200
)
Error Handling
The client raises LLMWhispererClientException for API errors:
try:
    result = client.whisper_retrieve("invalid_hash")
except LLMWhispererClientException as e:
    print(f"Error: {e.message}, Status Code: {e.status_code}")
Typical usage
Using the default async mode, the client will return with a whisper hash which can be used to check the status and retrieve the result.
import time

client = LLMWhispererClientV2()
try:
    result = client.whisper(
        file_path="sample_files/credit_card.pdf",
    )
    if result["status_code"] == 202:
        print("Whisper request accepted.")
        print(f"Whisper hash: {result['whisper_hash']}")
        while True:
            print("Polling for whisper status...")
            status = client.whisper_status(whisper_hash=result["whisper_hash"])
            if status["status"] == "processing":
                print("STATUS: processing...")
            elif status["status"] == "delivered":
                print("STATUS: Already delivered!")
                break
            elif status["status"] == "unknown":
                print("STATUS: unknown...")
                break
            elif status["status"] == "processed":
                print("STATUS: processed!")
                print("Let's retrieve the result of the extraction...")
                resultx = client.whisper_retrieve(
                    whisper_hash=result["whisper_hash"]
                )
                # Refer to documentation for result format
                print(resultx)
                break
            # Poll every 5 seconds
            time.sleep(5)
except LLMWhispererClientException as e:
    print(e)
Alternatively, call the whisper method in sync mode, which is a helper that wraps the polling logic shown above:
client = LLMWhispererClientV2()
try:
    result = client.whisper(
        file_path="sample_files/credit_card.pdf",
        wait_for_completion=True,
        wait_timeout=200,
    )
    print(result)
except LLMWhispererClientException as e:
    print(e)
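The async polling loop shown earlier runs until the job reaches a terminal status, with no upper bound on waiting time. The following is a minimal sketch of a bounded variant. The helper `poll_whisper` and the `FakeClient` stand-in are hypothetical names introduced here for illustration; only the `whisper_status` and `whisper_retrieve` calls and the status values come from the examples above, and the dictionary returned by the fake `whisper_retrieve` is a placeholder, not the real result format.

```python
import time


def poll_whisper(client, whisper_hash, interval=5.0, timeout=200.0, sleep=time.sleep):
    """Hypothetical helper: poll until a terminal status or until timeout.

    Returns the retrieved result when the status is "processed" and None
    for "delivered" or "unknown". Raises TimeoutError if the deadline
    passes while the job is still "processing".
    """
    deadline = time.monotonic() + timeout
    while True:
        status = client.whisper_status(whisper_hash=whisper_hash)["status"]
        if status == "processed":
            return client.whisper_retrieve(whisper_hash=whisper_hash)
        if status in ("delivered", "unknown"):
            return None
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still '{status}' after {timeout} seconds")
        sleep(interval)


class FakeClient:
    """Stand-in used only to exercise the helper without network access."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def whisper_status(self, whisper_hash):
        return {"status": next(self._statuses)}

    def whisper_retrieve(self, whisper_hash):
        # Placeholder payload; the real result format is documented separately.
        return {"retrieved": whisper_hash}


fake = FakeClient(["processing", "processing", "processed"])
result = poll_whisper(fake, "hash", interval=0, sleep=lambda _: None)
print(result)
```

With a real client, the same helper would be called as `poll_whisper(client, result["whisper_hash"])`; injecting `sleep` only makes the loop easy to test.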
Highlighting data helper function
Refer to the API documentation for detailed information. The client provides a helper function, get_highlight_rect, that returns the box coordinates of the data to highlight.
client = LLMWhispererClientV2()
whisper = client.whisper(
    file_path="path_to_your_file",
    wait_for_completion=True,
    wait_timeout=200
)
ht_line = 10  # Line number to highlight
target_width = 2480  # Target width of the image in UI
target_height = 3508  # Target height of the image in UI
page, x1, y1, x2, y2 = client.get_highlight_rect(
    line_metadata=whisper["extraction"]["line_metadata"][ht_line],
    line_no=ht_line,
    target_width=target_width,
    target_height=target_height
)
# Use the page, x1, y1, x2, y2 to highlight the line in the UI
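Many UI toolkits position an overlay by its top-left corner plus a width and height rather than by two corners. The sketch below converts the returned coordinates into that form and clamps them to the target image. The helper `rect_to_overlay` is hypothetical, and it assumes the coordinates are pixel values in the target image with the origin at the top-left.

```python
def rect_to_overlay(x1, y1, x2, y2, target_width, target_height):
    """Hypothetical helper: turn corner coordinates into a clamped box.

    Assumes pixel coordinates with the origin at the top-left of the
    target image. Values outside the image are clipped to its edges.
    """
    left = max(0, min(x1, x2))
    top = max(0, min(y1, y2))
    right = min(target_width, max(x1, x2))
    bottom = min(target_height, max(y1, y2))
    return {
        "left": left,
        "top": top,
        "width": max(0, right - left),
        "height": max(0, bottom - top),
    }


# Illustrative values only; real coordinates come from get_highlight_rect.
overlay = rect_to_overlay(0, 120, 2480, 160, target_width=2480, target_height=3508)
print(overlay)
```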
Result format
whisper
The whisper method returns a dictionary.
For Async operation (default)
{
    "message": "Whisper Job Accepted",
    "status": "processing",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 202,
    "extraction": {}
}
The whisper_hash can be used to check the status of the extraction and retrieve the result. extraction will be empty for async operations.
For Sync operation
{
    "message": "Whisper Job Accepted",
    "status": "processed",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 200,
    "extraction": {
        "confidence_metadata": [],
        "line_metadata": [],
        "metadata": {},
        "result_text": "<Extracted Text>",
        "webhook_metadata": ""
    }
}
Refer to the whisper_retrieve API for details on the result format.
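Because the same dictionary shape is returned in both modes, calling code usually needs to tell an accepted job apart from a completed one. The following is a minimal sketch that relies only on the fields shown in the two samples above; the helper `extracted_text` is a hypothetical name, not part of the client.

```python
def extracted_text(result):
    """Hypothetical helper: return the text of a completed result, else None."""
    if result.get("status_code") == 200 and result.get("status") == "processed":
        return result.get("extraction", {}).get("result_text")
    return None


# The two sample payloads from this section, trimmed to the relevant fields.
async_result = {
    "status": "processing",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 202,
    "extraction": {},
}
sync_result = {
    "status": "processed",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 200,
    "extraction": {"result_text": "<Extracted Text>"},
}

print(extracted_text(async_result))
print(extracted_text(sync_result))
```

A `None` return signals that the caller should keep the `whisper_hash` and check the status later.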