LLMWhisperer Python Client
This documentation is for the V2 version of the LLMWhisperer API. The corresponding Python client version is 2.x.y. V1 and V2 are not backward compatible.
This Python client provides a simple and efficient way to interact with the LLMWhisperer API. LLMWhisperer is a technology that presents data from complex documents (different designs and formats) to LLMs in a way that they can best understand.
Features
- Easy to use Pythonic interface.
- Handles all the HTTP requests and responses for you.
- Raises Python exceptions for API errors.
Installation
You can install the LLMWhisperer Python Client using pip:
pip install llmwhisperer-client
Environment Variables
Variable | Description |
---|---|
LLMWHISPERER_BASE_URL_V2 | The base URL of the API. When left undefined, the default https://llmwhisperer-api.us-central.unstract.com/api/v2 is used. |
LLMWHISPERER_API_KEY | The API key to use for authenticating requests to the API. |
LLMWHISPERER_LOGGING_LEVEL | The logging level to use. Possible values are ERROR, WARN, INFO, DEBUG. |
All environment variables are optional. If LLMWHISPERER_API_KEY is not set, you must provide the API key when creating a new client. The environment variables can be overridden by providing the values in the client constructor.
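The precedence described above (constructor argument first, then environment variable, then the default base URL) can be sketched as follows. This is only an illustration of the documented behaviour, not the client's internal code; `resolve_settings` and `DEFAULT_BASE_URL` are hypothetical names introduced for this example.

```python
import os

# Default taken from the environment variable table above
DEFAULT_BASE_URL = "https://llmwhisperer-api.us-central.unstract.com/api/v2"


def resolve_settings(base_url=None, api_key=None, env=None):
    """Hypothetical helper mirroring the documented precedence of settings."""
    env = os.environ if env is None else env
    # An explicit argument wins, then the environment variable, then the default
    resolved_url = base_url or env.get("LLMWHISPERER_BASE_URL_V2") or DEFAULT_BASE_URL
    resolved_key = api_key or env.get("LLMWHISPERER_API_KEY")
    if not resolved_key:
        # Matches the rule that the key must come from one of the two sources
        raise ValueError("No API key: set LLMWHISPERER_API_KEY or pass api_key")
    return {"base_url": resolved_url, "api_key": resolved_key}
```

Calling `resolve_settings(api_key="your_api_key")` with no environment variables set would return the default base URL together with the explicit key.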
Usage
First, import LLMWhispererClientV2 from the unstract.llmwhisperer package:
from unstract.llmwhisperer import LLMWhispererClientV2
Then, create an instance of LLMWhispererClientV2:
# All parameters are optional when environment variables are set
client = LLMWhispererClientV2()
or
# Provide the base URL and API key explicitly
client = LLMWhispererClientV2(base_url="https://llmwhisperer-api.us-central.unstract.com/api/v2", api_key="your_api_key")
Now, you can use the client to interact with the LLMWhisperer API:
# Get usage info
usage_info = client.get_usage_info()
# Process a document in async mode
# The client returns a whisper hash that can be used to check the status and retrieve the result
whisper = client.whisper(file_path="path_to_your_file")
# The hash is available in the 'whisper_hash' field of the result of the whisper operation
whisper_hash = whisper["whisper_hash"]
# Get the status of a whisper operation
status = client.whisper_status(whisper_hash)
# Retrieve the result of a whisper operation
whisper = client.whisper_retrieve(whisper_hash)
# Or, call the whisper method in sync mode
# The client will wait for the extraction to complete and return the result
whisper = client.whisper(
    file_path="path_to_your_file",
    wait_for_completion=True,
    wait_timeout=200,
)
Error Handling
The client raises LLMWhispererClientException for API errors:
from unstract.llmwhisperer.client_v2 import LLMWhispererClientException

try:
    result = client.whisper_retrieve("invalid_hash")
except LLMWhispererClientException as e:
    print(f"Error: {e.message}, Status Code: {e.status_code}")
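When a failed retrieval should not interrupt the caller, the try/except pattern above can be wrapped in a small function. The sketch below is one possible shape, not part of the client: `safe_retrieve` is a hypothetical helper, and the exception type is passed in as `error_type` so the example stays self-contained. In real use you would pass `LLMWhispererClientException`. The `message` and `status_code` attributes are the ones used in the example above and are read defensively with `getattr`.

```python
def safe_retrieve(client, whisper_hash, error_type=Exception):
    """Hypothetical helper: return (result, None) on success, (None, error_info) on failure."""
    try:
        return client.whisper_retrieve(whisper_hash), None
    except error_type as e:
        # Fall back to str(e) and None if the attributes are not present
        error_info = {
            "message": getattr(e, "message", str(e)),
            "status_code": getattr(e, "status_code", None),
        }
        return None, error_info
```

A caller can then branch on the second element of the returned tuple instead of handling the exception at every call site.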
Typical usage
Using the default async mode, the client will return with a whisper hash which can be used to check the status and retrieve the result.
import time

from unstract.llmwhisperer import LLMWhispererClientV2
from unstract.llmwhisperer.client_v2 import LLMWhispererClientException

client = LLMWhispererClientV2()
try:
    result = client.whisper(
        file_path="sample_files/credit_card.pdf",
    )
    if result["status_code"] == 202:
        print("Whisper request accepted.")
        print(f"Whisper hash: {result['whisper_hash']}")
        while True:
            print("Polling for whisper status...")
            status = client.whisper_status(whisper_hash=result["whisper_hash"])
            if status["status"] == "processing":
                print("STATUS: processing...")
            elif status["status"] == "delivered":
                print("STATUS: Already delivered!")
                break
            elif status["status"] == "unknown":
                print("STATUS: unknown...")
                break
            elif status["status"] == "processed":
                print("STATUS: processed!")
                print("Let's retrieve the result of the extraction...")
                resultx = client.whisper_retrieve(
                    whisper_hash=result["whisper_hash"]
                )
                # Refer to documentation for result format
                print(resultx)
                break
            # Poll every 5 seconds
            time.sleep(5)
except LLMWhispererClientException as e:
    print(e)
Alternatively, call the whisper method in sync mode, which is a helper implementation of the above code:
client = LLMWhispererClientV2()
try:
    result = client.whisper(
        file_path="sample_files/credit_card.pdf",
        wait_for_completion=True,
        wait_timeout=200,
    )
    print(result)
except LLMWhispererClientException as e:
    print(e)
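If you prefer to keep the async flow but want a bound on how long it runs, the polling loop from the async example can be factored into a reusable function with a deadline. The following is a minimal sketch under stated assumptions: `wait_for_whisper` is a hypothetical helper, not a client method, and it relies only on the `whisper_status` and `whisper_retrieve` calls and the status values shown above. The `sleep` parameter exists so the wait can be replaced in tests.

```python
import time


def wait_for_whisper(client, whisper_hash, poll_interval=5, timeout=200, sleep=time.sleep):
    """Hypothetical helper: poll until the job leaves 'processing' or the timeout elapses.

    Returns the retrieved result when the status is 'processed', otherwise None.
    """
    waited = 0
    while waited <= timeout:
        status = client.whisper_status(whisper_hash=whisper_hash)["status"]
        if status == "processed":
            return client.whisper_retrieve(whisper_hash=whisper_hash)
        if status != "processing":
            # 'delivered', 'unknown', or anything unexpected: nothing to retrieve
            return None
        sleep(poll_interval)
        waited += poll_interval
    raise TimeoutError(f"Whisper {whisper_hash} still processing after {timeout}s")
```

The elapsed time is counted in poll intervals rather than wall-clock time, which keeps the sketch simple but ignores the duration of the HTTP calls themselves.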
Highlighting data helper function
Refer to the API documentation for detailed information. The following helper function returns the box coordinates of the highlighted data.
client = LLMWhispererClientV2()
whisper = client.whisper(
    file_path="path_to_your_file",
    wait_for_completion=True,
    wait_timeout=200,
)
ht_line = 10  # Line number to highlight
target_width = 2480  # Target width of the image in UI
target_height = 3508  # Target height of the image in UI
page, x1, y1, x2, y2 = client.get_highlight_rect(
    line_metadata=whisper["extraction"]["line_metadata"][ht_line],
    line_no=ht_line,
    target_width=target_width,
    target_height=target_height,
)
# Use the page, x1, y1, x2, y2 to highlight the line in the UI
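To highlight several lines at once, the same call can be applied in a loop. This is a sketch only: `highlight_rects` is a hypothetical helper, and it assumes `get_highlight_rect` accepts the keyword arguments used in the example above and returns the five-value tuple shown there.

```python
def highlight_rects(client, line_metadata, line_numbers, target_width, target_height):
    """Hypothetical helper: return {line_no: (page, x1, y1, x2, y2)} for each requested line."""
    rects = {}
    for line_no in line_numbers:
        if not 0 <= line_no < len(line_metadata):
            continue  # skip line numbers outside the extracted range
        rects[line_no] = client.get_highlight_rect(
            line_metadata=line_metadata[line_no],
            line_no=line_no,
            target_width=target_width,
            target_height=target_height,
        )
    return rects
```

For example, `highlight_rects(client, whisper["extraction"]["line_metadata"], [10, 11, 12], 2480, 3508)` would collect rectangles for three consecutive lines.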
Result format
whisper
The whisper method returns a dictionary.
For async operation (default)
{
    "message": "Whisper Job Accepted",
    "status": "processing",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 202,
    "extraction": {}
}
The whisper_hash can be used to check the status of the extraction and retrieve the result. The extraction field will be empty for async operations.
For sync operation
{
    "message": "Whisper Job Accepted",
    "status": "processed",
    "whisper_hash": "XXX37efd|XXXXXXXe92b30823c4ed3da759ef670f",
    "status_code": 200,
    "extraction": {
        "confidence_metadata": [],
        "line_metadata": [],
        "metadata": {},
        "result_text": "<Extracted Text>",
        "webhook_metadata": ""
    }
}
Refer to the whisper_retrieve API for details on the result format.
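Because the async and sync responses share the same keys, a caller can tell them apart by `status_code`. The sketch below shows one way to do that using only the fields from the two sample dictionaries above; `summarize_whisper_result` is a hypothetical helper, and any status code other than 200 or 202 is treated as unexpected.

```python
def summarize_whisper_result(result):
    """Hypothetical helper: classify a whisper() return value by its status_code."""
    status_code = result.get("status_code")
    if status_code == 202:
        # Async: extraction is empty, keep the hash for later polling
        return {"done": False, "whisper_hash": result["whisper_hash"], "text": None}
    if status_code == 200:
        # Sync: the extracted text is in extraction.result_text
        extraction = result.get("extraction") or {}
        return {
            "done": True,
            "whisper_hash": result["whisper_hash"],
            "text": extraction.get("result_text"),
        }
    raise ValueError(f"Unexpected status_code: {status_code!r}")
```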