Version: 2.0.0

Documentation

LLMWhispererClientV2

The LLMWhispererClientV2 class provides a client for interacting with the LLMWhisperer API (v2.x). It simplifies sending documents for processing and retrieving the results.

Constructor

__init__(base_url: str = "", api_key: str = "", logging_level: str = "")

Initializes a new instance of the LLMWhispererClientV2 with optional parameters for the base URL, API key, and logging level.

Parameters:

base_url (str, optional): The base URL for the LLMWhisperer API. If not provided, it defaults to the LLMWHISPERER_BASE_URL_V2 environment variable or a default URL.
api_key (str, optional): The API key for authenticating requests. If not provided, it defaults to the LLMWHISPERER_API_KEY environment variable.
logging_level (str, optional): The logging level for the client. Accepts values such as:
- "DEBUG"
- "INFO"
- "WARNING"
- "ERROR"
If not provided, it uses the LLMWHISPERER_LOGGING_LEVEL environment variable or defaults to "DEBUG".
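
A minimal sketch of constructing the client, assuming the class can be imported from `unstract.llmwhisperer` (an import path that is not stated on this page). The helpers `build_client_kwargs` and `make_client` are hypothetical; the first drops empty values so that the environment-variable fallbacks described above stay in effect.

```python
def build_client_kwargs(base_url: str = "", api_key: str = "", logging_level: str = "") -> dict:
    """Keep only the settings that were explicitly provided (hypothetical helper)."""
    candidates = {
        "base_url": base_url,
        "api_key": api_key,
        "logging_level": logging_level,
    }
    # Empty strings are omitted so the client can fall back to its environment variables.
    return {name: value for name, value in candidates.items() if value}


def make_client(**overrides):
    """Create a client; the import is deferred so this module loads without the package."""
    # Assumed import path; adjust it to match the installed client package.
    from unstract.llmwhisperer import LLMWhispererClientV2

    return LLMWhispererClientV2(**build_client_kwargs(**overrides))
```

For example, `make_client(logging_level="INFO")` would leave the base URL and API key to the LLMWHISPERER_BASE_URL_V2 and LLMWHISPERER_API_KEY environment variables.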

Methods

get_usage_info() -> dict

Retrieves the usage information of the LLMWhisperer API.

Returns:

dict: A dictionary containing the API usage information.

Raises:

LLMWhispererClientException: If the API request fails.
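
A short sketch of reading the usage information. The key names in the returned dictionary are not listed here, so the hypothetical helper `format_usage` iterates over whatever the API returns instead of indexing specific fields. `client` is assumed to be an LLMWhispererClientV2 instance.

```python
def format_usage(client) -> list[str]:
    """Return one 'key: value' line per entry in the usage dictionary."""
    usage = client.get_usage_info()
    # Key names are not documented on this page, so iterate rather than index.
    return [f"{key}: {value}" for key, value in sorted(usage.items())]
```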

whisper(
    file_path: str = "", 
    stream: IO[bytes] = None, 
    url: str = "", 
    mode: str = "high_quality", 
    output_mode: str = "layout_preserving", 
    page_seperator: str = "<<<", 
    pages_to_extract: str = "", 
    median_filter_size: int = 0, 
    gaussian_blur_radius: int = 0, 
    line_splitter_tolerance: float = 0.75, 
    horizontal_stretch_factor: float = 1.0, 
    mark_vertical_lines: bool = False, 
    mark_horizontal_lines: bool = False, 
    line_spitter_strategy: str = "left-priority", 
    lang: str = "eng", 
    tag: str = "default", 
    filename: str = "", 
    webhook_metadata: str = "", 
    use_webhook: str = "", 
    wait_for_completion: bool = False, 
    wait_timeout: int = 180,
    encoding: str = "utf-8",
    add_line_nos: bool = False,
) -> dict

Note: Refer to the API documentation for detailed information on the parameters.

Processes a document via the LLMWhisperer API using various input methods (file path, stream, or URL).

Parameters:

file_path (str, optional): The file path of the document to be processed. Defaults to an empty string.
stream (IO[bytes], optional): A stream of bytes (e.g., file-like object) to be processed. Defaults to None.
url (str, optional): The URL of the document to be processed. Defaults to an empty string.
mode (str, optional): The processing mode. Can be:
- "high_quality" (default)
- "form"
- "low_cost"
- "native_text"
output_mode (str, optional): The output format for the processed document. Can be:
- "layout_preserving" (default)
- "text"
page_seperator (str, optional): A string used to separate pages in the output. Defaults to "<<<".
pages_to_extract (str, optional): A string specifying which pages to extract. Defaults to an empty string (processes all pages).
median_filter_size (int, optional): The size of the median filter to apply during processing. Defaults to 0.
gaussian_blur_radius (int, optional): The radius of the Gaussian blur to apply. Defaults to 0.
line_splitter_tolerance (float, optional): Tolerance for line splitting. Defaults to 0.75.
horizontal_stretch_factor (float, optional): Factor to stretch the document horizontally. Defaults to 1.0.
mark_vertical_lines (bool, optional): Whether to mark vertical lines in the output. Defaults to False.
mark_horizontal_lines (bool, optional): Whether to mark horizontal lines in the output. Defaults to False.
line_spitter_strategy (str, optional): Strategy for splitting lines. Can be "left-priority" (default).
lang (str, optional): The language of the document. Defaults to "eng".
tag (str, optional): A custom tag for the document. Defaults to "default".
filename (str, optional): The file name for the document in the output. Defaults to an empty string.
webhook_metadata (str, optional): Metadata to be sent to the webhook, if used. Defaults to an empty string.
use_webhook (str, optional): Name of the webhook to call. Defaults to an empty string.
wait_for_completion (bool, optional): Whether to wait for the whisper operation to complete. Defaults to False.
wait_timeout (int, optional): Time to wait (in seconds) for the operation to complete if wait_for_completion is True. Defaults to 180.
encoding (str, optional): The character encoding to use for processing the text. Defaults to "utf-8".
add_line_nos (bool, optional): Adds line numbers to the extracted text and saves line metadata, which can be queried later using the highlights API. Defaults to False.

Returns:

dict: The processed document or status information.

Raises:

LLMWhispererClientException: If the API request fails or document input is invalid.
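
The call can be sketched as below. `submit_document` is a hypothetical wrapper, and its check that exactly one input source is given is an assumption of this sketch, not documented client behavior. All other keyword arguments are passed to whisper() unchanged.

```python
def submit_document(client, file_path: str = "", url: str = "", **options) -> dict:
    """Send one document to whisper(), identified by a local path or a URL."""
    # Caller-side check (an assumption of this sketch): exactly one source is given.
    if bool(file_path) == bool(url):
        raise ValueError("Provide exactly one of file_path or url")
    source = {"file_path": file_path} if file_path else {"url": url}
    # Remaining options use the parameter names listed above, for example
    # mode="low_cost", output_mode="text", wait_for_completion=True, wait_timeout=120.
    return client.whisper(**source, **options)
```

With `wait_for_completion=True` the call blocks for up to `wait_timeout` seconds; with the default of False, the returned dictionary is expected to carry status information that can be followed up with whisper_status and whisper_retrieve.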

whisper_status(whisper_hash: str) -> dict

Retrieves the status of a whisper operation.

Parameters:

whisper_hash (str): The unique hash for the whisper operation (returned by the whisper method).

Returns:

dict: A dictionary containing the status of the whisper operation, including:
- "status_code": The HTTP status code.
- "status": Status details of the whisper operation.

Raises:

LLMWhispererClientException: If the API request fails.
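
When whisper() is called without wait_for_completion, the status can be polled. The following is a sketch only: the status value "processed" is an assumption (the possible values are not listed on this page), and `wait_for_status` is a hypothetical helper.

```python
import time


def wait_for_status(client, whisper_hash: str, done_statuses=("processed",),
                    poll_interval: float = 5.0, timeout: float = 180.0,
                    sleep=time.sleep) -> dict:
    """Poll whisper_status() until the reported status is in done_statuses."""
    deadline = time.monotonic() + timeout
    while True:
        info = client.whisper_status(whisper_hash)
        # "status" is the key documented above; its possible values are assumed.
        if info.get("status") in done_statuses:
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"whisper {whisper_hash} not finished after {timeout}s")
        sleep(poll_interval)
```

The sketch does not recognize failure statuses, so a failed operation would keep polling until the timeout; pass the relevant values in `done_statuses` once they are known from the API documentation.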

whisper_retrieve(whisper_hash: str) -> dict

Retrieves the result of a whisper operation.

Parameters:

whisper_hash (str): The unique hash for the whisper operation.

Returns:

dict: A dictionary containing the result of the whisper operation, including:
- "status_code": The HTTP status code.
- "extraction": The extracted text from the document.

Raises:

LLMWhispererClientException: If the API request fails.
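
A minimal sketch of fetching a finished result, using only the two keys documented above. `get_extraction` is a hypothetical helper, and the explicit status check is a defensive assumption, since the client may already raise on a failed request.

```python
def get_extraction(client, whisper_hash: str):
    """Return the "extraction" entry of a completed whisper operation."""
    result = client.whisper_retrieve(whisper_hash)
    # Defensive check; 200 is the standard HTTP success code.
    if result.get("status_code") != 200:
        raise RuntimeError(f"Unexpected status code: {result.get('status_code')}")
    return result["extraction"]
```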

register_webhook(url: str, auth_token: str, webhook_name: str) -> dict

Registers a webhook to the LLMWhisperer API for event notifications.

Parameters:

url (str): The URL for the webhook.
auth_token (str): The authentication token to be sent with webhook events.
webhook_name (str): A custom name for the webhook.

Returns:

dict: A dictionary containing the webhook registration status.

Raises:

LLMWhispererClientException: If the API request fails.

get_webhook_details(webhook_name: str) -> dict

Retrieves details about a previously registered webhook.

Parameters:

webhook_name (str): The name of the webhook whose details are to be retrieved.

Returns:

dict: A dictionary containing the details of the webhook, including:
- "status_code": The HTTP status code.
- "details": Webhook information.

Raises:

LLMWhispererClientException: If the API request fails.
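
The two webhook methods can be combined as in this sketch, which registers a webhook and then reads its details back by name. `register_and_check_webhook` is a hypothetical helper; the structure of the returned dictionaries beyond the keys documented above is not assumed.

```python
def register_and_check_webhook(client, url: str, auth_token: str, webhook_name: str) -> dict:
    """Register a webhook, then read its details back by name."""
    registration = client.register_webhook(
        url=url, auth_token=auth_token, webhook_name=webhook_name
    )
    # Look the webhook up again under the same name to confirm it is known to the API.
    details = client.get_webhook_details(webhook_name=webhook_name)
    return {"registration": registration, "details": details}
```

The same `webhook_name` can then be passed as the `use_webhook` argument of whisper().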

get_highlight_rect(line_metadata: list[int], target_width: int, target_height: int) -> tuple[int, int, int, int, int]

Given the line metadata for a single line, this function returns the bounding box of that line in the format (page, x1, y1, x2, y2), scaled to the target dimensions.

Parameters:

line_metadata (list[int]): The line metadata returned by the LLMWhisperer API. The list typically contains information like page number, line position, and line height.
target_width (int): The width of the target image or page in the UI.
target_height (int): The height of the target image or page in the UI.

Returns:

tuple[int, int, int, int, int]: A tuple containing the bounding box of the line in the following format:
- page (int): The page number.
- x1 (int): The x-coordinate of the top-left corner of the bounding box.
- y1 (int): The y-coordinate of the top-left corner of the bounding box, scaled to the target height.
- x2 (int): The x-coordinate of the bottom-right corner of the bounding box (equal to target_width).
- y2 (int): The y-coordinate of the bottom-right corner of the bounding box, scaled to the target height.
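
The scaling described above can be illustrated with a standalone function. This is a sketch, not the client's implementation: it assumes the metadata layout [page, baseline_y, line_height, page_height] and assumes that the box starts at x = 0. Neither detail is confirmed on this page.

```python
def scale_highlight_rect(line_metadata: list[int], target_width: int, target_height: int):
    """Illustrate the scaling described above under an assumed metadata layout."""
    # Assumed layout: [page, baseline_y, line_height, page_height] in source-page units.
    page, base_y, height, page_height = line_metadata[:4]
    # Scale the top and bottom edges of the line from page units to target units.
    y1 = int((base_y - height) * target_height / page_height)
    y2 = int(base_y * target_height / page_height)
    # The box spans the full target width, matching the x2 description above.
    return (page, 0, y1, target_width, y2)
```

Under these assumptions, a line with baseline 100 and height 20 on a page of height 1000, drawn on a 600 by 500 target, maps to the box (0, 0, 40, 600, 50).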

Exceptions

LLMWhispererClientException

Exception raised for errors occurring within the LLMWhispererClientV2 class.

Attributes:

message (str): A message describing the error.
status_code (int): The HTTP status code returned by the LLMWhisperer API (if applicable).

Methods:

error_message(): Returns the error message.
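
A small sketch of turning such an exception into a log line, using only the attributes and method listed above. `describe_client_error` is a hypothetical helper, and it treats a missing `status_code` as "not applicable".

```python
def describe_client_error(exc) -> str:
    """Build a log line from an LLMWhispererClientException-like object."""
    # status_code may be absent for non-HTTP failures, per the "(if applicable)" note.
    status = getattr(exc, "status_code", None)
    prefix = f"HTTP {status}" if status is not None else "no HTTP status"
    return f"LLMWhisperer request failed ({prefix}): {exc.error_message()}"
```

It would typically be called inside an `except LLMWhispererClientException as exc:` block around any of the client methods; the import location of the exception class is not given on this page.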
