Highlighting API
API to generate highlight locations for a search term in the document which has been extracted using LLMWhisperer. This can be used to highlight the search terms in your frontend application.
Note: The parameter
store_metadata_for_highlightingshould be set totruewhile starting the whisper process to use this API.
| Endpoint | /highlight-data |
| URL | https://llmwhisperer-api.unstract.com/v1/highlight-data |
| Method | POST |
| Headers | unstract-key: <YOUR API KEY> |
Parameters
| Parameter | Type | Default | Required | Description |
|---|---|---|---|---|
| whisper-hash | string | Yes | The whisper hash returned while starting the whisper process. |
Request Body
The request body should contain the search term. The Content-Type should be text/plain.
Example Curl Request
curl -X POST --location 'https://llmwhisperer-api.unstract.com/v1/highlight-data?whisper-hash=XXXXXXXXXXXXXXXXXXX' \
-H 'unstract-key: <Your API Key>' \
-H 'Content-Type: text/plain' \
--data 'search term'
info
To include the headers in the response use curl -i in the request.
Response
| HTTP Status | Content-Type | Headers | Description |
|---|---|---|---|
| 200 | application/json | Highlight information | |
| 400 | application/json | Error while retrieveing. Refer below for JSON format |
Highlight Data
The API will return the highlight data in the response body. The JSON response will contain an array of JSON objects. Each object will contain the following fields:
| Field | Type | Description |
|---|---|---|
| text | string | The search term (or part of) |
| page | integer | The page number where the search term is found. 0 indexed |
| pos | integer | The position of the search term in the text. 0 indexed |
| left | float | The left position of the search term in the page. |
| top | float | The top position of the search term in the page. |
| width | float | The width of the search term in the page. |
| height | float | The height of the search term in the page. |
| distance | integer | The Levenshtein distance of the search term from the match. |
| location_type | string | The unit of the location values. Can be inches or points based on how the original document was created and processed |
Example 200 Response
{
"highlight_data": [
{
"distance": 4,
"height": 0.11460000000000026,
"left": 6.2699,
"location_type": "inches",
"page": 0,
"pos": 960,
"text": "U.S.",
"top": 2.9505,
"width": 0.20060000000000056
},
{
"distance": 4,
"height": 0.11940000000000017,
"left": 6.4944,
"location_type": "inches",
"page": 0,
"pos": 965,
"text": "Citizen",
"top": 2.9505,
"width": 0.34860000000000024
}
]
}
Example 400 Response
{
"message": "No Metadata found"
}